This Research Lab (Forschungspraktikum/Projektpraktikum) in summer term 2019 organizes its participants (max. 12) into four groups that address different areas of the multi-faceted phenomenon of online political polarization. Each group will be provided with datasets and tasks specific to one of four thematic areas: A. news bias, B. hate speech, C. rumour sequences, D. visual crawling.
Motivation: Currently, much effort is invested by academia, government, and industry into counteracting misinformation online. These efforts are important for our future, but focusing only on which content is “fact vs. fake” is not enough. In reality, online opinions are pulled toward political viewpoints, which are themselves forms of bias. This means that controversial topics consist of competing truths that clash aggressively yet remain “equally true”, at least for the clusters of like-minded actors involved in the debates.
For registration: Please send an email with the details listed below by February 1st, 2019, then attend the introductory meeting on February 8th, 2019, 2-4pm (C 208).
Each group has one GitHub project based on one of the four following tasks, which place varying emphasis on textual information and visual content:
A. News bias group
Aim: Detecting hyperpartisanship in news articles. Given a news article text, decide whether it follows a hyperpartisan argumentation, i.e., whether it exhibits blind, prejudiced, or unreasoning allegiance to one party, faction, cause, or person.
- analysis of author language → classify as neutral or subjective
- detection of misleading headlines
- context analysis
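As a first baseline for the neutral-vs-subjective subtask, a naive lexicon-based classifier can serve as a sketch. The word list and threshold below are illustrative assumptions, not part of the task definition; a real system would learn its features from the labelled training data.

```python
# Naive lexicon-based subjectivity baseline (illustrative only).
# The word list and the threshold are assumptions for demonstration;
# a real system would learn features from labelled training data.
SUBJECTIVE_WORDS = {
    "outrageous", "disgraceful", "heroic", "disaster",
    "corrupt", "brilliant", "shameful", "radical",
}

def classify_article(text: str, threshold: float = 0.02) -> str:
    """Label an article 'subjective' if the share of loaded words
    exceeds the threshold, otherwise 'neutral'."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    if not tokens:
        return "neutral"
    loaded = sum(t in SUBJECTIVE_WORDS for t in tokens)
    return "subjective" if loaded / len(tokens) > threshold else "neutral"

print(classify_article("A shameful and corrupt deal, a disaster for the country."))
print(classify_article("The committee met on Tuesday to discuss the budget."))
```

Such a baseline gives the group a reference point before moving to trained classifiers over richer features (headlines, context, argumentation structure).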
B. Hate speech group
Aim: Multilingual detection of hate speech against immigrants and women on Twitter (hatEval)
- Hate Speech Detection against Immigrants and Women: a two-class (binary) classification task where systems have to predict whether a tweet in English or Spanish, with a given target (women or immigrants), is hateful or not hateful.
- Aggressive Behavior and Target Classification: systems are asked first to classify hateful tweets in English and Spanish (i.e., tweets where hate speech against women or immigrants has been identified) as aggressive or not aggressive, and second to identify the harassed target as individual or generic (i.e., a single human or a group).
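The two subtasks naturally form a two-stage pipeline: a binary hate classifier, and second-stage classifiers run only on tweets flagged as hateful. The sketch below shows the control flow; the keyword rules are stand-in stubs, not real classifiers, and would be replaced by models trained on the hatEval data.

```python
# Skeleton of the two-stage hatEval pipeline. The decision rules are
# placeholder stubs for demonstration; a real system would train
# classifiers on the hatEval training data instead.
def is_hateful(tweet: str) -> bool:
    # Task A stub: replace with a trained binary classifier.
    return any(w in tweet.lower() for w in ("go back", "get out"))

def is_aggressive(tweet: str) -> bool:
    # Task B stub, run only on hateful tweets.
    return "!" in tweet

def target_type(tweet: str) -> str:
    # Task B stub: individual vs. generic target.
    return "individual" if "you" in tweet.lower().split() else "generic"

def label_tweet(tweet: str) -> dict:
    """Task A: hateful or not; Task B (hateful tweets only): aggression + target."""
    if not is_hateful(tweet):
        return {"hateful": False}
    return {
        "hateful": True,
        "aggressive": is_aggressive(tweet),
        "target": target_type(tweet),
    }

print(label_tweet("Lovely weather in Madrid today."))
print(label_tweet("You should go back home!"))
```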
C. Rumour sequences group
Aim: Sequence classification of rumours on Twitter/Reddit conversations
- Task A is to classify the type of interaction between a source tweet containing a rumour and a reply tweet as support, query, deny, or comment.
- Task B is to predict the veracity of a given rumour, using the output of Task A as a feature.
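One simple way to feed Task A's output into Task B is to turn the distribution of reply labels into features. The code below sketches this; the final decision rule is an illustrative heuristic (more support than denial → "true"), not the prescribed method, and the groups would replace it with a trained model.

```python
from collections import Counter

# Sketch: turn the Task A labels (support/deny/query/comment) of the
# replies to a source tweet into features for Task B veracity prediction.
# The decision rule below is an illustrative heuristic, not the task's
# official method.
def sdqc_features(reply_labels: list) -> dict:
    """Fraction of each SDQC label among the replies."""
    counts = Counter(reply_labels)
    total = max(len(reply_labels), 1)
    return {lab: counts[lab] / total
            for lab in ("support", "deny", "query", "comment")}

def predict_veracity(reply_labels: list) -> str:
    feats = sdqc_features(reply_labels)
    if feats["support"] > feats["deny"]:
        return "true"
    if feats["deny"] > feats["support"]:
        return "false"
    return "unverified"

labels = ["support", "support", "query", "deny", "comment"]
print(sdqc_features(labels))
print(predict_veracity(labels))
```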
D. Visual crawler group
Aim: Crawling datasets of visual content (images of political figures) from given sources, for subsequent hate speech analysis.
Data (choose one or more):
- Instagram (crawling social accounts of political figures)
Tasks:
- Crawling the image datasets
- Unsupervised labelling of the dataset for hate speech
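A minimal crawler skeleton for the download step is sketched below. The fetch function is injectable so the saving logic can be tested offline; the URLs and save layout are illustrative assumptions. Real crawling of Instagram or any other source must respect the platform's terms of use and rate limits.

```python
import pathlib
from urllib.request import urlopen

# Minimal image-crawler sketch with an injectable fetch function so the
# download/save logic can be exercised offline. URLs and the file layout
# are illustrative; real crawling must respect each platform's terms of
# use and rate limits.
def default_fetch(url: str) -> bytes:
    with urlopen(url) as resp:  # real network fetch; swapped out in tests
        return resp.read()

def crawl_images(urls, out_dir: str, fetch=default_fetch) -> list:
    """Download each URL and save it as NNN.jpg; return the saved paths."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        path = out / f"{i:03d}.jpg"
        path.write_bytes(fetch(url))
        saved.append(str(path))
    return saved

# Offline demo with a fake fetcher (no network access needed):
demo = crawl_images(["http://example.com/a.jpg"], "demo_images",
                    fetch=lambda url: b"\xff\xd8fake-jpeg")
print(demo)
```

Injecting the fetcher also makes it easy to swap in a platform-specific client later without touching the saving logic.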
Course outcomes and requirements
Each group must submit a prototype that includes the following components:
- Group D's (visual) configuration (i.e., the data repository) should be described as a docker-compose file: https://docs.docker.com/compose/
- You can implement your system in any programming language (e.g., Java, Python)
- The GitHub repository must include a README file that describes each component of your design
- If you use a third-party library, list it in the README as well
Timeline:
- Registration until February 1st, 2019 (max. 12 participants)
- Kick-off meeting
- Design, implementation, 2-page report
- Evaluation of the 2-page reports (rotating among all participants, conference review style)
- Submission of prototypes
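For the Group D docker-compose requirement mentioned above, a minimal file might look like the sketch below. The service names, images, and paths are illustrative assumptions, not a prescribed layout.

```yaml
# Illustrative docker-compose sketch for Group D; service names,
# images, and paths are assumptions, not a prescribed layout.
version: "3"
services:
  crawler:
    build: ./crawler          # your crawler code
    volumes:
      - ./data:/data          # crawled images land here
  db:
    image: postgres:11        # metadata / label store
    environment:
      POSTGRES_PASSWORD: example
```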
Please register via email and come to the kick-off meeting. The email should contain the following:
- Topic: “Research Lab: Political polarization”
- Your name
- Your semester
- Your course of study
- Three lines about your motivation to participate
- Group you want to join (three nominations, ranked). For example: “1. A, 2. C, 3. B”
No preparation needed. A gentle introduction that includes the social science state of the art. Group GitHub repos will be set up.
Preparation needed. Three students per group present an initial plan of how they will fulfil the group tasks. The groups must think about, organize, and prepare their plan with slides. Each group has 15 minutes of presentation time, followed by Q&A.
Design, implementation, 2-page report
The remaining weeks will be spent on:
- Independent meetings and task-oriented team work by the separate groups
- Weekly or bi-weekly master meetings with all four groups and the lecturers
- Towards the end of the semester, each group submits a 2-page report on its work, which will be reviewed (conference-style) by the other groups and the lecturers.
- Allcott, H., & Gentzkow, M. (2017). Social Media and Fake News in the 2016 Election. National Bureau of Economic Research.
- Bode, L. & Vraga, E. (2015). In Related News, That Was Wrong: The Correction of Misinformation Through Related Stories Functionality in Social Media. Journal of Communication, 65(4): 619-638.
- Christopher Bail, Lisa Argyle, Taylor Brown, John Bumpus, Haohan Chen, M.B. Hunzaker, Jaemin Lee, Marcus Mann, Friedolin Merhout, and Alexander Volfovsky. 2018. "Exposure to Opposing Views can Increase Political Polarization: Evidence from a Large Scale Field Experiment on Social Media". SocArXiv. doi:10.17605/OSF.IO/4YGUX.
- Auter, Z. J., & Fine, J. A. (2016). Negative campaigning in the social media age: Attack advertising on Facebook. Political Behavior, 38(4), 999-1020.
- Jaidka, Kokil and Zhou, Alvin and Lelkes, Yphtach, Brevity is the soul of Twitter: The constraint affordance and political discussion (November 20, 2018).
- Theocharis, Y., Barberá, P., Fazekas, Z., Popa, S. A., & Parnet, O. (2016). A bad workman blames his tweets: the consequences of citizens‘ uncivil Twitter use when interacting with party candidates. Journal of communication, 66(6), 1007-1031.
- Pablo Barberá. 2015. "Birds of the same feather tweet together: Bayesian ideal point estimation using Twitter data". Political Analysis 23 (1): 76-91. doi:10.1093/pan/mpu011.
- Monroe, Burt L., Michael P. Colaresi, and Kevin M. Quinn. "Fightin’ words: Lexical feature selection and evaluation for identifying the content of political conflict." Political Analysis 16.4 (2008): 372-403.
- Iyengar S. and S. J. Westwood (2015) Fear and Loathing across Party Lines: New Evidence on Group Polarization, American Journal of Political Science. Vol. 59, No. 3 (July 2015), pp. 690-707
- King, G, J. Pan & M. Roberts, (May 2016) How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument, Harvard University, http://gking.harvard.edu/files/gking/files/50c.pdf?m=1463587807
- Marco Bastos and Dan Mercea. 2018. "Parametrizing Brexit: Mapping Twitter political space to parliamentary constituencies". Information, Communication & Society. doi:10.1080/1369118X.2018.1433224
- Deen Freelon. "Analyzing online political discussion using three models of democratic communication". New Media & Society 12.7 (2010), pp. 1172-1190. doi: 10.1177/1461444809357927
- Maurice Vergeer, Liesbeth Hermans, and Steven Sams. "Online social networks and micro-blogging in political campaigning: The exploration of a new campaign tool and a new campaign style". Party Politics 19.3 (2013), pp. 477-501.
- Yu, Bei, Stefan Kaufmann, and Daniel Diermeier. 2008. "Classifying Party Affiliation from Political Speech". Journal of Information, Technology, and Politics, 5(1).
- Kick-off meeting slides as PDF
- List of media position indices based on two papers (read those papers first, and double-check that the Twitter handles match the news media before using this list for crawling etc. LINK)
- An even more extensive list of media bias labels: AllSides. Check how it can be used for research, for example here.
- An index of unreliable news websites https://www.poynter.org/ifcn/unreliable-news-index/
- Crawler for Media Bias/Fact Check: Labels contain factual (HIGH, MIXED, LOW) and bias (conspiracy, left, right, center, pro-science, ...) https://github.com/JeffreyATW/mbfc_crawler