Political text scaling aims to linearly order parties and politicians across political dimensions (e.g., left-to-right ideology) based on textual content (e.g., politician speeches or party manifestos). Existing models, such as Wordscores and Wordfish, scale texts based on relative word usage; as a result, they do not take topical information into account and cannot be used for cross-lingual analyses. In our talk, we present our efforts toward developing topic-based and semantically aware text scaling approaches. First, we introduce our initial work, TopFish, a multilevel computational method that integrates topic detection and political scaling, and show its applicability to analyzing temporal aspects of political campaigns (pre-primary elections, primary elections, and general elections). Next, we present SemScale, a new text scaling approach that leverages semantic representations of text. We show its robustness through an extensive quantitative evaluation on a collection of speeches from the European Parliament in five different languages, using several different text preprocessing features. We conclude by describing the functionalities of an easy-to-use online demo and a Python implementation of SemScale.
Bio: Federico Nanni is a postdoctoral researcher at the University of Mannheim, affiliated with the Data and Web Science Group and the Political Science Department. His research focuses on adopting (and adapting) Natural Language Processing methods to support studies in the Digital Humanities and Computational Social Science.