Ideological scaling of political text across different contexts with contextualized word embeddings

The one-dimensional left-right scaling of political text is a simple yet effective idea that is widely used. To scale unseen text, researchers often rely on automatic methods based on relative word distributions and document distances, or on reference documents for scoring words. In this thesis, the student applies contextualized word embeddings (BERT) to media labels, then classifies three types of political text sources: a) other media, b) politician tweets, and c) manifestos. These three types are of increasing difficulty in terms of vocabulary and theory, which the student will explore by testing and comparing word distributions and the characteristics of the sources (who generates the text, and how, when, and why?). The final aim is therefore to describe the performance of a state-of-the-art method for ideological scaling on several types of political text.
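
As a rough illustration of what comparing word distributions across source types could look like, the sketch below computes the Jensen-Shannon divergence between the word frequencies of two texts. This is only one possible measure and is an assumption made here, not a method prescribed by the topic description; the function names, the whitespace tokenization, and the toy sentences are likewise hypothetical placeholders.

```python
import math
from collections import Counter


def word_distribution(text):
    """Relative frequency of each lowercased, whitespace-separated token."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two word distributions.

    The value lies in [0, 1]: 0 for identical distributions,
    1 for distributions with no shared vocabulary.
    """
    vocab = set(p) | set(q)
    # Mixture distribution: the average of p and q over the joint vocabulary.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl_to_mixture(a):
        # KL divergence from a to the mixture m; zero-probability words add nothing.
        return sum(a[w] * math.log2(a[w] / m[w]) for w in a if a[w] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Invented toy snippets standing in for two source types (not real data).
media_text = "the government announced a new tax plan for families"
tweet_text = "new tax plan is a disaster for working families"

divergence = js_divergence(word_distribution(media_text),
                           word_distribution(tweet_text))
```

A larger divergence between the training source (media) and a target source (tweets or manifestos) would suggest a larger vocabulary shift, which is one plausible way to make the "increasing difficulty" of the three source types measurable.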


