Sven-Oliver Proksch, Jonathan B. Slapin
How to Avoid Pitfalls in Statistical Analysis of Political Texts: The Case of Germany
The statistical analysis of political texts has come to occupy a prominent place in the study of party politics, coalition formation and legislative decision-making in Germany. Yet we still lack a thorough understanding of the conditions under which such analysis produces valid estimates of policy positions. This article examines the properties of the word scaling method 'Wordfish' and uses the technique to estimate party positions in Germany. Through Monte Carlo simulations, we investigate how the choice of texts, including the number of documents analysed and their length, affects party position estimates. Moreover, we present guidelines for political scientists interested in using the technique on how to process linguistic information, focusing specifically on German texts. Finally, we present an analysis of the German party system from 1969 to 2005 using the Wordfish algorithm. We demonstrate that the algorithm robustly extracts left-right positions across various subsets of words, but show that agenda effects dominate the estimates of a long time series if the entire manifesto corpus is analysed.
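As a rough illustration of the kind of Monte Carlo exercise described above, the sketch below simulates word counts under the Poisson model that Wordfish assumes, in which the count of word j in document i has expected value exp(alpha_i + psi_j + beta_j * omega_i). It is not the authors' simulation design: the function names, the number of documents and words, and all parameter values are hypothetical and chosen only to show how document length (alpha) and position (omega) enter the data-generating process.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_counts(omega, alpha, psi, beta, seed=0):
    """Simulate a document-by-word count matrix under the Wordfish Poisson model.

    omega[i]: position of document i; alpha[i]: document fixed effect (length);
    psi[j]: word fixed effect (frequency); beta[j]: word weight (discrimination).
    """
    rng = random.Random(seed)
    counts = []
    for w_i, a_i in zip(omega, alpha):
        row = []
        for p_j, b_j in zip(psi, beta):
            # Expected count of word j in document i.
            lam = math.exp(a_i + p_j + b_j * w_i)
            row.append(poisson_draw(rng, lam))
        counts.append(row)
    return counts


# Hypothetical setup: 4 documents, 6 words; all values are illustrative only.
omega = [-1.5, -0.5, 0.5, 1.5]
alpha = [0.0, 0.0, 1.0, 1.0]            # the last two documents are longer
psi = [1.0, 1.0, 1.0, 0.5, 0.5, 0.5]
beta = [-1.0, -0.5, 0.0, 0.0, 0.5, 1.0]

counts = simulate_counts(omega, alpha, psi, beta, seed=42)
for row in counts:
    print(row, "total words:", sum(row))
```

Repeating such draws while varying the number of documents or the alpha values, and re-estimating positions each time, is one way to study how text choice affects the estimates; the estimation step itself is omitted here.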