United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections

Utrecht, 2025

Leah von der Heyde, Anna-Carolina Haensch, Alexander Wenz, Bolei Ma

It has been proposed that “synthetic samples” based on large language models (LLMs) could serve as efficient alternatives to surveys of humans, given that LLM outputs are based on training data that includes information on human attitudes and behaviour. However, LLM-synthetic samples might exhibit biases, for example because their training data and fine-tuning processes are unrepresentative of diverse contexts. Such biases risk being reinforced in research, policymaking, and society. Researchers therefore need to investigate whether and under which conditions LLM-generated synthetic samples can be used for public opinion prediction. In this study, we examine to what extent LLM-based predictions of individual public opinion exhibit context-dependent biases, using the 2024 European Parliament elections as a test case. Prompting three LLMs with individual-level background information on 26,000 eligible European voters, we ask each model to predict each person’s voting behaviour. Comparing these predictions to the actual election results, we show that LLM-based predictions of future voting behaviour largely fail, that their accuracy is unequally distributed across national and linguistic contexts, and that they require detailed attitudinal information. The findings emphasise the limited applicability of LLM-synthetic samples to public opinion prediction. By investigating their contextual biases, this research contributes to understanding and mitigating inequalities in the development of LLMs and their applications in computational social science.
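To illustrate the kind of prediction setup the abstract describes, the following is a minimal sketch of persona-based vote prediction, assuming an OpenAI-style chat API. The persona fields, prompt wording, model name, and party list are illustrative assumptions for exposition, not the study’s actual prompt or pipeline.

```python
# Minimal sketch of persona-based vote prediction with an LLM.
# Assumptions (not from the paper): an OpenAI-style chat API, and the
# illustrative persona fields, prompt wording, and party list below.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def predict_vote(persona: dict, parties: list[str], model: str = "gpt-4o") -> str:
    """Ask the model to predict one respondent's vote from background info."""
    profile = "; ".join(f"{key}: {value}" for key, value in persona.items())
    prompt = (
        f"Consider a person with the following characteristics: {profile}. "
        "In the 2024 European Parliament election, which party would this "
        "person most likely vote for? Answer with exactly one of: "
        f"{', '.join(parties)}."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output to make predictions reproducible
    )
    return response.choices[0].message.content.strip()

# Example: one hypothetical German respondent (illustrative values only).
voter = {
    "country": "Germany",
    "age": 54,
    "gender": "female",
    "education": "vocational degree",
    "left-right self-placement (0-10)": 4,
}
print(predict_vote(voter, ["CDU/CSU", "SPD", "Gruene", "AfD", "FDP", "Linke"]))
```

In a study of this design, such a call would be repeated once per respondent, and the resulting distribution of predicted votes compared against the official election results per country and language context.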