United in Diversity? Contextual Biases in LLM-Based Predictions of the 2024 European Parliament Elections

St. Louis, 2025

Leah von der Heyde, Anna-Carolina Haensch, Alexander Wenz, Bolei Ma

“Synthetic samples” based on large language models (LLMs) could serve as efficient alternatives to surveys of humans, given that LLM outputs are based on training data that includes information on human attitudes and behavior. However, LLM-based synthetic samples might exhibit biases, for example because the underlying training data and fine-tuning processes are unrepresentative of diverse contexts. Such biases challenge the validity of findings derived from LLM-based synthetic samples and risk reinforcing existing biases in research, policymaking, and society. Researchers therefore need to investigate the conditions under which LLM-generated synthetic samples can be applied to public opinion prediction by comparing different linguistic, political, social, and digital contexts. In this study, we examine to what extent LLM-based predictions of individual public opinion exhibit context-dependent biases by predicting the results of the 2024 European Parliament elections. We prompt three LLMs with individual-level background information on 26,000 eligible European voters, varying the language of and the information contained in the prompts, and ask the LLMs to predict each person’s voting behavior. Comparing the predictions to the actual election results, we show that LLM-based predictions of future voting behavior largely fail, that their accuracy is unequally distributed across national and linguistic contexts, and that they require detailed attitudinal information. Our findings emphasize the limited applicability of LLM-based synthetic samples to public opinion prediction, especially when little is known about the population of interest ahead of an unobserved outcome. By investigating their contextual biases, our research contributes to the understanding and mitigation of inequalities in the development of LLMs and their applications in survey research.
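To make the prompting setup concrete, the minimal sketch below shows how persona-based vote prediction prompts of this kind might be assembled, with the prompt language and the inclusion of attitudinal information as the varied conditions. The templates, field names, left-right item, and example respondent are illustrative assumptions, not the study's actual prompts or variable set.

```python
# Illustrative sketch of persona-based prompt construction for vote prediction.
# All templates, field names, and the example respondent are hypothetical;
# the paper's actual prompts and background variables may differ.

TEMPLATES = {
    "en": (
        "You are a {age}-year-old {gender} living in {country}. {attitudes}"
        "Which party will you vote for in the 2024 European Parliament "
        "election? Answer with the party name only."
    ),
    "de": (
        "Sie sind {age} Jahre alt, {gender}, und leben in {country}. {attitudes}"
        "Welche Partei werden Sie bei der Europawahl 2024 wählen? "
        "Antworten Sie nur mit dem Parteinamen."
    ),
}

# Optional attitudinal detail (here: left-right self-placement), per language.
ATTITUDES = {
    "en": "On a left-right scale from 0 to 10, you place yourself at {lr}. ",
    "de": "Auf einer Links-Rechts-Skala von 0 bis 10 verorten Sie sich bei {lr}. ",
}

def build_prompt(persona: dict, language: str = "en",
                 include_attitudes: bool = True) -> str:
    """Fill a language-specific template with one respondent's background
    information, optionally adding attitudinal detail."""
    attitudes = ""
    if include_attitudes and "left_right" in persona:
        attitudes = ATTITUDES[language].format(lr=persona["left_right"])
    return TEMPLATES[language].format(
        age=persona["age"],
        gender=persona["gender"],
        country=persona["country"],
        attitudes=attitudes,
    )

# One synthetic respondent, prompted in English with attitudinal information.
voter = {"age": 34, "gender": "woman", "country": "Austria", "left_right": 3}
print(build_prompt(voter, language="en", include_attitudes=True))
```

Under this setup, each resulting prompt would be sent to each of the three LLMs, and the returned party label compared against the respondent's reported vote choice and, in aggregate, the official election results.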