Mannheim Research Colloquium on Survey Methods (MaRCS): Automated classification for open-ended questions with BERT

01.12.2022 - 12:30 to 13:30
A 5,6 Room A 231/230 + Zoom
Event type: 
Matthias Schonlau (University of Waterloo and visiting professor at GESIS)


Answers to open-ended questions are often manually coded into different categories, which is time-consuming. Automated coding uses statistical/machine learning to train on a small subset of manually coded text answers. The state of the art in NLP (natural language processing) has shifted: a general language model is first pre-trained on vast amounts of unrelated data, and this model is then adapted to a specific application data set. After summarizing some earlier work, we empirically investigate whether BERT, the currently dominant pre-trained language model, is more effective at automated coding of answers to open-ended questions than non-pre-trained statistical learning approaches.
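
For readers unfamiliar with automated coding, the following is a minimal illustrative sketch of the general idea described above: a classifier is trained on a small set of manually coded answers and then assigns codes to new answers. It uses a simple bag-of-words multinomial Naive Bayes model as a stand-in for a non-pre-trained statistical learning approach. It is not the method evaluated in the talk, and the example answers, the category names ("pay", "hours"), and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase, split on whitespace, and strip simple punctuation.
    tokens = (w.strip(".,!?;:").lower() for w in text.split())
    return [t for t in tokens if t]


def train_naive_bayes(answers, codes):
    # Count how often each word occurs within each manually assigned code.
    class_counts = Counter(codes)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, code in zip(answers, codes):
        tokens = tokenize(text)
        word_counts[code].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    # Return the code with the highest log posterior (add-one smoothing).
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_code, best_score = None, -math.inf
    for code in sorted(class_counts):
        score = math.log(class_counts[code] / total_docs)
        total_words = sum(word_counts[code].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in the coded subset
            score += math.log(
                (word_counts[code][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical manually coded subset (invented for illustration only).
coded_answers = [
    "the pay is too low",
    "my salary is low",
    "long hours and late shifts",
    "too many hours at work",
]
codes = ["pay", "pay", "hours", "hours"]

model = train_naive_bayes(coded_answers, codes)
print(predict(model, "low salary"))
print(predict(model, "late hours"))
```

A pre-trained model such as BERT would replace the word-count features with representations learned from large amounts of unrelated text before being adapted to the coded answers; the train-then-predict workflow stays the same.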


Please use this link to attend the colloquium via Zoom:


You can sign up for the MaRCS mailing list here to receive invitations to upcoming seminars.


Florian Keusch (University of Mannheim, School of Social Sciences)

Henning Silber (GESIS – Leibniz Institute for the Social Sciences)

Bernd Weiß (GESIS – Leibniz Institute for the Social Sciences)

Alexander Wenz (University of Mannheim, MZES)