Mannheim Research Colloquium on Survey Methods (MaRCS): Automated classification for open-ended questions with BERT
Abstract:
Answers to open-ended questions are often manually coded into categories, which is time-consuming. Automated coding instead trains a statistical or machine learning model on a small subset of manually coded answers and uses it to code the remaining answers. The state of the art in natural language processing (NLP) has shifted: a general language model is first pre-trained on vast amounts of unrelated data and then adapted to a specific application data set. After summarizing some earlier work, we empirically investigate whether BERT, the currently dominant pre-trained language model, is more effective at automated coding of answers to open-ended questions than statistical learning approaches that are not pre-trained.
Please use this link to attend the colloquium via Zoom: https://us02web.zoom.us/j/89939661329?pwd=d0dzU3FvbUdQc3dnejEzOUliODFVUT09
You can sign up for the MaRCS mailing list at https://lists.gesis.org/mailman/listinfo/marcs to receive invitations to upcoming seminars.
Florian Keusch (University of Mannheim, School of Social Sciences)
Henning Silber (GESIS – Leibniz Institute for the Social Sciences)
Bernd Weiß (GESIS – Leibniz Institute for the Social Sciences)
Alexander Wenz (University of Mannheim, MZES)