New Methods for Job and Occupation Classification

Research question/goal: 

Currently, most surveys use open-ended questions to ask participants about their occupation. The verbatim responses are coded afterwards into a classification with hundreds of categories and thousands of jobs, which is an error-prone, time-consuming, and costly task. When textual answers have a low level of detail, accurate coding may be impossible.
The project aimed to improve the measurement process with a novel instrument: during the survey, directly after answering an initial open-ended question, respondents were asked a closed question about their occupation. A supervised machine learning algorithm was trained to suggest a short list of candidate job categories, from which respondents could select the most appropriate one. The instrument’s layout, the interaction between interviewers and respondents, and the job descriptions used for communication were designed carefully to ensure high usability.
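The suggestion step can be illustrated with a minimal sketch. It assumes a multinomial naive Bayes classifier over word tokens, which is a stand-in chosen for brevity and not necessarily the algorithm used in the project. The class name `CandidateSuggester`, the training answers, and the category labels are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a verbatim answer and split it into word tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


class CandidateSuggester:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, answers, codes):
        self.token_counts = defaultdict(Counter)  # code -> token frequencies
        self.code_counts = Counter(codes)         # code -> number of training answers
        self.vocabulary = set()
        for answer, code in zip(answers, codes):
            tokens = tokenize(answer)
            self.token_counts[code].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def suggest(self, answer, k=3):
        """Return the k codes with the highest log-score for this answer."""
        total = sum(self.code_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = tokenize(answer)
        scores = {}
        for code, n in self.code_counts.items():
            counts = self.token_counts[code]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n / total)  # log prior of the category
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            scores[code] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical training data: verbatim answers with made-up category labels.
train_answers = [
    "nurse in a hospital", "hospital nurse for elderly patients",
    "primary school teacher", "teacher at a secondary school",
    "software developer", "developer of web software",
    "truck driver", "bus driver",
]
train_codes = [
    "nursing", "nursing", "teaching", "teaching",
    "software", "software", "driving", "driving",
]

suggester = CandidateSuggester().fit(train_answers, train_codes)
# The short list that would be shown to the respondent as a closed question.
shortlist = suggester.suggest("I work as a nurse", k=3)
```

In a real survey the training data would be previously coded verbatim answers, and the short list would be presented together with job descriptions so that the respondent can pick the best match.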
The new instrument has been tested in different population surveys, and interviewers and respondents were found to feel comfortable using it. We argue that data quality improves when respondents can self-select the most appropriate occupational category. However, a detailed analysis of data quality turned out to be complex and is left for future research.

Fact sheet

2014 to 2021
Data Sources: 
ALWA and NEPS survey data, additional sources
Geographic Space: 



Foster, Ian, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter and Julia Lane (Eds.) (2017): Big Data and Social Science: A Practical Guide to Methods and Tools. London: Chapman & Hall / CRC Press. [Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences]