New Methods for Job and Occupation Classification

Research question/goal: 

Most surveys currently ask about occupation with open-ended questions. The verbatim responses are coded afterwards into a classification with hundreds of categories and thousands of jobs, a task that is error-prone, time-consuming, and costly. When textual answers contain little detail, exact coding may be impossible. The project investigates how to improve this process by asking response-dependent follow-up questions during the interview. A machine learning algorithm predicts candidate job categories from the respondent's answer, and the most relevant categories are presented to the interviewer. Using this job list, the interviewer can ask for more detailed information about the job. The proposed method is tested in a telephone survey conducted by the Institute for Employment Research (IAB). Administrative data are used to compare the quality of traditional coding with that of coding during the interview. This project is carried out in cooperation with Arne Bethmann (IAB, University of Mannheim), Manfred Antoni (IAB), Markus Zielonka (LIfBi), Daniel Bela (LIfBi), and Knut Wenzig (DIW).
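
The prediction step described above can be illustrated with a minimal Python sketch. It is not the algorithm used in the project, which is not specified here. As a stand-in, it assumes a small multinomial naive Bayes model over word tokens. The training answers, the category labels, and the function names (`train`, `suggest`) are all hypothetical and serve only to show how a verbatim answer could be turned into a ranked list of candidate categories for the interviewer.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (verbatim answer, category label).
# The labels are illustrative placeholders, not codes from a real classification.
TRAINING = [
    ("nurse in a hospital", "nursing"),
    ("geriatric nurse", "nursing"),
    ("primary school teacher", "teaching"),
    ("teacher for mathematics", "teaching"),
    ("driving instructor", "teaching"),
    ("truck driver", "driving"),
    ("bus driver", "driving"),
]


def tokenize(text):
    """Lowercase the answer and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count tokens per category and answers per category."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, class_counts, vocab


def suggest(text, model, k=3):
    """Return the k most probable categories as (label, probability) pairs."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        log_p = math.log(n / total)  # category prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                # Laplace-smoothed token likelihood
                log_p += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = log_p
    # Normalize the log scores into probabilities that sum to one.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    ranked = sorted(
        ((label, w / norm) for label, w in weights.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


MODEL = train(TRAINING)

if __name__ == "__main__":
    # The ranked list is what the interviewer would see as candidate categories.
    for label, prob in suggest("driver", MODEL, k=3):
        print(f"{label}: {prob:.2f}")
```

In this toy setup, a vague answer spreads probability across several categories, which is the situation in which a follow-up question based on the suggested job list would be most useful.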

Current stage: 

By the end of the first funding period, we had designed a new instrument for occupation coding during the interview. We implemented the instrument in two surveys (a telephone survey and a face-to-face survey) and collected occupational information from more than 2,000 respondents. The results are promising: more than fifty percent of the text responses can be coded with the newly developed tool, and there is no evidence that using the tool places an additional burden on interviewers or respondents. However, many interview-coded responses do not match the codes assigned by professional coders, and possible reasons for these deviations are currently being evaluated. In fall 2020, we submitted a proposal for continued work on this issue to the German Research Foundation (DFG).
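
The two quantities mentioned above, the share of responses that can be coded during the interview and the share of those codes that match professional coding, could be computed along the following lines. This is a sketch under assumptions: the function name `coding_metrics`, the use of `None` for responses the tool could not code, and the example codes are hypothetical and do not reflect the project's data or evaluation procedure.

```python
def coding_metrics(interview_codes, professional_codes):
    """Return (coding_rate, agreement_rate) for two parallel lists of codes.

    interview_codes: codes assigned during the interview; None marks a
        response that could not be coded with the tool (an assumption
        made for this sketch).
    professional_codes: codes assigned afterwards by professional coders.
    """
    if len(interview_codes) != len(professional_codes):
        raise ValueError("both lists must have the same length")
    if not interview_codes:
        raise ValueError("at least one response is required")

    # Keep only the responses that were coded during the interview.
    coded = [
        (i, p) for i, p in zip(interview_codes, professional_codes)
        if i is not None
    ]
    coding_rate = len(coded) / len(interview_codes)

    # Agreement is defined among interview-coded responses only.
    if coded:
        agreement_rate = sum(i == p for i, p in coded) / len(coded)
    else:
        agreement_rate = float("nan")
    return coding_rate, agreement_rate


if __name__ == "__main__":
    # Hypothetical codes for four responses.
    interview = ["A", None, "B", "C"]
    professional = ["A", "B", "B", "D"]
    rate, agreement = coding_metrics(interview, professional)
    print(f"coded during interview: {rate:.0%}, agreement: {agreement:.0%}")
```

A disagreement in such a comparison does not by itself show which code is wrong, which is why an external benchmark is needed to judge relative quality.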

Fact sheet

2014 to 2021
Data Sources: 
ALWA and NEPS survey data, additional sources

Foster, Ian, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter and Julia Lane (Eds.) (2017): Big Data and Social Science: A Practical Guide to Methods and Tools. London: Chapman & Hall / CRC Press. [Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences]