New Methods for Job and Occupation Classification

Research question/goal: 

Currently, most surveys ask for occupation with open-ended questions. The verbatim responses are coded afterwards into a classification with hundreds of categories and thousands of jobs, which is an error-prone, time-consuming and costly task. When textual answers have a low level of detail, exact coding may be impossible. The project investigates how to improve this process by asking response-dependent questions during the interview. Candidate job categories are predicted with a machine learning algorithm and the most relevant categories are provided to the interviewer. Using this job list, the interviewer can ask for more detailed information about the job. The proposed method is tested in a telephone survey conducted by the Institute for Employment Research (IAB). Administrative data are used to assess the relative quality resulting from traditional coding and interview coding. This project is carried out in cooperation with Arne Bethmann (IAB, University of Mannheim), Manfred Antoni (IAB), Markus Zielonka (LIfBi), Daniel Bela (LIfBi), and Knut Wenzig (DIW).
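The suggestion step described above can be illustrated with a minimal sketch. It assumes a small multinomial naive Bayes model trained on previously coded verbatim answers. The source does not say which algorithm the project uses, so this is a stand-in, not the project's method. The class name `CandidateSuggester`, the training answers, and the category labels are all hypothetical and do not come from any official classification.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a verbatim answer and split it on whitespace."""
    return text.lower().split()


class CandidateSuggester:
    """Hypothetical helper: ranks job categories for a verbatim answer
    with multinomial naive Bayes and add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.token_counts = defaultdict(Counter)  # category -> token frequencies
        self.category_counts = Counter()          # category -> number of answers
        self.vocabulary = set()

    def fit(self, answers, codes):
        """Learn token frequencies from answers that were already coded."""
        for answer, code in zip(answers, codes):
            tokens = tokenize(answer)
            self.category_counts[code] += 1
            self.token_counts[code].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def suggest(self, answer, k=3):
        """Return the k categories with the highest log-posterior score."""
        tokens = tokenize(answer)
        n_answers = sum(self.category_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for code, n_code in self.category_counts.items():
            n_tokens = sum(self.token_counts[code].values())
            score = math.log(n_code / n_answers)  # log prior
            for token in tokens:
                count = self.token_counts[code][token]
                score += math.log(
                    (count + self.alpha) / (n_tokens + self.alpha * vocab_size)
                )
            scores[code] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Illustrative training data (invented for this sketch).
answers = [
    "nurse in a hospital",
    "hospital nurse night shift",
    "primary school teacher",
    "teacher at a high school",
    "truck driver",
    "bus driver for the city",
]
codes = ["nursing", "nursing", "teaching", "teaching", "driving", "driving"]

suggester = CandidateSuggester().fit(answers, codes)
candidates = suggester.suggest("nurse", k=2)  # shortlist shown to the interviewer
```

In an interview setting, the returned shortlist would be displayed to the interviewer, who could then ask the respondent which of the candidate categories describes the job best.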

Current stage: 

The promising results of a pilot study on occupation coding during the interview were accepted for publication in an international scientific journal. However, the pilot also uncovered shortcomings that required a complete revision of the instrument. For example, we carefully reworded the answer options (common job titles and job descriptions) for more than a thousand occupations. In addition, a new algorithm to generate better answer options is under development. A retest of the instrument that implements these improvements is planned for 2018.

Fact sheet

2014 to 2019
Data Sources: 
ALWA and NEPS survey data, additional sources
Geographic Space: 



Foster, Ian, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter and Julia Lane (Eds.) (2017): Big Data and Social Science: A Practical Guide to Methods and Tools. London: Chapman & Hall / CRC Press. [Chapman & Hall/CRC Statistics in the Social and Behavioral Sciences]