Statistical Modeling Using Mouse Movements to Model Measurement Error and Improve Data Quality in Web Surveys

Research question/goal: 

Online surveys have become prominent across many disciplines, in both the academic and the private sector and increasingly in official statistics. Despite the best efforts of questionnaire designers, respondents regularly answer questions incorrectly, often because they do not understand what the question is asking. In offline surveys, interviewers can help respondents through difficulties. In web surveys this is no longer possible. However, respondents leave clues about their difficulties and about breakdowns in the measurement process in their keystrokes, response times and mouse movements. These paradata can be used to check and improve data quality. The conventional approach in web surveys is to use response latency, where very low and very high response times are taken as indicators of poor data quality. The only web survey work involving mouse movements has focused on the overall distance traveled by the mouse to identify questions and respondents with low data quality. However, mouse movements contain much more information than is captured by either bare response times or simple summary statistics such as total distance and other predefined patterns. Despite the fast-growing use of web surveys in commercial as well as official statistics, no large-scale research has so far investigated the value of mouse movements in web surveys.

To fill this research gap, the proposed project will develop statistical methods to automatically analyze mouse movements in web surveys. In particular, we want to exploit the information contained in the mouse movements and use it to better understand measurement error and question difficulty. In the future, this work can serve as a basis for detecting respondent difficulty and adaptively offering help in a responsive questionnaire design, and for adjusting for measurement error in subsequent analyses of the web survey answers.
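
For illustration, the two conventional paradata summaries mentioned above (response latency and total distance traveled by the mouse) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the project's method: the trajectory format (a list of `(t_ms, x, y)` tuples), the function names and the latency thresholds are all hypothetical and chosen purely for the example.

```python
import math

# Assumed (hypothetical) input format: one trajectory per question, given as a
# list of (t_ms, x, y) tuples ordered by time, with t_ms in milliseconds.

def response_time_seconds(trajectory):
    """Elapsed time between the first and last recorded cursor position."""
    if len(trajectory) < 2:
        return 0.0
    return (trajectory[-1][0] - trajectory[0][0]) / 1000.0

def total_distance(trajectory):
    """Sum of Euclidean distances between consecutive cursor positions."""
    points = [(x, y) for _, x, y in trajectory]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def flag_latency(seconds, low=2.0, high=60.0):
    """Label a response time as suspicious if it falls outside [low, high].

    The thresholds are illustrative assumptions, not empirically derived.
    """
    if seconds < low:
        return "too_fast"
    if seconds > high:
        return "too_slow"
    return "ok"

# Example with a made-up trajectory of three cursor samples.
example = [(0, 0, 0), (500, 3, 4), (1500, 3, 10)]
rt = response_time_seconds(example)
dist = total_distance(example)
label = flag_latency(rt)
```

Both summaries reduce a whole trajectory to a single number, which is exactly the limitation the project aims to overcome by modeling the movements themselves.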

Current stage: 

The data collection for this project has been completed. Together with our project partners at HU Berlin, we have developed novel analysis techniques for a large set of mouse-tracking data collected during survey interviews, as well as software to collect mouse-tracking data and implement the analysis methods. Results are currently being presented at international conferences; papers are being prepared. A detailed analysis of attitudes towards privacy regarding the use of mouse movement data is close to completion.

Fact sheet

Funding: 
DFG
Duration: 
2017 to 2021
Status: 
ongoing
Data Sources: 
IAB survey

Publications