Sebastian Bähr, Georg-Christoph Haas, Florian Keusch, Frauke Kreuter, Mark Trappmann
Missing Data and Other Measurement Quality Issues in Mobile Geolocation Sensor Data

Social Science Computer Review, 2022: 40, Issue 1, pp. 212–235
ISSN: 0894-4393 (print), 1552-8286 (online)

As smartphones become increasingly prevalent, social scientists are recognizing the ubiquitous data generated by the sensors built into these devices as an innovative data source. Passively collected data from sensors that measure geolocation or movement provide an unobtrusive way to observe participants in everyday situations and are free from reactivity biases. Information on day-to-day geolocation could provide valuable insights into human behavior that cannot be collected via surveys. However, little is known about the quality of the resulting data. Using data from a 2018 German population-based probability app study, this article examines the measurement quality of geolocation sensor data, with a strong focus on missing measurements. Geolocation sensor data are one example of an available data type that is of interest to social science research, and our findings can be applied to the wider subject of sensor data. In our article, we demonstrate (1) that sensor data are far from error-free. Instead, error sources such as the manufacturer and operating system settings of the device, design decisions of the research app, third-party apps, and the participant can interfere with the measurement. To disentangle the different influences, we (2) apply a multistage error model to analyze and control the error sources in the specific missingness process of geolocation data. We (3) raise awareness of error sources in geolocation measurement, such as the use of GPS falsifier apps or device sharing among participants. By identifying the different error sources and analyzing their determinants, we recommend (4) identification strategies for future research.