Life at Scale: What Population-Wide Data and Our Future Health UK Reveal About Families, Health, and Bias
This talk is organised in cooperation with the Academy of Sociology.
Abstract
The scale and coverage of whole-population data linkage, together with emerging next-generation large biobanks and data collections, offer unique opportunities to advance multiple domains of social science and medical research. This talk first presents applied examples from family research, in which we linked whole-population hospital and birth administrative records to uncover rare and new predictors of childlessness, along with findings from family network and genetic linkage and from sibling and multigenerational family data. I then discuss our analysis of the latest December 2025 release of the ~2.5 million-person Our Future Health UK data, a resource that aims to recruit up to 5 million participants. This new data resource includes survey and biological data (genotype, proteomic, metabolic) linked to multiple sources, including self-reported diagnoses and medication intake, in- and out-patient visits, cancer registry and cause-of-death records. We show how different data sources measure outcomes differently and assess participation bias by comparing sociodemographic, lifestyle and health characteristics against the UK Census, UK Biobank and related surveys. Some sociodemographic groups are underrepresented and the prevalence of some self-reported conditions is higher. Associations with known clinical correlates replicate across cohorts, and medication-use patterns and cancer prevalence follow expected age-related gradients, with several notable differences (breast, prostate and lung cancer). The talk concludes by considering the influence of the timing of post-pandemic data collection and discusses the implications of participation bias, focussing on the challenges of introducing sampling weights, collider bias and the generalisability of results.