Algorithmic profiling is increasingly used in the public sector with the hope of allocating limited public resources more effectively and objectively. One example is the prediction-based profiling of job seekers to guide the allocation of support measures by public employment services. However, empirical evaluations of potential side effects, such as unintended discrimination and fairness concerns, are rare in this context. We systematically compare and evaluate statistical models for predicting job seekers’ risk of becoming long-term unemployed with respect to subgroup prediction performance, fairness metrics, and vulnerability to data analysis decisions. Focusing on Germany as a use case, we evaluate profiling models under realistic conditions using large-scale administrative data. We show that, despite achieving high prediction performance on average, profiling models can be considerably less accurate for vulnerable social subgroups. In this setting, different classification policies can have very different fairness implications. We therefore call for rigorous auditing processes before such models are put into practice.