ABSTRACT
Tools such as Snipper and the Admixture Model are regarded as state-of-the-art methods for inferring biogeographical ancestry in forensic science. However, they have not been systematically compared with classifiers widely used in other disciplines. Because genetic data are tabular, this study addresses that gap by benchmarking forensic classifiers against TabPFN, a cutting-edge, general-purpose machine learning classifier for tabular data. The comparison evaluates performance using metrics such as accuracy (the proportion of correct classifications) and ROC AUC. We examine the classification of individuals at both the continental and intracontinental levels, using a published dataset for training and testing. Our results reveal significant performance differences between methods, with TabPFN consistently achieving the highest accuracy and ROC AUC across the datasets. Based on these findings, we recommend adopting TabPFN for population classification in forensic science.
Competing Interest Statement
Frank Hutter co-founded PriorLabs, the tabular foundation model company that open-sourced TabPFN and is working on improved models. The authors declare no other conflicts of interest.