Abstract
The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported, and current inference methods typically detect only the degree of relatedness of sample pairs, not their pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identical-by-descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences in sex-specific genetic maps to classify pairs as maternally or paternally related (e.g., paternal half-siblings) using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5-99.5% of grandparent-grandchild (GP) pairs, 70.5-97.0% of avuncular (AV) pairs, and 79.0-98.0% of half-sibling (HS) pairs, compared to PADRE’s rates of 38.5-76.0% of GP, 60.5-92.0% of AV, and 73.0-95.0% of HS pairs. Turning to the real 20,032-sample Generation Scotland (GS) dataset, CREST correctly determines the relationship of 99.0% of GP, 85.7% of AV, and 95.0% of HS pairs that have sufficient mutual relative data, completing this analysis in 10.1 CPU hours including IBD detection. CREST’s maternal and paternal relationship inference is also accurate: it flagged five pairs as incorrectly labeled in the GS pedigrees (three of which we confirmed as mistakes and two of which have an uncertain relationship), yielding 99.7% of HS and 93.5% of GP pairs correctly classified.