RT Journal Article
SR Electronic
T1 Re-identification of genomic data using long range familial searches
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 350231
DO 10.1101/350231
A1 Yaniv Erlich
A1 Tal Shor
A1 Shai Carmi
A1 Itsik Pe’er
YR 2018
UL http://biorxiv.org/content/early/2018/06/19/350231.abstract
AB Consumer genomics databases have reached the scale of millions of individuals. Recently, law enforcement investigators have started to exploit some of these databases to find distant familial relatives, which can lead to complete re-identification. Here, we leveraged genomic data from 600,000 individuals tested with consumer genomics to investigate the power of such long-range familial searches. We project that half of the searches with European-descent individuals will result in a third cousin or closer match and will provide a search space small enough to permit re-identification using common demographic identifiers. Moreover, in the near future, virtually any European-descent US person could be implicated by this technique. We propose a potential mitigation strategy based on cryptographic signatures that can resolve the issue, and we discuss policy implications for human subjects research.