TY  - JOUR
T1  - Selective constraints and pathogenicity of mitochondrial DNA variants inferred from a novel database of 196,554 unrelated individuals
JF  - bioRxiv
DO  - 10.1101/798264
SP  - 798264
AU  - Alexandre Bolze
AU  - Fernando Mendez
AU  - Simon White
AU  - Francisco Tanudjaja
AU  - Magnus Isaksson
AU  - Misha Rashkin
AU  - Johnathan Bowes
AU  - Elizabeth T. Cirulli
AU  - William J. Metcalf
AU  - Joseph J. Grzymski
AU  - William Lee
AU  - James T. Lu
AU  - Nicole L. Washington
Y1  - 2019/01/01
UR  - http://biorxiv.org/content/early/2019/10/08/798264.abstract
N2  - Robust characterization of mitochondrial variation provides an opportunity to map regions under high constraint and to identify essential functional domains. We sequenced the mitochondrial genomes of 196,554 unrelated individuals and identified 15,035 unique variants. We found that 47% of the mitochondrial genome was invariant across the population, and generated a map of constrained mitochondrial regions. We found that the longest intervals in the mitochondrial genome without any variant were in the two rRNA genes (26 of 40 intervals >10bp long). We also showed that the 13 protein-coding genes in the mitochondrial genome did not tolerate loss-of-function variants. The only frameshift or nonsense variant observed at homoplasmic levels was a nonsense variant at the start codon of MT-ND1, which may be rescued by the methionine at amino acid position 3. Lastly, we applied these data to variants reported to be pathogenic for Leber’s Hereditary Optic Neuropathy (LHON). We found that 42% of variants (19 of 45) reported to be pathogenic have a frequency above the maximum credible population allele frequency for an LHON-causing variant, including the primary LHON mutation m.14484T>C, which suggests that m.14484T>C cannot cause LHON by itself.
This result shows that allele frequency information from a large unselected population is important for assessing the pathogenicity of variants in rare mitochondrial disorders. We made HelixMTdb, the list of variants and their allele frequencies in 196,554 unrelated individuals, publicly available.
ER  - 