Summary
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). The resulting catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We show that this catalogue can be used to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; we identify 3,230 genes with near-complete depletion of truncating variants, 72% of which have no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.
Footnotes
↵# List of collaborators to appear at the end of manuscript
↵61 Department of Medicine, University of Texas Health Science Center, San Antonio, TX, USA
↵62 Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico
↵63 Departments of Medicine and Genetics, Albert Einstein College of Medicine, New York City, NY, USA
↵64 Department of Natural Science, University of Haifa, Haifa, Israel
↵65 Department of Clinical Science, University of Bergen, Bergen, Norway
↵66 Department of Pediatrics, Haukeland University Hospital, Bergen, Norway
↵67 Department of Biomedicine, University of Bergen, Bergen, Norway
↵68 The Toronto Western Research Institute, University Health Network, Toronto, Canada
↵69 The Hospital for Sick Children, Toronto, Canada
↵70 Departments of Medicine and Human Genetics, University of Chicago, Chicago, IL, USA
↵71 Department of Medicine, University of Chicago, Chicago, IL, USA
↵72 South Texas Diabetes and Obesity Institute, University of Texas Health Science Center, San Antonio, TX, USA
↵73 University of Texas Rio Grande Valley, Brownsville, TX, USA
↵74 Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, NC, USA
↵75 Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, NC, USA
↵76 Center for Diabetes Research, Wake Forest School of Medicine, Winston-Salem, NC, USA
↵77 North Shore-Long Island Jewish Health System, Manhasset, NY, USA
↵78 Instituto Nacional de Medicina Genómica, Mexico City, Mexico
↵79 Department of Epidemiology and Biostatistics, Imperial College London, London, UK
↵80 Department of Cardiology, Ealing Hospital NHS Trust, Southall, UK
↵81 Imperial College Healthcare NHS Trust, Imperial College London, London, UK
↵82 Nuffield Department of Population Health, University of Oxford, Oxford, UK
↵83 Center for Neurobehavioral Genetics, University of California, Los Angeles, CA, USA
↵84 Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, USA
↵85 Department of Genetics, Texas Biomedical Research Institute, San Antonio, TX, USA
↵86 University of Exeter Medical School, University of Exeter, Exeter, UK
↵87 Instituto Nacional de Salud Pública, Mexico City, Mexico
↵88 Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD, USA
↵89 Department of Clinical Sciences, Lund University Diabetes Centre, Malmö, Sweden
↵90 Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
↵91 Human Genetics Center, The University of Texas Health Science Center, Houston, TX, USA
↵92 Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
↵93 Instituto de Investigaciones Biomédicas, Mexico City, Mexico
↵94 Instituto Mexicano del Seguro Social, Mexico City, Mexico
↵95 Department of Genetics, Yale University School of Medicine, New Haven, CT, USA
↵96 MRC Centre for Neuropsychiatric Genetics and Genomics, Cardiff University, Cardiff, UK
↵97 Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do, Republic of Korea
↵98 Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY, USA
↵99 Department of Psychiatry, University of Utah, Salt Lake City, UT, USA
↵100 Nuffield Department of Medicine, University of Oxford, Oxford, UK
↵101 Department of Psychiatry, University of Florida, Gainesville, FL, USA
↵102 General Medicine Division, Massachusetts General Hospital, Boston, MA, USA
↵103 Institute of Human Genetics, Technische Universität München, Munich, Germany
↵104 Institute of Human Genetics, German Research Center for Environmental Health, Neuherberg, Germany
↵105 Diabetes Research Center (Diabetes Unit), Massachusetts General Hospital, Boston, MA, USA
↵106 Research Program in Computational Biology, Barcelona Supercomputing Center, Barcelona, Spain
↵107 Universidad Autónoma Metropolitana, Mexico City, Mexico
↵108 Estonian Genome Centre, University of Tartu, Tartu, Estonia
↵109 Department of Biostatistics, University of Liverpool, Liverpool, UK
↵110 Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
↵111 Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
↵112 Department of Statistics, Seoul National University, Seoul, Republic of Korea
↵113 Department of Functional Genomics, University of Amsterdam, Amsterdam, The Netherlands
↵114 Department of Clinical Genetics, VU Medical Centre, Amsterdam, The Netherlands
↵115 Department of Child and Adolescent Psychiatry, Erasmus University Medical Centre, Rotterdam, The Netherlands
↵116 Department of Psychiatry, University of Toronto, Toronto, Canada
↵117 Department of Laboratory Medicine, University of California, San Francisco, CA, USA
↵118 Blood Systems Research Institute, San Francisco, CA, USA
↵119 Department of Human Genetics, McGill University, Montreal, Canada
↵120 Department of Medicine, McGill University, Montreal, Canada
↵121 McGill University and Génome Québec Innovation Centre, Montreal, Canada
↵122 Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
↵123 Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore
↵124 Department of Medicine, National University of Singapore, Singapore, Singapore
↵125 Cardiovascular & Metabolic Disorders Program, Duke-NUS Graduate Medical School Singapore, Singapore, Singapore
↵126 Department of Biological Sciences, Columbia University, New York, NY, USA