PT - JOURNAL ARTICLE
AU - ,
AU - Lek, Monkol
AU - Karczewski, Konrad J
AU - Minikel, Eric V
AU - Samocha, Kaitlin E
AU - Banks, Eric
AU - Fennell, Timothy
AU - O’Donnell-Luria, Anne H
AU - Ware, James S
AU - Hill, Andrew J
AU - Cummings, Beryl B
AU - Tukiainen, Taru
AU - Birnbaum, Daniel P
AU - Kosmicki, Jack A
AU - Duncan, Laramie
AU - Estrada, Karol
AU - Zhao, Fengmei
AU - Zou, James
AU - Pierce-Hoffman, Emma
AU - Cooper, David N
AU - DePristo, Mark
AU - Do, Ron
AU - Flannick, Jason
AU - Fromer, Menachem
AU - Gauthier, Laura
AU - Goldstein, Jackie
AU - Gupta, Namrata
AU - Howrigan, Daniel
AU - Kiezun, Adam
AU - Kurki, Mitja I
AU - Moonshine, Ami Levy
AU - Natarajan, Pradeep
AU - Orozco, Lorena
AU - Peloso, Gina M
AU - Poplin, Ryan
AU - Rivas, Manuel A
AU - Ruano-Rubio, Valentin
AU - Ruderfer, Douglas M
AU - Shakir, Khalid
AU - Stenson, Peter D
AU - Stevens, Christine
AU - Thomas, Brett P
AU - Tiao, Grace
AU - Tusie-Luna, Maria T
AU - Weisburd, Ben
AU - Won, Hong-Hee
AU - Yu, Dongmei
AU - Altshuler, David M
AU - Ardissino, Diego
AU - Boehnke, Michael
AU - Danesh, John
AU - Elosua, Roberto
AU - Florez, Jose C
AU - Gabriel, Stacey B
AU - Getz, Gad
AU - Hultman, Christina M
AU - Kathiresan, Sekar
AU - Laakso, Markku
AU - McCarroll, Steven
AU - McCarthy, Mark I
AU - McGovern, Dermot
AU - McPherson, Ruth
AU - Neale, Benjamin M
AU - Palotie, Aarno
AU - Purcell, Shaun M
AU - Saleheen, Danish
AU - Scharf, Jeremiah
AU - Sklar, Pamela
AU - Sullivan, Patrick F
AU - Tuomilehto, Jaakko
AU - Watkins, Hugh C
AU - Wilson, James G
AU - Daly, Mark J
AU - MacArthur, Daniel G
TI - Analysis of protein-coding genetic variation in 60,706 humans
AID - 10.1101/030338
DP - 2015 Jan 01
TA - bioRxiv
PG - 030338
4099 - http://biorxiv.org/content/early/2015/10/30/030338.short
4100 - http://biorxiv.org/content/early/2015/10/30/030338.full
AB - Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes.
Here we describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities. The resulting catalogue of human genetic diversity has unprecedented resolution, with an average of one variant every eight bases of coding sequence and the presence of widespread mutational recurrence. The deep catalogue of variation provided by the Exome Aggregation Consortium (ExAC) can be used to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; we identify 3,230 genes with near-complete depletion of truncating variants, 79% of which have no currently established human disease phenotype. Finally, we show that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.