TY  - JOUR
T1  - The human functional genome defined by genetic diversity
JF  - bioRxiv
DO  - 10.1101/082362
SP  - 082362
AU  - Julia di Iulio
AU  - Istvan Bartha
AU  - Emily H.M. Wong
AU  - Hung-Chun Yu
AU  - Michael Hicks
AU  - Naisha Shah
AU  - Victor Lavrenko
AU  - Ewen F. Kirkness
AU  - Martin M. Fabani
AU  - Dongchan Yang
AU  - Inkyung Jung
AU  - William H. Biggs
AU  - Bing Ren
AU  - J. Craig Venter
AU  - Amalio Telenti
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/10/21/082362.abstract
N2  - Large scale efforts to sequence whole human genomes provide extensive data on the non-coding portion of the genome. We used variation information from 11,257 human genomes to describe the spectrum of sequence conservation in the population. We established the genome-wide variability for each nucleotide in the context of the surrounding sequence in order to identify departure from expectation at the population level (context-dependent conservation). We characterized the population diversity for functional elements in the genome and identified the coordination of conserved sequences of distal and cis enhancers, chromatin marks, promoters, coding and intronic regions. The most context-dependent conserved regions of the genome are associated with unique functional annotations and a genomic organization that spreads up to one megabase. Importantly, these regions are enriched by over 100-fold of non-coding pathogenic variants. This analysis of human genetic diversity thus provides a detailed view of sequence conservation, functional constraint and genomic organization of the human genome. Specifically, it identifies highly conserved non-coding sequences that are not captured by analysis of interspecies conservation and are greatly enriched in disease variants.
ER  - 