A genome-wide mutational constraint map quantified from variation in 76,156 human genomes

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders, but attempts to assess constraint for non-protein-coding regions have proven more difficult. Here we aggregate, process, and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome reference dataset, and use this dataset to build a mutational constraint map for the whole genome. We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation across the genome. As expected, proteincoding sequences overall are under stronger constraint than non-coding regions. Within the noncoding genome, constrained regions are enriched for regulatory elements and variants implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association, and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained genes, while non-coding constraint captures additional functional information underrecognized by gene constraint metrics. We demonstrate that this genome-wide constraint map provides an effective approach for characterizing the non-coding genome and improving the identification and interpretation of functional human genetic variation.


Introduction
The expansion in the scale of human whole-genome or exome reference data has characterized the patterns of variation in human genes. With these data it is possible to directly assess the strength of negative selection on loss-of-function (LoF) and missense variation by modeling "constraint," the reduction of variants in a gene compared to an expectation conditioned on that gene's mutability. Using coding variant data from sequencing of more than 125K humans 1 , we have previously developed a constraint metric that classifies each protein-coding gene along a spectrum of LoF intolerance 1 , which provides a valuable resource for studying the functional significance of human genes [2][3][4][5] . Although of outsized biological importance, protein-coding regions comprise less than 2% of the human genome, and the vast non-coding genome has been much less characterized, even though the importance of noncoding variation in human complex diseases has been long recognized [6][7][8][9][10] .
Several challenges arise when extending the gene constraint model to the non-coding space. First, the sample size of human whole-genome reference data has been relatively small compared to the exome, limiting the power of detecting depletions of variation at a fine scale. Second, our detailed understanding of coding region exon structure and effect of specific variants on amino acid translation enables a precision not available in non-coding analysis. Third, strong expectation from Mendelian genetics and existing constraint analyses that the coding regions, while a small fraction of the genome, harbor a massively disproportionate amount of rare and common disease mutations under selection. Fourth, the mutation rate in non-coding regions is highly heterogeneous, which can be affected not only by local sequence context as commonly modeled in gene constraint metrics but also by a variety of genomic features at larger scales 11,12 .
Current methods attempting to evaluate non-coding constraint can be broadly divided into three categories: 1) context-dependent mutational models that assess the deviation of observed variation from an expectation based on the sequence composition of k-mers (e.g., Orion 13 , CDTS 14 ); 2) machine-learning classifiers that are trained to differentiate between disease-associated variants and benign variants (e.g., CADD 15 , GWAVA 16 , gwRVIS 17 ); 3) evolutionary methods that employ phylogenetic conservation scores to predict the consequence of non-coding variants on fitness (e.g., LINSIGHT 18 ). While all these methods aid in our understanding of the non-coding genome, each suffer from limitations/biases, respectively as 1) overlooking the influence of regional genomic features beyond the scale of flanking nucleotides on mutation rate; 2) strong dependence on the availability of well-characterized functional mutations as training data; 3) compromised ability to detect regions that have only been under selection recently in the human lineage, which may have a functional impact on human phenotypic traits or diseases.
Here we present a genome-wide map of human constraint, generated from a high-quality set of variant calls from 76,156 whole-genome sequences (gnomAD v3.1.2 https://gnomad.broadinstitute.org). We describe an improved model of human mutation rates that jointly analyzes local sequence context and regional genomic features, and quantify depletion of variation in a tiled window across the entire genome. By building a fuller picture of genic constraint than solely focusing on coding variation, we demonstrate that it facilitates the functional interpretation of non-coding regions and improves the characterization of gene function in the context of the regulatory network. . Our study aims to depict a genome-wide view We quantified the deviation from expectation for each 1kb window using a Z score 22 (Methods; Extended  Fig. 1d,e), which was centered slightly below zero for non-coding regions (median=-0.50), and was significantly higher (more constrained) for windows containing any protein-coding sequences (median=0.17, Wilcoxon P<10 -200 ; Fig. 1a). The constraint Z score positively correlated with the percentage of coding bases in a window and presented a substantial shift towards higher constraint for exonic sequences calculated from directly concatenating coding exons into 1kb windows (median=1.48; Extended Fig. 2a-c). About 13.5% and 0.5% of the non-coding windows exhibited constraint as strong as the 50 th and 90 th percentile of exonic regions (Extended Fig. 2d). Comparing our Z score against the adjusted proportion of singletons (APS) score, a measure of constraint developed for structural variation (SV) 23 , we found a strong correlation (linear regression P=1.54×10 -40 , Fig. 1b; Methods), providing an internal validation of our approach.

Investigating genomic properties of non-coding regions under constraint
To further validate our constraint metric and investigate the functional relevance of non-coding regions under selection, we examined the correlation between our Z score and several annotations of putatively functional non-coding sequences (Fig. 2a). First, we found that candidate cis-regulatory elements (cCREs, derived from ENCODE 24 integrated DNase-and ChIP-seq data) are significantly enriched in the most constrained percentile of the genome (Z≥4, OR=1.28 compared to the genome-wide average, Fisher's exact P=2.72×10 -44 ), indicating that a large fraction of the constrained non-coding regions may serve a regulatory role. Similarly, a significant enrichment was found for an independent set of active, in vivo-transcribed enhancers (identified by FANTOM CAGE analyses 25 ; OR=1.36, P=1.03×10 -12 ). Moreover, we found that super enhancers 26 -groups of enhancers in close genomic proximity -exhibited a stronger enrichment (OR=1.44, P=4.98×10 -110 ), which is in line with their prominent roles in regulating genes important for cell type specification 27 . We also detected a modest but significant enrichment for long noncoding RNAs (lncRNAs, implicated by the FANTOM CAT catalog 28 ; OR=1.21, P=2.71×10 -25 ), lending support to their potential functionality in the pervasive transcription in the human genome 29 .
A pronounced level of constraint was observed for the 9p21 locus (OR=7.10, Fisher's exact P=2.90×10 -4 ; Fig. 2b), a widely recognized and replicated risk factor for coronary artery disease (CAD) and type 2 diabetes (T2D) 30 . Although devoid of genes, the locus tended to be as constrained as regions encompassing coding sequences (median Z=0. 38 Fig. 2c). A well-replicated CAD marker rs10738607 32-34 , for example, was found to reside in a region under strong selection (Z=4.48, chr9:22088000-22089000) ~80kb upstream of the cell cycle suppressor gene CDKN2B, where three consecutive enhancer signatures occupied >80% of its sequence 24 . The risk allele of rs10738607 was predicted to disrupt a binding site for STAT5A 30 , a known transcriptional activator of CDKN2B that negatively regulates cell proliferation 35,36 ; reduced expression of CDKN2B and enhanced proliferation have been previously-recognized phenotypes linked to the 9p21 CAD risk locus [37][38][39][40][41] . Therefore, our constraint analysis adds weight to the hypothesis that loss of STAT5A-CDKN2B antiproliferative effect may present one mechanism that explains the genotype-phenotype association for 9p21 risk alleles.

Prioritizing non-coding variants through the constraint map
The genome-wide constraint map allows us to systematically evaluate each genetic variant in the genome, particularly expanding our ability to study variants in the vast and under-characterized non-coding regions. Examining the distribution of non-coding variants identified by GWAS on the constraint spectrum, we found a significant enrichment for non-coding GWAS hits in the constrained end of the genome (621/14,808 constrained windows [Z≥4] overlapped with GWAS Catalog 31 annotations, OR=1.47 compared to the genome-wide average of 48,045/1,665,599, Fisher's exact P=3.76×10 -19 , Fig. 3a; Methods). The enrichment appeared to become increasingly stronger for hits that had a replication experiment (internally replicated within the same study: OR=1.54, P=1.39×10 -9 ; externally replicated by a different study: OR=1.76, P=1.51×10 -6 ). Furthermore, substantial selection signals were found for likely causal GWAS variants fine-mapped from 94 complex diseases and traits in UK Biobank (UKB) 42,43 ( Fig. 3b; Methods). Leading enrichments were found for pathological conditions including CAD, inguinal hernia, insomnia, fibroblastic disorders, and hypothyroidism (OR>2.5), as well as quantitative molecular traits such as hemoglobin, uric acid, and bilirubin levels. These results revealed a high positive correlation between the level of constraint and the occurrence of candidate functional variants in the non-coding genome. Thus, our constraint metric could effectively serve to prioritize non-coding variants discovered in large-scale human disease or trait association studies.
For example, in the constraint analysis of CAD GWAS, we found seven variants from a 95% credible set located in highly constrained non-coding regions related to the gene PLG -four were mapped to its antisense transcript ENST00000659713 (rs144679016, rs116480834, rs10945688, rs148517610), one resided in its 9 th intron (rs143556831), and two colocalized with its enhancer marks ~55-57kb downstream of the gene (rs12529023, rs139095347). PLG encodes the plasminogen protein that circulates in blood plasma and is converted to the active protease, plasmin, which dissolves the fibrin of blood clots. Dysregulation of the PLG-plasmin system has been frequently implicated in the pathogenesis of CAD [44][45][46][47][48][49] . Therefore, together with previous evidence on PLG, our findings suggest a plausible role of the prioritized non-coding variants in increasing CAD risk, likely acting through changes in the constrained regulatory elements of PLG.
Exploring non-coding dosage sensitivity in the constrained genome Copy number variants (CNVs) or dosage alterations (deletions/loss or duplications/gain) of DNA are wellestablished risk factors for human developmental disorders [50][51][52][53][54][55] . CNVs can exert pathogenic effects in more than one mechanism 56,57 , such as direct gene dosage effect, positional effect, and transvection effect. Still, the most common is alteration of dosage-sensitive genes within the loci, whereas the functional relevance of the large proportion of non-coding sequences being affected are much less understood. With our genome-wide constraint map, we explored the possibility that constrained noncoding regions are also sensitive to a dosage effect, which may underlie the pathogenicity of corresponding CNVs.
We surveyed a collection of ~90K CNVs from a genome-wide CNV morbidity map of developmental delay and congenital birth defects 58,59 . We observed a substantial excess of CNVs that affect constrained noncoding regions (Z≥4) among individuals with developmental disorders (DD cases) in comparison to healthy controls (42.2% versus 10.9%, OR=6.0, Fisher's exact P<10 -200 , Fig. 4a; Methods). More strikingly, of the 18 loci that had been previously classified as pathogenic 58 , 16 (88.9%) coincided with constrained noncoding regions. The high incidence was recapitulated in a curated set of ~4K putatively pathogenic CNVs (ClinVar 60 ), of which 82.8% exhibited high non-coding constraint (Fig. 4a). Importantly, the case-control enrichment remained significant, albeit attenuated, after adjusting for the size and gene content of each CNV and when being tested in the subtype of CNV deletions and duplications ( Fig. 4b; Methods). Noncoding constraint manifested a comparable level of association as gene constraint when modeled simultaneously to predict DD CNVs (log[OR]=1.13 and 1.48, Logit P<2×10 -16 for both), suggesting that dosage imbalance of constrained non-coding regions could confer risk for developmental disorders in addition to gene dosage sensitivity.
One classic example of non-coding dosage imbalance is a set of duplications involving the regulatory domain of IHH associated with synpolydactyly and craniosynostosis [61][62][63] . The four implicated duplications covered a stretch of ~102kb sequence upstream of IHH, with a smallest ~10kb critical overlapping region (Fig. 4c). The critical region contained no exons but showed high non-coding constraint (median Z=1.22, Wilcoxon P=0.011 compared to the rest of the genome). Interestingly, the highest constraint signal colocalized with the major enhancer of IHH, the duplication of which has been shown to result in dosagedependent IHH misexpression and consequently syndactyly and malformation of the skull 63 . Collectively, we suggest that non-coding constraint can be a useful indicator of dosage-sensitive regulatory CNVs and improves understanding of potential non-coding mechanisms underlying CNV disease associations.

Leveraging non-coding constraint to improve gene function characterization
Given the significant roles of constrained non-coding regions in gene regulation, it is natural to expect that more constrained regulatory elements would regulate more constrained genes. To test this, we surveyed enhancer-gene links from the Roadmap Epigenomics Enhancer-Gene Linking database 64 (Methods), where we found overall, more constrained non-coding regions were more frequently linked to a gene (Fig.  5a), and more constrained enhancers tend to be associated with more constrained genes (e.g., haploinsufficient genes; median Z=2.21 versus 1.43, Wilcoxon P=4.24×10 -18 , Fig. 5b; Methods).
On the other hand, a particular interesting set of associations are the links between constrained enhancers and the "unconstrained" genes classified by gene constraint metrics, because these links may reflect functional significance of the "unconstrained" genes that had been previously underrecognized. More than 40% of the least constrained genes (last decile scored by LOEUF [loss-of-function observed/expected upper bound fraction] 1 ) were linked with a constrained enhancer (last decile with constraint Z≥2.23); the lack of predicted gene constraint can be explained by the intrinsic design of LOEUF as a measure of intolerance to rare LoF variation, where small genes with few expected LoF variants are likely underpowered. Indeed, when stratifying genes by the number of expected LoF variants, we found a significantly higher enhancer constraint for genes that were underpowered (with ≤5 expected LoF variants by LOEUF) compared to those that were sufficiently powered and scored as unconstrained (median Z=2.06 versus 1.68, Wilcoxon P=1.68×10 -3 , Fig. 5b). This pattern suggests that a portion of these underpowered genes may play an important functional role but were not recognized in gene constraint evaluation. For instance, ASCL2, a basic helix-loop-helix (bHLH) transcription factor, had only 0.57 expected LoFs (versus 0 observed) across >125K exomes 1 ; although being depleted for LoF variation, the absolute difference was too small to obtain a precise estimate of LoF intolerance. Yet, we found ASCL2 had a highly constrained enhancer (Z=6.26), located ~16kb upstream of the gene, where >40% of the expected variants were depleted (200.6 expected versus 112 observed, chr11:2286000-2287000). The same genomic window also contained an eQTL chr11:2286192:G>T that was predicted to be significantly associated with ASCL2 expression 65 ; elevated ASCL2 expression has been implicated in the development and progression of several human cancers [66][67][68] . This example highlights the value of non-coding constraint -as a complementary metric to gene constraint -for identifying functionally important genes.
Gene-set enrichment analysis of the underpowered genes with constrained enhancers identified strong enrichment for genes encoding histones, secreted proteins involved in cell signaling and communication (e.g., cytokines, hormones, growth factors; Fig. 5c). These genes are generally small and are thus likely to be underrecognized by LoF-intolerance-based metrics. Applying an alternative, phylogeny-based conservation score 69 to evaluate these underpowered genes, we found a significantly higher evolutionary constraint on these genes compared to those with less constrained enhancers (Wilcoxon P=4.28×10 -12 ; Fig.  5d), providing additional evidence for their functional significance. Moreover, by dissecting enhancers in a tissue-specific manner, we examined how tissue-specific enhancer constraint correlates with tissuespecific gene regulation. We found that even conditioning on gene constraint, enhancer constraint remained a significant predictor for the expression level of target genes in matched tissue types ( Fig. 5e; Methods). These results further support the application of our constraint metric to improve the characterization of human gene function.
A practical implementation of this notion would be refining gene constraint models to incorporate contributions from constrained regulatory elements. This essentially borrows power from extending the functional unit of a gene in the context of the gene regulatory network. Such an approach would allow constraint modeling for specific tissue/cell types and conditions given the diverse range of epigenomics data. Taking enhancer-gene links from brain tissue as a proof-of-principle, a simple logistic regression test conditioning on gene constraint indicated significant contribution from enhancer constraint for predicting functionally relevant genes (e.g., targets of Fragile X Mental Retardation Protein (FMRP) 70 , Logit P=2.84×10 -8 ; Methods). While we acknowledge that the biological consequences of mutations in enhancers are not clearly understood and thus natural selection may differ in its interest depending on mechanistic consequence, an extended model to incorporate non-coding variation information in a biologicallyinformed way hold the promise to provide a finer characterization of gene function and facilitate a better understanding of the molecular mechanisms underlying selection.

Discussion
We have previously developed constraint metrics that leverage population-scale exome and genome sequencing data to evaluate genic intolerance to LoF variation for each protein-coding gene 1,19 . Here, we adopted the same principle with an extended mutational model to assess constraint across the entire genome, using our latest release of gnomAD (v3.1.2), a dataset of harmonized high-quality whole-genome sequences from 76,156 individuals of diverse ancestries. Improvements to constraint modeling include unified fitting of mutation rate for all substitution and trinucleotide context and inclusion of regional genomic features to refine the expected variation in non-coding regions (Methods). We validated our metric using a series of external functional annotations, with a focus on the non-coding genome, and demonstrated the value of our metric for prioritizing non-coding elements and identifying functionally important genes. We have made the constraint scores publicly accessible via the gnomAD browser (https://gnomad.broadinstitute.org).
One key challenge in quantifying non-coding constraint is to distinguish between selection and neutral variation. To this end, we intended not to include features that are directly reflective of regulatory functions in our model, such as histone modifications as commonly used in classifiers predicting functional non-coding variants. Instead, we employed such features as part of our validation, where we showed that our constraint Z score is positively correlated with the regulatory elements derived from epigenomic signatures. Meanwhile, to demonstrate the ability of our metric in prioritizing non-coding variants independent of established regulatory elements, we re-examined the enrichment analyses with previously annotated regulatory sequences excluded -all results persisted (Extended Fig. 3a,b). Likewise, we demonstrated that our constraint metric captures additional information to phylogeny-based conservation scores, which are likely to be blind to functional non-coding regions that have high evolutionary turnover (e.g., enhancers 71-76 ; Extended Fig. 4). We also note that we restricted all our validation analyses to non-coding regions to explicitly evaluate the metric for characterizing the noncoding genome, and we further eliminated potential bias from nearby genes by recapitulating the results within regions >10kb away from any protein-coding genes (Extended Fig. 5). Further validating our metric on experimental data from multiplex assays of variant effect (MAVE) 77 , we showed that our constraint Z score correlates well with the experimental measurements, along with comparison to other genome-wide predictive scores (Extended Fig. 6; Methods). Altogether, our constraint metric presented reliable and consistent performance in identifying important non-coding regions in the human genome.
Our analyses revealed considerable correlation between constraint of the non-coding regulatory elements and the functional importance of their target genes. The implication is twofold. First, the constraint metric can be applied to further prioritize existing regulatory annotations. For instance, categorizing ENCODE cCREs by constraint Z score identified an increasingly stronger GWAS signals in the more constrained cCREs (Extended Fig. 3c). This also suggests the integration of multiple lines of evidence indicative of function. Second, the prioritization of regulatory elements can be applied to improve identification of important genes. The example of ASCL2 demonstrates how leveraging the non-coding functional variation of a gene can build out a clearer picture of its importance from a regulatory point of view. The analyses of tissue-specific enhancer constraint further demonstrate how non-coding constraint could be applied to refine gene constraint modeling.
Despite the clear constraint signal identified for non-coding regions, many limitations exist. First, the mutational model explains ~75% of the variation (R 2 =0.746) in de novo mutation rate, indicating underrecognized sources or properties of spontaneous human mutation. This situation can be partially alleviated by an increased sample size of de novo mutation data, which would allow inclusion of additional genomic features at smaller scales as well as more accurate estimation of feature contributions. Even with a sufficiently large dataset, however, comprehensive modeling of the heterogeneity in the mutation rate requires advanced knowledge about the underlying mutational mechanisms and ideally a more interpretable, biologically-informed statistical framework. Further, the practical interpretation of noncoding constraint, especially in the context of gene regulation, can only be informative when considered in a particular context, such as a tissue/cell type, developmental stage, or environment. Such information is not directly built in current constraint scores nor in the mutational dataset, thus downstream analysis of functional genomics data (e.g., incorporating tissue-specific enhancer signatures as in this study) is often necessary for justifying specific biological implications. It should also be noted that, since the detection of depletion of variation is immune to negative selection after reproductive age, genomic regions involved in late-onset phenotypes are therefore likely to go underrecognized.
Finally, while this is the largest dataset of human genomes examined to date for non-coding constraint, our method will substantially increase in power and resolution as sample sizes increase.Using constraint seen in coding regions as benchmarks (Extended Fig. 7), we are currently well-powered to detect noncoding regions with constraint as strong as the 90 th percentile of coding exons of similar size, and we estimate a sample size of ~700K to detect constraint that is about the average of coding regions. Moreover, as functional non-coding elements are generally small, to further increase our resolution, for example at a 100bp scale, we would need ~4.8M individuals to detect individual constrained non-coding elements.
Overall our study demonstrates the value of the genome-wide constraint map in characterizing both noncoding regions and protein-coding genes, providing a significant step towards a comprehensive catalog of functional genomic elements for humans.

Genome data aggregation, variant-calling, and quality control
We aggregated whole genome sequence data from 153,030 individuals spanning projects from casecontrol consortia and population cohorts, in a similar fashion to previous efforts 1 . We harmonized these data using the GATK Best Practices pipeline and joint-called all samples using Hail 78 , and developed and utilized an updated pipeline of sample, variant, and genotype QC to create a high-quality callset of 76,156 individuals, computing frequency information for several strata of this dataset based on attributes such as ancestry and sex for each of 644,267,978 short nuclear variants (see Supplementary Information).

Mutation rate model
To estimate the baseline mutation rate for each substitution and context, we tallied each trinucleotide context in the human genome. As previously shown 19 , the methylated CpG variants begin to saturate at a sample size of ~10K genomes, and therefore we downsampled the gnomAD dataset to 1,000 genomes for use in calculating the mutation rate. Sites were further excluded if there were variants observed in gnomAD but flagged as low quality, or found in greater than 5 copies in the downsampled set. Using this dataset, we computed the proportion of possible variants observed for each trinucleotide change (XY1Z -> XY2Z), with CpG sites stratified by their methylation levels (see DNA methylation analysis). These proportions represent the relative probability of a given nucleotide mutating into one of the three other possible bases in a trinucleotide context, and we scaled this factor so that the weighted genome-wide average is the human per-base, per-generation mutation rate (1.2×10 -8 ) to obtain the absolute mutation rates ì. The ì estimates were well correlated with the proportion of possible variants observed in the 76,156 gnomAD genomes (R 2 =0.999, Extended Fig. 1a,b; Supplementary Data 1), a dataset of 6,079,733,538 possible variants at 2,026,577,846 autosomal sites with 30-32X coverage, where 390,393,900 high-quality rare (AF≤1%) variants were observed. These fitted proportion observed (mutabilities) were then used for establishing the context-dependent expected variation in construction of constraint Z scores.

DNA methylation analysis
To correct for the effect of DNA methylation on the mutation rate at CpG sites, we stratified all CpG sites by their methylation levels and computed the proportion observed within each context and methylation level. As an improvement to our previous methylation annotation (from averaging different tissues 1 ), we analyzed methylation data from germ cells across 14 developmental stages, comprising eight from preimplantation embryos (sperm, oocyte, pronucleus, two-cell-, four-cell-, eight-cell-, morula-, and blastocyst-stage embryos) 79 and six from primordial germ cells (7Wk, 10Wk, 11Wk, 13Wk, 17Wk, and 19Wk) 80 . For each stage, we computed methylation level at each CpG site as the proportion of wholegenome bisulfite sequencing reads corresponding to the methylated allele. In order to derive a composite score from the 14 stages, we regressed the observation of a CpG variant in gnomAD (0 or 1) on the methylation computed at the corresponding site (a vector of 14), and we used the coefficients from the regression model as weights to compute a composite methylation score for each CpG site. This metric was further discretized into 16 levels (by a minimum step of 0.

Construction of constraint Z score
We created a signed Z score to quantify the depletion of variation (constraint) at a 1kb scale by comparing the observed variation to an expectation: Here, the observed variant count ( ) is the number of unique rare (AF<0.1%) variants in a 1kb window identified in the gnomAD dataset. To establish the expected number of variants ( ), we first created a context-dependent mutability for each window by summing all trinucleotide mutabilities (proportion observed modeled by ì) within the 1kb sequence. At the same time, we computed for each window 17 genomic features that have been shown to influence mutation rate (e.g., replication timing and recombination rate; see Collection of genomic features). To determine the relative contribution of the 18 features (trinucleotide context plus 17 genomic features), we trained these features to predict the occurrence of de novo mutations (DNMs), as a proxy of spontaneous mutations, using a random forest regression model. A set of 413,304 unique DNMs were compiled from two large-scale family-based wholegenome sequencing studies{Halldorsson, 2019 #17;An, 2018 #18}. Given the data sparsity, we counted the incidence of DNMs per 1Mb and regressed it against the 18 features computed correspondingly (R 2 =0.746 by a 90/10 train/test split, Extended Fig. 1c). The fitted model was then applied to establish the expected variation ( ) in the gnomAD dataset.
Constraint Z scores were created for 2,442,347 non-overlapping 1kb windows across the human genome (2,330,252 on autosomes and 112,095 on chromosome X). Due to the lack of DNM data on chromosome X, a model was trained using autosomal regions and was extrapolated to chromosome X for computing constraint scores. We performed downstream analyses separately for autosomes and chromosome X and presented the former as primary, with the latter provided in Extended Fig. 8. In the analyses, we filtered the dataset to windows where 1) the sites contained at least 1,000 possible variants, 2) at least 80% of the observed variants passed all variant call filters, and 3) the mean coverage in the gnomAD genomes was between 25-35X (or 20-25X for chromosome X). This resulted in 1,797,153 windows (73.6% of initial) for the primary analyses (Supplementary Data 2), with 131,554 encompassing coding sequences and 1,665,599 exclusively non-coding.

Correlation between constraint Z and APS
As an internal validation, we compared our constraint Z score against the SV constraint score APS 23 . For each SV from the original study (gnomAD-SV), we assessed its constraint by assigning the highest Z score among all overlapping 1kb windows. The correlation between constraint Z and APS was evaluated across 205,699 high-quality autosomal SVs scored by both metrics, using a linear regression test. In Fig. 1b, the correlation was presented by the mean value of APS across ascending constraint Z bins, with 95% confidence intervals computed from 100-fold bootstrapping. To assess the correlation between constraint Z scores and the collected candidate functional elements, we intersected each annotation with the scored 1kb windows binned by constraint Z (<-4 We note that we performed all above analyses restricting to exclusively non-coding windows to evaluate the use of our constraint metric in characterizing the non-coding genome.

Coordination between non-coding constraint and gene constraint
To examine whether constraint of non-coding regulatory elements implicates the constraint of their target genes, we compared constraint Z scores of enhancers linked to constrained genes and unconstrained genes. The former included well-established gene sets of 189 ClinGen 87 haploinsufficient genes, 2,454 MGI 88 essential genes mapped to human orthologs, 1, 771 OMIM 89 autosomal dominant genes, and 1,920 LOEUF 1 first-decile genes; and the later included a curated list of 356 olfactory receptor genes and 189 LOEUF last-decile genes with at least 10 expected LoF variants (which are sufficiently powered to be classified into the most constrained decile). The LOEUF underpowered list included 1,117 genes with ≤5 expected LoF variants. Enhancers associated with each gene were obtained from the Roadmap Epigenomics Enhancer-Gene Linking database, which used correlated patterns of activity between histone modifications and gene expression to predict enhancer-gene links 90,91 . For each gene, we aggregated and merged enhancers predicted from all 127 reference epigenomes and assigned the most constrained enhancer to each gene for analysis of enhancer-gene constraint coordination (Supplementary Data 3).
In the analysis of correlation between tissue-specific enhancer constraint and tissue-specific gene expression, we processed the enhancer-gene links with the same principle as described above but within specific tissue types (as defined in the Roadmap Epigenomics metadata 64 ). For each gene and tissue type, we searched for tissue-specific gene expression in the Genotype-Tissue Expression (GTEx 65 ) database (RNASeQCv1.1.9) and computed a normalized median expression for each gene (log2(TPM+1)). Enhancer constraint and gene expression levels were calculated for 11 matched tissue types and their correlation within each tissue type was evaluated by regressing gene expression on the enhancer constraint of corresponding gene. Importantly, we included gene constraint (LOEUF score) as a covariate to test the additional contribution from enhancer constraint. Similarly, we used a logistic regression model conditioning on LOEUF to examine the value of brain-tissue enhancer constraint in identifying genes functional in brain. As a proof-of-principle, we regressed genes encoding the mRNA targets of Fragile X Mental Retardation Protein (FMRP 70 ; 0 or 1) against the constraint of enhancers derived from brain tissue, including LOEUF score as a covariate.

Correlation of constraint Z and other predictive scores with MAVE
We compared our metric with other four genome-wide predictive scores -Orion 13 , CDTS 14 , gwRVIS 17 , and JARVIS 17 ) -in their correlation with experimental measurements on 11 enhancers tested by MAVE 77 . Each predictive score was downloaded from the original study, lifted over to GRCh38 (for Orion), and applied to score enhancers by taking the average over corresponding genomic regions. Enhancers were ranked by the proportion of mutations identified by MAVE as causing significant changes in gene expression (high to low: SORT1, IRF4, IRF6, ZRS, ZFAND3, RET, TCF7L2, MYC rs11986220 [MYCs2], BCL11A, UC88, MYC rs11986220 [MYCs1]), and the rank correlation between MAVE and each predictive score was evaluated using a Spearman's rank correlation test. The correlation was computed on all scorable enhancers for each score, as well as enhancers that are scorable by all metrics ( =7; RET and IRF4 were not scored by constraint Z due to feature missingness near the telomere/centromere and MYCs1 and SORT1 were excluded due to quality filtering yet their scores are provided in the "Unfiltered Z" penal. In addition, we excluded UC88 in favor of other scores as it appeared to be an "outlier" deflecting their ranks (Extended Fig. 6). with ≥ 1bp coding sequence; red) overall exhibit a higher constraint Z (stronger negative selection) than windows that are exclusively non-coding ( =1,665,599; blue). b, The correlation between constraint Z score and the adjusted proportion of singletons (APS) score developed for structural variation (SV) constraint. A collection of 205,699 autosomal SVs were assessed using constraint Z score by assigning each SV the highest Z among all overlapping 1kb windows, which shows a strong positive correlation with the SV constraint metric APS. Error bars indicate 100-fold bootstrapped 95% confidence intervals of the mean values.   The non-coding constraint of a given CNV was assessed by the highest constraint Z score among all non-coding windows being affected. CNVs were categorized and the frequency of constrained CNVs (Z≥4) was compared based on pathogenic potential: constrained CNVs are more common in DD cases than controls (6,631/15,717=42.2% versus 8,107/74,706=10.9%) and are most frequent for CNVs previously implicated as pathogenic (16/18=88.9% by DD and 3,219/3,886=82.8% by ClinVar). b, The contribution of non-coding constraint in predicting CNVs in DD cases versus controls. Non-coding constraint remains a strong predictor for the case/control status of CNVs after adjusting for gene constraint (LOEUF score), gene number, and size of CNVs, using a logistic regression test. Error bars indicate 95% confidence intervals of the log odds ratios. c, CNVs at the IHH locus associated with synpolydactyly and craniosynostosis. The four implicated duplications (grey bars) overlap in a ~10kb region that exhibit high non-coding constraint (blue), with the highest Z score coinciding with the major IHH enhancers (dark blue). Each blue bar shows the constraint Z score of a 1kb window within the locus; gaps indicate windows removed by quality filters.

Fig. 5:
Relationship between non-coding constraint and gene constraint. a, The proportion of non-coding 1kb windows overlapping with enhancers that were predicted to regulate specific genes, as a function of their constraint Z scores. More constrained non-coding regions are more frequently linked to a gene. Error bars indicate standard errors of the proportions. b, Comparison of the constraint Z scores of enhancers linked to constrained and unconstrained genes. Enhancers of established sets of constrained genes (four blue boxes) are more constrained than enhancers of presumably less constrained genes (two grey boxes).
Enhancers of genes that are underpowered for gene constraint detection ("LOEUF underpowered") present a higher constraint than those powered yet unconstrained genes ("LOUEF unconstrained"). c, Gene sets enriched for the LOUEF underpowered genes. Enrichment analysis was performed for a set of 506 genes underpowered by LOEUF while being linked with a constrained enhancer. Red dashed line indicates FDR=0.05. d, Comparison of evolutionary constraint for the LOUEF underpowered genes with and without a constrained enhancer. PhastCons, a phylogeny-based conservation metric, supports the functional importance of genes with constrained enhancers. e, Correlations between enhancer constraint and gene expression in specific tissue types. Enhancer constraint was reprocessed in a tissue-specific manner and was modeled to predict the expression level of target genes in matched tissue types; a linear regression test suggests significant contribution from enhancer constraint even conditioning on gene constraint (LOEUF score). Error bars indicate 95% confidence intervals of the beta coefficient estimates. Fig. 1: Construction of mutational model and constraint Z score. a,b, Estimating the relative mutability for each trinucleotide context across the genome. The proportion of possible variants observed in 76,156 gnomAD genomes (mutability; y-axis) is exponentially correlated with the absolute mutation rate estimated from 1,000 downsampled genomes (mu; x-axis). Fit lines were modeled separately for human autosomes (a) and chromosome X (b). c, Estimating the relative contribution of local sequence context and regional genomic features in predicting the expected variation. Mutational model incorporating trinucleotide context and 17 genomic features was trained on DNMs using a random forest regression model (with a 90/10 train/test split). Trinucleotide mutability appears the most important feature in predicting the expected variation. d,e, The distribution of constraint Z score as a function of expected and observed variation. Each point represents Z score for a 1kb window on the genome ( = 1,797,153 on autosomes (d) and = 49,936 on chromosome X (e)), which quantifies the deviation of observed variation from expectation; positive Z (red) indicates deletion of variation (obs<exp) and the higher the Z the stronger the depletion (i.e., more constrained). Fig. 2: Comparison of constraint Z score between coding and non-coding regions. a, Exonic regions (1kb windows created by directly concatenating coding exons, =26,987; purple) exhibit a significantly higher constraint Z than windows that are exclusively non-coding ( =1,665,599; blue). b, The proportion of highly constrained (Z≥4) windows is positively correlated with the percentage of coding sequences in a window and is substantially higher for the exonic windows. c, The proportion of highly constrained (Z≥4) windows increases linearly as more exonic windows are included in the analysis through random sampling . d, Constraint Z score percentiles of non-coding versus exonic windows. About 0.5% (100-99.5) and 13.5% (100-86.5) of the non-coding windows exhibit similar constraint to top 10% (90 th percentile) and top 50% of exonic regions. Fig. 3: Distributions of GWAS variants in non-coding regions with regard to non-coding constraint and ENCODE cCRE annotations. a,b, The enrichment of GWAS catalog (a) and UKB GWAS (b) variants in constrained non-coding regions persists after excluding candidate cis-regulatory elements (cCREs) annotated by ENCODE. c, GWAS variants occur more frequently in cCREs under higher constraint, suggesting the value of non-coding constraint in prioritizing existing functional annotations. Fig. 4: Enrichment of enhancers (a) and GWAS variants (b) across the spectrum of constraint Z and conservation (PhastCons) score. Non-coding 1kb windows were binned by their constraint Z and PhastCons conservation scores, from the least constrained/conserved (1 st decile) to the most constrained/conserved (10 th decile), and enrichment within each bin was evaluated by comparing the proportion of 1kb windows that overlap with an enhancer annotation (a) or a GWAS hit (b) to the genome-wide average. Red color indicates enriched and blue color indicates depleted; odds ratio is shown for nominally significant (p 0.05) enrichment/depletion. As expected, enrichment increases as both scores increase (upper right), yet each score captures independent signals -for instance, regions with high constraint Z scores present significant enrichment across the deciles of conservation score (rightmost column). Fig. 5: The enrichment of candidate regulatory elements (a) and GWAS variants (b, GWAS Catalog; c, UKB GWAS) in constrained non-coding regions persists when restricting to regions farther away from protein-coding genes (±10kb). Fig. 6: Correlation of constraint Z and other predictive scores with experimental data on enhancers tested by multiplex assays of variant effect (MAVE). a, Predictive scores on MAVE-tested enhancers, the importance of which was ranked by the proportion of mutations MAVE identified as causing significant changes in gene expression. Predictive scores of enhancers were aligned below in the same order and were modified when necessary (multiplied by -1 for CDTS and gwRVIS) such that a higher value represents higher importance for all scores. An increasing trend (dotted line) indicates positive correlation between the predictive score and the experimental measurement from MAVE (Spearman's rank correlations are shown in 1b). Gaps indicate unavailable values; RET and IRF4 were not scored by constraint Z due to feature missingness near the telomere/centromere and MYCs1 and SORT1 were excluded due to quality filtering (yet they are provided in the "Unfiltered Z" panel). b, Spearman's rank correlation between MAVE and each predictive score. Constraint Z score exhibits the highest correlation with the experimental measurements, either on all scorable enhancers (All) or the subset of seven enhancers scored by constraint Z (Z scored); the high performance persists even when excluding UC88 (Exl. UC88) in favor of other scores as it appeared to be an outlier deflecting their ranks. *Of note, JARVIS and LINSIGHT employed enhancers (as well as other functional annotations) in constructing their scores, which may introduce circularity in favor of their performance in scoring enhancers.          Conservation score decile