PT - JOURNAL ARTICLE
AU - Lord, Jenny
AU - Gallone, Giuseppe
AU - Short, Patrick J.
AU - McRae, Jeremy F.
AU - Ironfield, Holly
AU - Wynn, Elizabeth H.
AU - Gerety, Sebastian S.
AU - He, Liu
AU - Kerr, Bronwyn
AU - Johnson, Diana S.
AU - McCann, Emma
AU - Kinning, Esther
AU - Flinter, Frances
AU - Temple, I. Karen
AU - Clayton-Smith, Jill
AU - McEntagart, Meriel
AU - Lynch, Sally Ann
AU - Joss, Shelagh
AU - Douzgou, Sofia
AU - Dabir, Tabib
AU - Clowes, Virginia
AU - McConnell, Vivienne P. M.
AU - Lam, Wayne
AU - Wright, Caroline F.
AU - FitzPatrick, David R.
AU - Firth, Helen V.
AU - Barrett, Jeffrey C.
AU - Hurles, Matthew E.
AU - ,
TI - Pathogenicity and selective constraint on variation near splice sites
AID - 10.1101/256636
DP - 2018 Jan 01
TA - bioRxiv
PG - 256636
4099 - http://biorxiv.org/content/early/2018/08/30/256636.short
4100 - http://biorxiv.org/content/early/2018/08/30/256636.full
AB - Mutations which perturb normal pre-mRNA splicing are significant contributors to human disease. We used exome sequencing data from 7,833 probands with developmental disorders (DD) and their unaffected parents, as well as >60,000 aggregated exomes from the Exome Aggregation Consortium, to investigate selection around the splice site, and quantify the contribution of splicing mutations to DDs. Patterns of purifying selection, a deficit of variants in highly constrained genes in healthy subjects and excess de novo mutations in patients highlighted particular positions within and around the consensus splice site of greater functional relevance. Using mutational burden analyses in this large cohort of proband-parent trios, we could estimate in an unbiased manner the relative contributions of mutations at canonical dinucleotides (73%) and flanking non-canonical positions (27%), and calculated the positive predictive value of pathogenicity for different classes of mutations. We identified 18 patients with likely diagnostic de novo mutations in dominant DD-associated genes at non-canonical positions in splice sites. We estimate 35-40% of pathogenic variants in non-canonical splice site positions are missing from public databases.