Abstract
Recent studies have identified many genes with rare de novo mutations in autism, but only a limited number of these have been conclusively established as disease-susceptibility genes, owing to a lack of recurrence and to confounding background mutations. Such extreme genetic heterogeneity severely limits recurrence-based statistical power, even in studies with large sample sizes. In addition, the cellular contexts in which these genomic lesions confer disease risk remain poorly understood. Here we investigate the use of cell type-specific expression profiles to differentiate mutations found in autism patients from those found in unaffected siblings. Using 24 distinct cell types isolated from the mouse central nervous system, we identified an expression signature shared by genes with likely gene-disrupting (LGD) mutations detected by exome sequencing in autism cases. The signature reflects haploinsufficiency of risk genes enriched in transcriptional and post-transcriptional regulators, and it shows the strongest positive associations with specific types of neurons in different brain regions, including cortical neurons, cerebellar granule cells, and striatal medium spiny neurons. Based on this signature, we assigned a D score to all human genes to prioritize candidate autism-susceptibility genes. When applied to genes with only a single LGD mutation in cases, the D score achieved a precision of 40%, compared with a 15% baseline, with minimal loss in sensitivity. Further improvement was made by combining the D score with mutation intolerance metrics from ExAC, which were derived from orthogonal data sources. The ensemble model achieved a precision of 60% and predicted 117 high-priority candidates. These prioritized lists can facilitate the identification of additional autism-susceptibility genes.