Abstract
Autism spectrum disorder (ASD) is a neurodevelopmental disorder with substantial phenotypic and etiological heterogeneity. An estimated 10-20% of cases are due to copy number variations (CNVs). Here we apply a newly developed CNV association tool (SNATCNV) to reanalyse CNV data from 19,663 autistic and 6,479 control subjects from the AutDB database. We demonstrate that SNATCNV outperforms existing CNV association methods, identifying smaller genomic regions that better discriminate cases from controls. By integrating data from the FANTOM5 expression atlas, we show that both known ASD causal genes identified by the SFARI and MSSNG consortia and genes within the CNVs identified by SNATCNV have brain-enriched expression patterns; both brain-enriched coding and long non-coding RNA (lncRNA) genes are over-represented. We provide full lists of these brain-enriched coding and lncRNA genes as a resource to the research community. We further show that each CNV region is associated with a distinct set of phenotypes and that some regions are sex-biased, and we highlight one deleted region where a brain-enriched lncRNA is the only gene present. Our analyses identify 47 high-confidence ASD-associated CNV regions and brain-enriched genes that underlie this neurodevelopmental disorder.