TY  - JOUR
T1  - Big data approaches to decomposing heterogeneity across the autism spectrum
JF  - bioRxiv
DO  - 10.1101/278788
SP  - 278788
AU  - Michael V. Lombardo
AU  - Meng-Chuan Lai
AU  - Simon Baron-Cohen
Y1  - 2018/01/01
UR  - http://biorxiv.org/content/early/2018/03/17/278788.abstract
N2  - Autism is a diagnostic label based on behavior. While the diagnostic criteria attempt to maximize clinical consensus, they also mask a wide degree of heterogeneity between and within individuals at multiple levels of analysis. Understanding this multi-level heterogeneity is of high clinical and translational importance. Here we present organizing principles to frame the work examining multi-level heterogeneity in autism. Theoretical concepts such as ‘spectrum’ or ‘autisms’ reflect non-mutually exclusive explanations regarding continuous/dimensional or categorical/qualitative variation between and within individuals. However, common practices of small sample size studies and case-control models are suboptimal for tackling heterogeneity. Big data is an important ingredient for furthering our understanding of heterogeneity in autism. In addition to being ‘feature-rich’, big data should be both ‘broad’ (i.e. large sample size) and ‘deep’ (i.e. multiple levels of data collected on the same individuals). These characteristics help ensure the results from a population are more generalizable and facilitate evaluation of the utility of different models of heterogeneity. A model’s utility can be shown by its ability to explain clinically or mechanistically important phenomena, but also by explaining how variability manifests across different levels of analysis. The directionality for explaining variability across levels can be bottom-up or top-down, and should include the importance of development for characterizing change within individuals.
While progress can be made with ‘supervised’ models built upon a priori or theoretically predicted distinctions or dimensions of importance, it will become increasingly important to complement such work with unsupervised data-driven discoveries that leverage unknown and multivariate distinctions within big data. Without a better understanding of how to model heterogeneity between autistic people, progress towards the goal of precision medicine may be limited.
ER  - 