TY  - JOUR
T1  - Anomaly detection in multimodal MRI identifies rare individual phenotypes among 20,000 brains
JF  - bioRxiv
DO  - 10.1101/2021.05.10.441017
SP  - 2021.05.10.441017
AU  - Zhiwei Ma
AU  - Daniel S. Reich
AU  - Sarah Dembling
AU  - Jeff H. Duyn
AU  - Alan P. Koretsky
Y1  - 2021/01/01
UR  - http://biorxiv.org/content/early/2021/05/10/2021.05.10.441017.abstract
N2  - The UK Biobank (UKB) is a large-scale epidemiological study, and its imaging component focuses on pre-symptomatic participants. Given its large sample size, rare imaging phenotypes within this unique cohort are of interest, as they are often clinically relevant and could be informative for discovering new processes and mechanisms. Identifying these rare phenotypes is often referred to as “anomaly detection” or “outlier detection”. However, anomaly detection in neuroimaging has usually been applied in a supervised or semi-supervised manner to clinically defined cohorts of relatively small size. There has been much less work using anomaly detection on large unlabeled cohorts like the UKB. Here we developed a two-level anomaly screening methodology to systematically identify anomalies from ∼19,000 UKB subjects. The same method was also applied to ∼1,000 young healthy subjects from the Human Connectome Project (HCP). In primary screening, using ventricular, white matter, and gray matter-based imaging phenotypes derived from multimodal MRI, every subject was parameterized with an anomaly score per phenotype to quantify the degree of abnormality. These anomaly scores were highly robust. Anomaly score distributions of the UKB cohort were all more outlier-prone than those of the HCP cohort of young adults. The approach enabled assessment of test-retest reliability via the anomaly scores, which ranged from excellent reliability for ventricular volume, white matter lesion volume, and fractional anisotropy, to good reliability for mean diffusivity and cortical thickness. In secondary screening, the anomalies due to data collection/processing errors were eliminated. A subgroup of the remaining anomalies was radiologically reviewed, and a substantial percentage of them (UKB: 90.1%; HCP: 42.9%) had various brain pathologies such as masses, cysts, white matter lesions, infarcts, encephalomalacia, or prominent sulci. The remaining anomalies of the subgroup had unexplained causes and would be interesting for follow-up. Finally, we show that anomaly detection applied to resting-state functional connectivity did not identify any reliable anomalies, which was attributed to the confounding effects of brain-wide signal variation. Together, this study establishes an unsupervised framework for investigating rare individual imaging phenotypes within large heterogeneous cohorts. Competing Interest Statement: The authors have declared no competing interest. Abbreviations: VV, ventricular volume; WMLV, white matter lesion volume; FA, fractional anisotropy; MD, mean diffusivity; CTh, cortical thickness; RSFC, resting-state functional connectivity; UKB, UK Biobank; HCP, Human Connectome Project; SD, standard deviation.
ER  - 