Summary
Background Repeat expansions cause over 20 neurogenetic disorders that can present with overlapping clinical phenotypes, making molecular diagnosis challenging. Single-gene or small-panel PCR-based methods are used to identify the precise genetic cause, but they can be slow and costly, and often yield no result. Genomic analysis via whole exome sequencing (WES) and whole genome sequencing (WGS) is increasingly performed to diagnose genetic disorders. However, until recently, analysis protocols could not identify repeat expansions in these datasets.
Methods A new method for identifying repeat expansions in either WES or WGS data was developed. Four retrospective cohorts of individuals with eight different known repeat expansion disorders were analysed with the new method. Results were assessed against the known disease status. Performance was also compared with that of a recently published genotyping-based method, ExpansionHunter.
Findings Repeat expansions were successfully identified in WES and WGS datasets. The new method demonstrated high predictive performance, achieving a median area under the curve (AUC) of 0.9. It achieved a median specificity of 0.99 and a median sensitivity of 0.75, compared with a median specificity of 0.99 and a median sensitivity of 0.56 for ExpansionHunter.
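As an illustration of how the per-locus metrics reported above are defined, the following minimal sketch computes sensitivity, specificity, and AUC from known disease status and per-sample calls. It is not taken from exSTRa or ExpansionHunter; the function names and the toy data are hypothetical and serve only to make the definitions concrete.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[bool],
                            called: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary expansion calls.

    truth[i]  is True if sample i is known to carry the expansion.
    called[i] is True if the method flagged sample i as expanded.
    """
    pairs = list(zip(truth, called))
    tp = sum(1 for t, c in pairs if t and c)          # true positives
    fn = sum(1 for t, c in pairs if t and not c)      # false negatives
    tn = sum(1 for t, c in pairs if not t and not c)  # true negatives
    fp = sum(1 for t, c in pairs if not t and c)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auc(truth: Sequence[bool], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney formulation.

    Equals the probability that a randomly chosen affected sample
    receives a higher score than a randomly chosen unaffected sample,
    with ties counted as one half.
    """
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four affected and four unaffected samples.
truth = [True, True, True, True, False, False, False, False]
called = [True, True, True, False, False, False, False, True]
scores = [0.9, 0.8, 0.7, 0.3, 0.1, 0.2, 0.4, 0.6]

sens, spec = sensitivity_specificity(truth, called)
area = auc(truth, scores)
```

In a study of this kind, such metrics would be computed separately for each locus and cohort and then summarised, for example by the median.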
Interpretation The new method, called exSTRa (expanded STR algorithm), is available from https://github.com/bahlolab/exSTRa. It can be applied to existing WES or WGS data to identify likely repeat expansions. We demonstrate that exSTRa can be used effectively as a screening tool to interrogate WES and WGS data generated with PCR-based library preparations; candidate expansions can then be followed up with specific diagnostic tests.
Funding The Australian National Health and Medical Research Council, the Australian Research Council, The Murdoch Children’s Research Institute, The Walter and Eliza Hall Institute of Medical Research, and the Victorian Government.