TY  - JOUR
T1  - Seeing beyond the target: Leveraging off-target reads in targeted clinical tumor sequencing to identify prognostic biomarkers
JF  - bioRxiv
DO  - 10.1101/2021.05.28.446240
SP  - 2021.05.28.446240
AU  - Serghei Mangul
AU  - Jaqueline J Brito
AU  - Stefan Groha
AU  - Noah Zaitlen
AU  - Alexander Gusev
Y1  - 2021/01/01
UR  - http://biorxiv.org/content/early/2021/05/29/2021.05.28.446240.abstract
N2  - Clinical tumor sequencing is rapidly becoming a standard component of clinical care, providing essential information for selecting amongst treatment options and providing prognostic value. Here we develop a robust and scalable software platform (SBT: Seeing Beyond the Target) that mines discarded components of clinical sequences to produce estimates of a rich set of omics features including rDNA and mtDNA copy number, microbial species abundance, and T and B cell receptor sequences. We validate the accuracy of SBT via comparison to multimodal data from the TCGA and apply SBT to a tumor panel cohort of 2,920 lung adenocarcinomas to identify associations of clinical value. We replicated known associations of somatic events in TP53 with changes in rDNA (p=0.012); as well as diversity of BCR and TCR repertoires with the biopsy site (p=2.5×10^-6, p<10^-20). We observed striking differences in EGFR mutant lung cancers versus wild-type, including higher rDNA copy number and lower immune repertoire diversity. Integrating clinical outcomes, we identified significant prognostic associations with overall survival, including SBT estimates of 5S rDNA (p=1.9×10^-4, hazard ratio=1.22) and TCR diversity (p=2.7×10^-3, hazard ratio=1.77). Both novel survival associations replicated in 1,302 breast carcinoma and 1,651 colorectal cancer tumors. We anticipate that feature estimates derived by SBT will yield novel biomarker hypotheses and open research opportunities in existing and emerging clinical tumor sequencing cohorts. Competing Interest Statement: The authors have declared no competing interest.
ER  - 