Abstract
Continual reductions in sequencing cost are making genome sequencing data increasingly accessible for routine clinical applications. However, the lack of methods for constructing machine learning-based predictive models from these datasets has become a crucial bottleneck for the application of sequencing technology in the clinic. Here we developed a new algorithm, eTumorMetastasis, which transforms tumor functional mutations into network-based profiles and identifies network operational gene signatures (NOG signatures) that model the tipping point at which a tumor cell shifts from a state that does not favor recurrence to one that does. We showed that NOG signatures derived from genomic mutations of tumor founding clones (i.e., the ‘most recent common ancestor’ of the cells within a tumor) significantly distinguished recurred and non-recurred breast tumors. These results imply that somatic mutations of tumor founders are associated with tumor recurrence and can be used to predict clinical outcomes. Finally, the concepts underlying eTumorMetastasis pave the way for the application of genome sequencing in predictions for other complex genetic diseases.