ABSTRACT
Molecular phenotypes of cancer are complex and influenced by a multitude of factors. Conventional unsupervised clustering of heterogeneous cancer patient populations is inevitably driven by the dominant variation from major factors such as cell-of-origin or histology. Drawing from ideas in supervised text classification, we developed survClust, an outcome-weighted clustering algorithm for integrative patient stratification. We show that survClust outperforms unsupervised clustering in identifying cancer patient subpopulations that are characterized by specific genomic phenotypes and exhibit more aggressive clinical behavior. The algorithm and tools we developed are directly applicable to clinically relevant patient stratification based on tumor genomics and can inform clinical decision-making.
Competing Interest Statement
The authors have declared no competing interest.