Conditional prediction of consecutive tumor evolution using cancer progression models: What genotype comes next?

Juan Diaz-Colunga; Ramon Diaz-Uriarte

doi:10.1101/2020.12.16.423099

Abstract

Accurate prediction of tumor progression is key for adaptive therapy and precision medicine. Cancer progression models (CPMs) can be used to infer dependencies in mutation accumulation from cross-sectional data and provide predictions of tumor progression paths. However, their performance when predicting complete evolutionary paths is limited by violations of their assumptions and by the size of available data sets. Instead of predicting full tumor progression paths, we can focus on short-term predictions, which are more relevant for diagnostic and therapeutic purposes. Here we examine whether five distinct CPMs can be used to answer the question “Given that a genotype with n mutations has been observed, what genotype with n + 1 mutations is next in the path of tumor progression?” or, in short, “What genotype comes next?”. Using simulated data, we find that, under specific combinations of genotype and fitness landscape characteristics, CPMs can provide predictions of short-term evolution that closely match the true probabilities, and that some genotype characteristics (fitness and probability of being a local fitness maximum) can be much more relevant than global features. Thus, CPMs can provide short-term predictions even when global, long-term predictions are not possible because fitness landscape- and evolutionary model-specific assumptions are violated. When good performance is possible, we observe significant variation in the quality of predictions across methods. Genotype-specific and global fitness landscape characteristics are required to determine which method provides the best results in each case. Application of these methods to 25 cancer data sets shows that their use is hampered by a lack of the information needed to make principled decisions about method choice and about which predictions to trust. Fruitful use of these methods for short-term predictions requires adapting the methods’ use to local genotype characteristics and obtaining reliable indicators of performance; it will also be necessary to clarify the interpretation of the methods’ results when key assumptions do not hold.
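
As an illustration of the question “What genotype comes next?”, the following minimal Python sketch turns transition weights into conditional probabilities over the genotypes with n + 1 mutations. It is not taken from the paper or its repository: the function names, the gene labels A, B, C, and the per-gene rates are hypothetical, and normalizing per-gene weights is only a toy stand-in for whatever a fitted CPM would supply.

```python
def next_genotypes(genotype, genes):
    """Return every genotype that adds exactly one mutation to `genotype`."""
    return [genotype | {g} for g in genes if g not in genotype]


def conditional_next_probs(genotype, genes, weight):
    """Normalize transition weights into P(next genotype | current genotype).

    `weight(src, dst)` is a hypothetical callable giving a non-negative
    weight for the transition src -> dst; a fitted model would provide it.
    """
    candidates = next_genotypes(genotype, genes)
    weights = {c: weight(genotype, c) for c in candidates}
    total = sum(weights.values())
    if total == 0:
        # No accessible successor under these weights.
        return {}
    return {c: w / total for c, w in weights.items()}


# Toy setup (assumed values, for illustration only).
genes = ["A", "B", "C"]
rates = {"A": 1.0, "B": 3.0, "C": 1.0}


def toy_weight(src, dst):
    """Weight of a transition = rate of the single gene gained."""
    (gained,) = dst - src
    return rates[gained]


observed = frozenset({"A"})  # a genotype with n = 1 mutation
probs = conditional_next_probs(observed, genes, toy_weight)

for genotype, p in probs.items():
    print(sorted(genotype), p)
```

With these assumed rates, the observed genotype {A} has two candidate successors, {A, B} and {A, C}, and the weights 3.0 and 1.0 are normalized so that their probabilities sum to one.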

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

Figure 3 revised and discussion updated for clarity.
https://github.com/rdiaz02/what_genotype_next

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.