Abstract
Understanding the migration history of cancer cells is essential for advancing metastasis research and developing therapies. Existing migration history inference methods often rely on parsimony criteria such as minimizing migrations, comigrations, and seeding locations. Importantly, existing methods either yield a single optimal solution or are probabilistic algorithms that guarantee neither the optimality nor the comprehensiveness of the returned solution space. As such, current methods are unable to capture the full extent of uncertainty inherent to the data. To address these limitations, we introduce MACH2, a method that systematically enumerates all plausible migration histories by exactly solving the Parsimonious Migration History with Tree Refinement (PMH-TR) problem. In addition to the migration and comigration criteria, MACH2 employs a novel parsimony criterion that minimizes the number of clones unobserved in their inferred location of origin. MACH2 lets users specify both the set of criteria to include during optimization and their order, so that the model can be adapted to specific analysis needs. MACH2 also provides a summary graph for identifying high-confidence migrations. Finally, we introduce MACH2-viz, an interactive webtool for visualizing and exploring MACH2 solution spaces. Using simulated tumors with known ground truth, we show that MACH2, especially the version that prioritizes the new unobserved clone criterion, outperforms existing methods. On real data, MACH2 detects extensive uncertainty in non-small cell lung, ovarian, and prostate cancers, and infers migration histories consistent with orthogonal experimental data.
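The idea of ordering parsimony criteria and then summarizing all co-optimal histories can be illustrated with a minimal Python sketch. It is not MACH2's implementation or API: the abstract states that MACH2 solves PMH-TR exactly, whereas this toy simply filters a hand-written list of candidates. The `History` structure, the function names, the single-letter criterion codes (`U` for unobserved clones, `M` for migrations, `C` for comigrations), and the reading of the summary graph as per-edge frequencies across solutions are all assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class History(NamedTuple):
    """Toy candidate migration history (hypothetical structure)."""
    migrations: tuple[tuple[str, str], ...]  # (source, target) location pairs
    comigrations: int
    unobserved_clones: int


def score(h: History, order: str) -> tuple[int, ...]:
    """Criterion values arranged in the requested priority order."""
    values = {"U": h.unobserved_clones, "M": len(h.migrations), "C": h.comigrations}
    return tuple(values[c] for c in order)


def co_optimal(candidates: list[History], order: str) -> list[History]:
    """All candidates that attain the lexicographically smallest score."""
    best = min(score(h, order) for h in candidates)
    return [h for h in candidates if score(h, order) == best]


def summary_graph(solutions: list[History]) -> dict[tuple[str, str], float]:
    """Fraction of solutions that contain each migration edge (assumed definition)."""
    counts = Counter(edge for h in solutions for edge in set(h.migrations))
    return {edge: n / len(solutions) for edge, n in counts.items()}


# Four made-up candidates over three anatomical locations.
a = History((("lung", "liver"), ("lung", "brain")), 2, 0)
b = History((("lung", "liver"), ("liver", "brain")), 2, 0)
c = History((("lung", "liver"), ("lung", "brain"), ("brain", "liver")), 3, 0)
d = History((("lung", "liver"),), 1, 1)
candidates = [a, b, c, d]

umc = co_optimal(candidates, "UMC")  # unobserved clones take priority
mcu = co_optimal(candidates, "MCU")  # migrations take priority
graph = summary_graph(umc)
```

In this toy example, prioritizing unobserved clones keeps the two histories `a` and `b`, while prioritizing migrations keeps only `d`, which shows how the criterion order changes the solution space. In the resulting graph, the edge shared by both co-optimal histories has frequency 1.0 and the edges on which they disagree have frequency 0.5, which is the sense in which such a summary can separate high-confidence migrations from uncertain ones.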
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
In the revised article, we have made several changes and added new analyses. First, we streamlined the text to emphasize that MACH2-UMC is the default and preferred ordering of parsimony objectives; this included simplifying Fig. 2 and the simulation section. Second, we provided more intuition and justification for several key concepts of the PMH-TR problem, including the new unobserved clones criterion, polytomy resolution, and the summary graph. Finally, we included two new analyses showing the importance of enforcing temporal consistency as well as the use of the seeding location criterion as a post-processing step.