A statistical test on single-cell data reveals widespread recurrent mutations in tumor evolution

Jack Kuipers; Katharina Jahn; Benjamin J. Raphael; Niko Beerenwinkel

doi:10.1101/094722

Abstract

The infinite sites assumption, which states that every genomic position mutates at most once over the lifetime of a tumor, is central to current approaches for reconstructing mutation histories of tumors, but has never been tested explicitly. We developed a rigorous statistical framework to test the assumption with single-cell sequencing data. The framework accounts for the high noise and contamination present in such data. We found strong evidence for recurrent mutations at the same site in 8 out of 9 single-cell sequencing datasets from human tumors. Six cases involved the loss of earlier mutations, five of which occurred at sites unaffected by large scale genomic deletions. Two cases exhibited parallel mutation, including the dataset with the strongest evidence of recurrence. Our results refute the general validity of the infinite sites assumption and indicate that more complex models are needed to adequately quantify intra-tumor heterogeneity.

The presence of mutational heterogeneity within tumors due to somatic cell evolution is known to be a major cause of treatment failure ^1,2. With the emergence of next-generation sequencing techniques it is possible to systematically analyze individual tumors at a genetic level from admixed cell samples, and more recently from sequencing the DNA of individual tumor cells ^3,4. These technical advances, together with the prospect of high-precision cancer therapies, have spurred the development of a variety of computational approaches to reconstruct not only the clonal structure but also the entire mutation history of individual tumors ^5–19. A common feature of all these approaches is the use of the infinite sites assumption (ISA) ²⁰ to exclude the possibility of the same genomic site being affected by multiple mutations throughout the lifetime of a tumor. However, the ISA has never been explicitly tested on tumor sequencing data in the context of tumor evolution. Only for copy number alterations has it recently been suggested to allow multiple changes of the same site while still excluding recurrences of the same state ^21,22.

The ISA is convenient to make, as it substantially restricts the search space of possible mutation histories ²³, but its validity is unproven and hard to test, as many factors such as mutation rate, cell division rate, copy number changes and the presence of mutational hotspots influence the probability of multiple mutations hitting the same site. On larger scales, multiple mutations have been observed to affect the same gene at different genomic sites in different spatial areas and phylogenetic branches of tumors ^24,25 indicating convergent evolution for these driver genes. Structurally different copy number alterations have also been observed to affect the same genes in ovarian cancer ²². This raises the specter of recurrence at the scale of individual bases and violations of the ISA (Figure 1).

Figure 1: Somatic mutations occurring during tumor evolution could violate the infinite sites assumption. (a) The mutation indicated by the red diamond occurs in parallel in two different lineages. (b) The mutation depicted by the orange circle is lost in the left branch due to a loss of heterozygosity. The mutation drawn as a yellow triangle is lost in the right branch by reverting to its original state, denoted a back mutation.

In fact, the idea that every genomic position mutates at most once over the lifetime of a tumor can be disproved by a generalization of the birthday problem (Online Methods). This is a classic math puzzle that asks for the probability that two people in a group share the same birthday. Perhaps surprisingly, this probability is already greater than 50% with only 23 people. Using the same reasoning and estimates of the cumulative number of stem cell divisions ²⁶ and mutation rates ²⁷, we found that the probability of violating the ISA in any tissue is almost 1 (Supplementary Note).
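The birthday argument can be checked numerically. The following is a minimal sketch, not the calculation from the Supplementary Note: the function name `collision_probability` and the number of mutations used for the genome-scale case are illustrative assumptions, and mutations are assumed to fall uniformly and independently on sites.

```python
import math


def collision_probability(n_events: int, n_slots: int) -> float:
    """Probability that at least two of n_events uniform random events share a slot."""
    if n_events > n_slots:
        return 1.0  # pigeonhole principle: a collision is unavoidable
    # Sum log(1 - k / n_slots) to obtain the log-probability that all events
    # land in distinct slots; working in log space avoids underflow.
    log_all_distinct = sum(math.log1p(-k / n_slots) for k in range(n_events))
    return -math.expm1(log_all_distinct)


# Classic birthday problem: 23 people, 365 days.
p_birthday = collision_probability(23, 365)

# Genome-scale analogue with ASSUMED numbers: about 3e9 sites and a
# hypothetical total of 1e6 point mutations across all cells of a tissue.
p_genome = collision_probability(1_000_000, 3_000_000_000)

print(f"birthday: {p_birthday:.4f}, genome-scale: {p_genome:.6f}")
```

Even though each individual site is hit with tiny probability, the number of pairs of mutations grows quadratically, which is why the collision probability approaches 1 in this illustrative setting.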

It is a different question, however, whether recurrent mutations are likely to be observed in practice, as only a small fraction of the evolutionary history is reconstructable from the limited number of tumor cells that are typically sequenced (Supplementary Figure 1). So although it is almost certain that the ISA is violated within the tumor tissue, there may still be a low chance that a violation occurs with a small set of mutations observed in a small sample of cells (Supplementary Note).

Therefore we developed a statistical framework (Figure 2) based on real tumor data to test the infinite sites model (ISM), 𝓜_I, that comprises all histories with a single event for every mutated site, against a model 𝓜_F that allows multiple mutations at the same site, referred to as the finite sites model (FSM) (Online Methods). The test is defined as a model selection problem where we compute the Bayes factor (BF) ^28,29 of the two alternative models based on single-cell sequencing data, 𝐷,

$$\mathrm{BF} = \frac{P(D \mid \mathcal{M}_F)}{P(D \mid \mathcal{M}_I)}.$$

Figure 2: Testing the infinite sites assumption starts from the single-cell mutation data. The data is examined under both the infinite sites model of all trees with no recurrent mutations as well as under the finite sites model of trees with one recurrence. The two competing models of tumour evolution are compared on how well they explain the single-cell data, with one model selected via the Bayes factor.

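When the FSM fits the data better than the ISM, the BF is greater than 1, and the larger the value, the stronger the evidence against the ISA. Neatly, the BF can be combined with estimates of the prior odds of each model to provide the posterior odds:

$$\frac{P(\mathcal{M}_F \mid D)}{P(\mathcal{M}_I \mid D)} = \mathrm{BF} \times \frac{P(\mathcal{M}_F)}{P(\mathcal{M}_I)}.$$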
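Because each model comprises many trees, the evidence for a model is an average of likelihoods over its trees. The sketch below shows one generic way to form a Bayes factor from per-tree log-likelihoods in log space. It is an illustration only and is not claimed to be the estimator described in the Online Methods: the uniform weighting over a small enumerated set of trees, the function names and all log-likelihood values are assumptions.

```python
import math


def log_mean_exp(log_values):
    """Numerically stable log of the mean of exp(log_values)."""
    m = max(log_values)
    total = sum(math.exp(v - m) for v in log_values)
    return m + math.log(total) - math.log(len(log_values))


def log10_bayes_factor(loglik_finite, loglik_infinite):
    """log10 BF of the finite sites model against the infinite sites model,
    assuming a uniform prior over the trees listed for each model."""
    log_bf = log_mean_exp(loglik_finite) - log_mean_exp(loglik_infinite)
    return log_bf / math.log(10)


# HYPOTHETICAL per-tree log-likelihoods (natural log), for illustration only.
loglik_infinite = [-520.0, -523.5, -530.2]
loglik_finite = [-512.0, -519.0, -521.0, -540.0]

log10_bf = log10_bayes_factor(loglik_finite, loglik_infinite)
print(f"log10 BF = {log10_bf:.2f}")
```

Averaging rather than taking the best tree means that a model with many poorly fitting trees is penalized, which is what distinguishes a Bayes factor from a plain likelihood ratio.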

The computation of the BF is based on our earlier work on reconstructing mutation histories from mutation profiles of single cells ¹⁸, which we generalize here to allow a single recurrent mutation (Online Methods). The recurrent mutation can be either a back mutation, if the second event occurs in the same cell lineage, or a parallel mutation that occurs in a different lineage (Figure 1). The reconstruction accounts for the noise in single-cell sequencing data, particularly the high levels of allelic dropout.
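The distinction between the two kinds of recurrence depends only on whether the two events lie on the same lineage. A minimal sketch of that check follows, assuming a mutation tree encoded as a child-to-parent dictionary; the encoding, the function names and the toy tree are hypothetical and not taken from SCITE.

```python
def is_ancestor(parent, ancestor, node):
    """True if `ancestor` lies on the path from `node` up to the root (inclusive)."""
    while node is not None:
        if node == ancestor:
            return True
        node = parent.get(node)  # the root maps to None, which ends the walk
    return False


def classify_recurrence(parent, first_event, second_event):
    """Classify two events at the same site by their relative tree position."""
    same_lineage = (
        is_ancestor(parent, first_event, second_event)
        or is_ancestor(parent, second_event, first_event)
    )
    return "back mutation" if same_lineage else "parallel mutation"


# Toy tree (assumed): root -> A, then A branches into B and C.
parent = {"root": None, "A": "root", "B": "A", "C": "A"}

print(classify_recurrence(parent, "A", "B"))  # events on one lineage
print(classify_recurrence(parent, "B", "C"))  # events on different lineages
```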

Single-cell sequencing data can additionally be contaminated by doublets, the inadvertent sequencing of more than one cell together, with some platforms having rates as high as 40% ³⁰. We observed that high doublet contamination rates affect the quality of the reconstructed mutation histories and thereby can confound the model selection process. Therefore we extended both models to account for doublets and to learn their incidence rates from the data (Online Methods).
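To see why doublets can confound the test, consider a noise-free toy example. The sketch below assumes that a doublet shows the union of the mutations of its two constituent cells, and uses the standard condition that a binary mutation matrix fits a tree without recurrence only if the carrier sets of any two mutations are disjoint or nested. The cell profiles and function name are hypothetical.

```python
from itertools import combinations


def violates_perfect_phylogeny(cells):
    """cells: list of sets of mutation names, one set per sequenced sample.

    Returns True if two mutations have carrier sets that overlap without
    one containing the other, which no tree without recurrence can explain.
    """
    mutations = set().union(*cells)
    carriers = {
        m: {i for i, cell in enumerate(cells) if m in cell} for m in mutations
    }
    for a, b in combinations(sorted(mutations), 2):
        ca, cb = carriers[a], carriers[b]
        if ca & cb and not (ca <= cb or cb <= ca):
            return True
    return False


# Three single cells from a tree in which m2 and m3 sit on different branches.
singlets = [{"m1"}, {"m1", "m2"}, {"m1", "m3"}]

# ASSUMPTION: a doublet of cells 1 and 2 displays the union of their mutations.
doublet = singlets[1] | singlets[2]

print(violates_perfect_phylogeny(singlets))
print(violates_perfect_phylogeny(singlets + [doublet]))
```

In this toy case a single doublet is enough to make data generated under the infinite sites model look incompatible with it, so a test that ignores doublets could mistake contamination for a recurrent mutation.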

Results

Evaluation of our framework on simulated data sets with realistic noise levels and contamination with doublets revealed that our test has a high specificity of 95% using a BF cutoff of 1 (Supplementary Note). The sensitivity increases with the number of sequenced cells. With 2–3 cells per mutation, we find a moderate sensitivity of 50–60% with the same BF cutoff. While this means that some recurrent mutations will be overlooked, any signaling of violations of the infinite sites assumption in real data can be trusted. We analyzed nine published single-cell tumor datasets, three from whole-exome sequencing (Table 1) and six from targeted sequencing (Table 2). The details of the inferred parameters and trees are discussed in the Supplementary Note, with the results presented here.
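For readers less familiar with the terminology, specificity and sensitivity at a given BF cutoff can be computed from simulation results as sketched below. The BF values are made up for illustration and are not the simulation results of the Supplementary Note; the function name is likewise an assumption.

```python
def sensitivity_specificity(bf_with_recurrence, bf_without_recurrence, cutoff=1.0):
    """Sensitivity and specificity of the rule 'reject the ISA if BF > cutoff'.

    bf_with_recurrence: BFs from simulations containing a recurrent mutation.
    bf_without_recurrence: BFs from simulations obeying the infinite sites model.
    """
    true_positives = sum(bf > cutoff for bf in bf_with_recurrence)
    true_negatives = sum(bf <= cutoff for bf in bf_without_recurrence)
    sensitivity = true_positives / len(bf_with_recurrence)
    specificity = true_negatives / len(bf_without_recurrence)
    return sensitivity, specificity


# HYPOTHETICAL Bayes factors from simulated datasets.
bf_with = [30.0, 0.4, 2000.0, 0.9]
bf_without = [0.2, 0.5, 0.8, 0.1, 3.0]

sens, spec = sensitivity_specificity(bf_with, bf_without)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A high specificity means that few datasets without recurrence exceed the cutoff, which is the property that makes a positive result on real data trustworthy even when the sensitivity is moderate.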

Table 1: Characteristics of the three exome sequencing datasets ^31–33 along with the inferred recurrent mutations and Bayes factors.

Table 2: Characteristics of the panel sequencing datasets of six leukemia patient samples ³⁴ along with their inferred recurrent mutations and Bayes factors. The genomic positions are according to the hg19 assembly.

Evidence for recurrent mutations in single-cell exome sequencing data

Looking at a JAK2-negative myeloproliferative neoplasm (essential thrombocythemia) for which the exomes of 58 tumor cells were sequenced, we focused on the 18 mutations classified as cancer-related ³¹ and found evidence for a recurrence of the same point mutation in the RETSAT gene (Supplementary Figure 11). Both mutations are late events that occurred at the ends of two neighboring branches. This recurrence is supported by a BF estimate of 30, which constitutes strong evidence for a violation of the ISA.

Next, we analyzed a clear cell renal cell carcinoma for which exome sequencing data of a total of 17 tumor cells is available ³². Performing the model comparison based on the 35 sites informative for mutation tree reconstruction, we obtain a BF below 1. There is therefore no evidence for a violation of the ISA, although any such violation would be hard to detect with the low number of sequenced cells.

In a dataset of 47 cells of an estrogen-receptor positive (ER⁺) breast cancer with 40 informative mutation sites ³³, we found that the tree topology under both models consists of a linear chain of mutations on top of a rather branched structure further down. Under the FSM, a back mutation of the early PANK3 mutation changes the upper tree structure substantially compared to the tree under the infinite sites model, where the mutation is forced into a side branch (Supplementary Figure 13). Computing the BF, we find a value of 2000, providing very strong evidence that the model with the back mutation fits the data much better than the infinite sites model.

For the small number of cells sequenced, and assuming a uniform distribution of mutations with no selection and that all mutations are observed, we obtain a conservative estimate of 2.5 × 10⁻⁵ for the probability that the same site among 40 changes twice via point mutations (Supplementary Table 5). We therefore tested loss of heterozygosity (LOH) as an alternative explanation for the back mutation: if the only allele carrying the mutation is lost at some point in the tree, sequencing descendant cells will only yield reads from the normal allele, thereby mimicking a back mutation (Figure 1). Based on copy number data from breast cancer samples from the TCGA Research Network (cancergenome.nih.gov/), LOH on the PANK3 gene occurred with a probability of approximately 2 × 10⁻³, which is much higher than the probability of a uniform reversion of a point mutation among 40. Copy number estimates are also provided ³³ for the sequenced cells, although it is difficult to determine whether LOH has occurred in the respective region, because PANK3 is located on chromosome 5, which was amplified early in the tumor evolution. Most of the sequenced cells seem to still exhibit an amplification of chromosome 5, but this is not equally certain for all cells. Some cells may then have lost a copy later, giving a possible explanation of our observation of the back mutation.
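The effect of these two prior estimates can be made concrete by combining each with the BF of 2000 reported for this dataset. The sketch below is illustrative arithmetic only: treating the two quoted probabilities directly as prior probabilities of the finite sites model is an assumption made here, and the function name is hypothetical.

```python
def posterior_odds(bayes_factor: float, prior_prob_finite: float) -> float:
    """Posterior odds of the finite sites model: BF times the prior odds."""
    prior_odds = prior_prob_finite / (1.0 - prior_prob_finite)
    return bayes_factor * prior_odds


bf_breast = 2000.0  # BF reported above for the breast cancer dataset

# ASSUMPTION: each quoted probability is used as P(finite sites model).
odds_point_mutation = posterior_odds(bf_breast, 2.5e-5)  # uniform point reversion
odds_loh = posterior_odds(bf_breast, 2e-3)               # LOH on PANK3 (TCGA)

print(f"point-mutation prior: posterior odds = {odds_point_mutation:.3f}")
print(f"LOH prior:            posterior odds = {odds_loh:.3f}")
```

Under these assumptions the point-mutation prior leaves the posterior odds below 1, whereas the roughly 80-fold larger LOH prior pushes them above 1, which is why LOH is the more plausible mechanism for this back mutation.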

Evidence for recurrent mutations in single-cell panel data

We found the strongest evidence against the ISA in single-cell sequencing data from the personalized panels of six childhood acute lymphoblastic leukemia (ALL) patients ³⁴. Our test returns extremely high BFs in the range of 10⁵ to 10¹⁵ (Table 2) for five of the cases, and a more modest but still highly significant BF estimate of 330 for one patient sample (patient 2). For all samples apart from patient 5, the recurrent mutation is a back mutation. Looking at the trees (Supplementary Figures 14–19), we notice that for three patients the lost mutation is actually the first one that happened in their trees: these mutations affect the MAL2 gene in patient 1, RIMS2 in patient 2 and SUSD2 in patient 6. For patient 4, the lost mutation was in IKBKB, which was also acquired in the tree trunk, while the last case, patient 3, lost a mutation in CUL3 that was acquired further down in a branch of the tree. Interestingly, three out of the five back mutations also occur on chromosome 8. The overrepresentation of reversions of early clonal mutations could hint at changing selective pressure that renders an early trunk mutation expendable or even hindering in later tumor stages. Signs of this biological possibility have recently been observed for Barrett’s oesophagus ³⁵.

Since LOH events are the most likely causes of back mutations, we compared our results with the 16 LOH events (>10 kb) detected from the bulk data of the 6 leukemia patients ³⁴. However, the single-cell data showed that the large majority (13 out of 16) appeared in all clones and were ancestral ³⁴. None of the five back mutations we identified appeared in any of the LOH regions of the respective patient, emphasizing that they are unlikely to be the result of large-scale deletions. The data then indicate either smaller-scale deletions or genuine back mutations with a reversion of the individual loci.

We further examined whether LOH at the loci we identified is common in ALL. To obtain such statistics, we performed a comparison with large-scale (>5 Mb) copy number deletions found in a large study of 142 children and 123 adults with ALL ³⁶. Patients 1 and 2 had back mutations on chromosome 8q, where no deletions were observed in any of the study samples, except for a loss of the entire chromosome 8 in one adult. Patient 3 had a back mutation on chromosome 2q, where deletions were also not observed in any of the study samples. The back mutations for patients 1–3 therefore do not seem to match common large-scale deletion events either, but could be the result of smaller losses. The 8q back mutations of patients 1 and 2 are close to MYC, which plays an important role in ALL ³⁷. Patient 4’s back mutation was at 8p, which was lost in 4 children and 4 adults, while patient 6’s back mutation was at 22q, and chromosome 22 was subject to large-scale deletion in 4 children and one adult in the ALL sample ³⁶. Their large BFs could be related to these relatively common LOH events. Patient 6’s back mutation also happened to be near IGL, which is rarely translocated with MYC.

For patient 5, we observed (Figure 3) a parallel mutation in C1orf105 with a BF of 4.8 × 10¹⁵ so that allowing the mutation to occur twice explains the data much better than enforcing the ISA. Since sequencing bias is an unlikely explanation for the extreme BF based on analyzing the read counts in the cells (Supplementary Note), our conclusion is that we are observing here a real signal of the same genomic position mutating twice in different subpopulations of a tumor.

Figure 3: (a) The data matrix of the 105 mutations detected in the 96 single cells of patient 5 of the leukemia dataset ³⁴. Unmutated positions are left white, mutations are colored blue and the recurrent mutation in C1orf105 colored red. (b) The inferred mutational history under the finite sites model when allowing a recurrence of the point mutation in C1orf105. The two occurrences appear at the ends of different lineages in the tree, separated in the two branches by 35 and 18 other mutations. The very large Bayes factor of 4.8 × 10¹⁵ shows that allowing the parallel mutation fits the data much better than enforcing the infinite sites assumption.

Signs of secondary parallel mutations

Since back mutations violate the ISA but may have a simpler biological cause from LOH than the single genomic position reverting, we wished to examine parallel mutations more closely because these act at the level of individual bases. In particular, we restricted our search to consider only the highest scoring parallel mutation for each dataset. This may reveal additional violations of the ISA.

For the exome data, the recurrent mutation uncovered from the myeloproliferative neoplasm ³¹ is already parallel and no other parallel mutation scored highly. No evidence for infinite sites violations was discovered for the kidney cancer ³², and for the breast cancer samples ³³ no parallel mutation scored highly. For the panel data ³⁴ on the other hand, we find parallel mutations for patients 1–4 with BFs larger than 1 (Supplementary Table 3). Three of them have moderate BFs, but for patient 3 we find a large BF of 2.4 × 10⁶ which indicates multiple violations of the infinite sites hypothesis.

For patient 5 we also found multiple parallel mutations. The top-scoring recurrence was already a parallel mutation (Table 2), but the second highest scoring recurrence is also parallel with a very large BF of 4.1 × 10¹⁰. That mutation occurs on chromosome 9 at position 139923258 (hg19) which is at the ends of the ABCA2 and C9orf139 genes.

Discussion

We have developed a statistical framework to test the infinite sites assumption in single-cell sequencing data. Application of our framework to published patient data (one myeloproliferative neoplasm ³¹, one renal cell carcinoma ³², one breast tumor ³³, and six leukemia patients ³⁴) suggests that the assumption is frequently violated. We showed that these findings cannot be explained by the background mutation rate alone, as the prior probability of mutating the same base twice among a selected set of bases is low if mutations are spread uniformly across the genome (Supplementary Table 5).

Most of the observed violations of the infinite sites assumption present as back mutations, typically as the loss of an early clonal mutation. This may be the result of random losses of passenger mutations, but observing this pattern in many patient samples would also be compatible with selection driven by the micro-environmental or the genetic context. For example, early driver mutations may become obsolete once the tumor is established, or may even hinder the tumor at later stages so that their loss becomes positively selected for. Hints of changing selective pressures on particular aberrations have recently been observed for Barrett’s oesophagus ³⁵. Loss of a copy of the p16 locus seemed to provide a fitness advantage for clones experiencing acid reflux but a disadvantage when the acid is suppressed under treatment. Clones that regain the p16 copy could then potentially experience positive selection. For half of the leukemia patients the back mutation occurs on chromosome 8, pointing to a particular role in the development of the disease. A simpler explanation for back mutations is LOH, the loss of a chromosomal segment that comprises a mutated site. In tumors rich in copy number alterations such an event would have a reasonably high prior probability, as the same site is much more easily hit by two or more such large-scale alterations than by two point mutations. In the leukemia dataset ³⁴, the back mutations we identified did not occur in genomic regions affected by large-scale deletions. While our findings on the incidence of back mutations are limited to the small number of patient samples available at this point, they may be of importance in the context of treatment strategies that target early trunk mutations in cancer therapy. Our method can be used to determine the trunk mutations more accurately, as is evident particularly for the breast cancer sample ³³ (Supplementary Figure 13).

We also found evidence for parallel mutations in two of the studied cases, patient 5 of the leukemia dataset ³⁴ (Figure 3) and the JAK2-negative myeloproliferative neoplasm ³¹. In both cases, the two mutation copies appear at the end of different lineages, which could also point to selective pressure from the tumor environment or the genetic context. Having corrected for the possibility of doublet samples in our model, the event of a mutation hitting the same site twice appears here to be the most plausible explanation. Conservative estimates of the prior odds of recurrent mutations among a small set of mutations of interest were obtained by spreading mutations uniformly across the genome and assuming that all mutations are observed (Supplementary Table 5). With these low prior estimates, the posterior probability of the infinite sites hypothesis is still larger for the exome data of the myeloproliferative neoplasm ³¹. For patient 5 of the leukemia panel data ³⁴ the BF is large enough that the posterior odds are certainly in favor of the infinite sites hypothesis being violated. These data are then the ‘smoking gun’ showing that the possibility of infinite sites violations needs to be seriously considered and treated for single-cell data. Again, larger sample sizes will be needed to better assess the practical implications of these findings, but modeling single-cell data while allowing violations of the infinite sites hypothesis provides the statistical framework for exactly that.

The possibility of violations of the infinite sites assumption necessitates substantial adaptations in present-day models for reconstructing mutation histories of tumors. For example, in models designed for bulk sequencing data, a core assumption to deconvolve admixed mutation profiles is that the cellular frequency of a point mutation distributes over a single clade in the tumor phylogeny, a restriction that is contrary to the recurrence of a mutation in different parts of the tree. When looking at models based on single-cell data such as SCITE ¹⁸, the changes necessary to accommodate finite sites seem less profound, as indicated by the extension introduced in this paper to allow a single recurrent mutation. We also employed this method to search for multiple recurrences by restricting the recurrence to parallel mutations in data where higher scoring back mutations had been observed. This uncovered evidence of multiple violations of the ISA, but a strict statistical test would need to account for the higher scoring recurrences as well. However the generalization towards the recurrence of an unknown number of mutations in unknown multiplicities entails a vast extension of the underlying search space.

For single-cell data, we additionally have the issue of high doublet rates which, as we have seen, can severely affect reconstruction quality when not being explicitly modeled. While the accidental sequencing of more than one cell could be relatively easily prevented by rigorously checking samples prior to sequencing, it is likely to take some time before this issue is solved reasonably well for all technology platforms including high-throughput assays. Meanwhile it is essential to integrate doublets in models for reconstructing mutation histories from single-cell data. Especially for testing the ISA, modeling doublets is necessary since even a small number of doublets can interfere with the test. As we have shown in this work, modeling doublets is straightforward for a mutation-centric approach like SCITE ¹⁸. For sample-centric approaches such as BitPhylogeny ¹⁶ and OncoNEM ¹⁷, the integration of doublets may be a bit more involved, as the topology underlying the evolutionary history is no longer tree-like in the presence of admixed samples.

We focused in this work on testing the infinite sites assumption for point mutations in tumor evolution. This extends more generally to any cell lineages and their phylogeny, where we know that violations become increasingly likely for larger sets of cells and mutations. Looking at larger-scale lesions in cancer, such as copy number alterations, the importance of allowing recurrent mutations becomes even more pronounced. These alterations typically affect larger segments, which makes it much more likely that the same site is affected multiple times. To model these types of lesions, either alone or together with SNVs to integrate LOH, dropping the infinite sites assumption becomes even more crucial. Recent studies using the less restrictive infinite alleles assumption ²¹ or Dollo parsimony ²² are promising first steps, but additional work on accurate models of tumor evolution and their inference from data is essential.

Author contributions

JK and KJ developed and implemented the method. All authors conceived and designed the study. All authors drafted the manuscript and approved the final version.

Funding

JK was supported by ERC Synergy Grant 609883 (erc.europa.eu/). KJ was supported by SystemsX.ch RTD Grant 2013/150 (www.systemsx.ch/).

Competing Interests

The authors declare that they have no competing financial interests.

Correspondence

Correspondence and requests for materials should be addressed to Niko Beerenwinkel (email: niko.beerenwinkel{at}bsse.ethz.ch).

Acknowledgements

We thank Giusi Moffa for very useful discussions about the Bayes Factor comparison ²⁹ and Jochen Singer for bioinformatics support with the leukemia data³⁴.

References

1. Greaves, M. & Maley, C. C. Clonal evolution in cancer. Nature 481, 306–313 (2012).
2. Ding, L. et al. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481, 506–510 (2012).
3. Van Loo, P. & Voet, T. Single cell analysis of cancer genomes. Current Opinion in Genetics & Development 24, 82–91 (2014).
4. Navin, N. E. Cancer genomics: one cell at a time. Genome Biology 15 (2014).
5. Strino, F., Parisi, F., Micsinai, M. & Kluger, Y. TrAp: a tree approach for fingerprinting subclonal tumor composition. Nucleic Acids Research 41, e165 (2013).
6. Hajirasouliha, I., Mahmoody, A. & Raphael, B. J. A combinatorial approach for analyzing intra-tumor heterogeneity from high-throughput sequencing data. Bioinformatics 30, i78–i86 (2014).
7. Qiao, Y. et al. SubcloneSeeker: a computational framework for reconstructing tumor clone structure for cancer variant interpretation and prioritization. Genome Biology 15, 443 (2014).
8. Kim, K. I. & Simon, R. Using single cell sequencing data to model the evolutionary history of a tumor. BMC Bioinformatics 15, 27 (2014).
9. Popic, V. et al. Fast and scalable inference of multi-sample cancer lineages. CoRR, abs/1412.8574 (2014).
10. Jiao, W., Vembu, S., Deshwar, A. G., Stein, L. & Morris, Q. Inferring clonal evolution of tumors from single nucleotide somatic mutations. BMC Bioinformatics 15, 35 (2014).
11. El-Kebir, M., Oesper, L., Acheson-Field, H. & Raphael, B. J. Reconstruction of clonal trees and tumor composition from multi-sample sequencing data. Bioinformatics 31, i62–i70 (2015).
12. Deshwar, A. G. et al. PhyloWGS: Reconstructing subclonal composition and evolution from whole-genome sequencing of tumors. Genome Biology 16, 35 (2015).
13. Niknafs, N., Beleva-Guthrie, V., Naiman, D. Q. & Karchin, R. SubClonal hierarchy inference from somatic mutations: automatic reconstruction of cancer evolutionary trees from multiregion next generation sequencing. PLoS Computational Biology 11, e1004416 (2015).
14. Malikic, S., McPherson, A. W., Donmez, N. & Sahinalp, C. S. Clonality inference in multiple tumor samples using phylogeny. Bioinformatics 31, 1349–1356 (2015).
15. Donmez, N. et al. Clonality inference from single tumor samples using low coverage sequence data. In International Conference on Research in Computational Molecular Biology, 83–94 (Springer, 2016).
16. Yuan, K., Sakoparnig, T., Markowetz, F. & Beerenwinkel, N. BitPhylogeny: a probabilistic framework for reconstructing intra-tumor phylogenies. Genome Biology 16, 36 (2015).
17. Ross, E. & Markowetz, F. OncoNEM: Inferring tumour evolution from single-cell sequencing data. Genome Biology 17, 69 (2016).
18. Jahn, K., Kuipers, J. & Beerenwinkel, N. Tree inference for single-cell data. Genome Biology 17, 86 (2016).
19. Jiang, Y., Qiu, Y., Minn, A. J. & Zhang, N. R. Assessing intratumor heterogeneity and tracking longitudinal and spatial clonal evolutionary history by next-generation sequencing. Proceedings of the National Academy of Sciences 113, E5528–E5537 (2016).
20. Kimura, M. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics 61, 893 (1969).
21. El-Kebir, M., Satas, G., Oesper, L. & Raphael, B. J. Inferring the mutational history of a tumor using multi-state perfect phylogeny mixtures. Cell Systems 3, 43–53 (2016).
22. McPherson, A. et al. Divergent modes of clonal spread and intraperitoneal mixing in high-grade serous ovarian cancer. Nature Genetics 48, 758–767 (2016).
23. Gusfield, D. Algorithms on strings, trees and sequences: computer science and computational biology (Cambridge University Press, Cambridge, 1997).
24. Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. New England Journal of Medicine 366, 883–892 (2012).
25. Kovac, M. et al. Recurrent chromosomal gains and heterogeneous driver mutations characterise papillary renal cancer evolution. Nature Communications 6, 6336 (2015).
26. Tomasetti, C. & Vogelstein, B. Variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science 347, 78–81 (2015).
27. Lynch, M. Rate, molecular spectrum, and consequences of human mutation. Proceedings of the National Academy of Sciences 107, 961–968 (2010).
28. Kass, R. E. & Raftery, A. E. Bayes factors. Journal of the American Statistical Association 90, 773–795 (1995).
29. Moffa, G. et al. Refining pathways: A model comparison approach. PLoS ONE 11, e0155999 (2016).
30. Fluidigm. Doublet rate and detection on the C1 IFCs (2016). White Paper PN 101–2711 A1.
31. Hou, Y. et al. Single-cell exome sequencing and monoclonal evolution of a JAK2-negative myeloproliferative neoplasm. Cell 148, 873–885 (2012).
32. Xu, X. et al. Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 148, 886–895 (2012).
33. Wang, Y. et al. Clonal evolution in breast cancer revealed by single nucleus genome sequencing. Nature 512, 155–160 (2014).
34. Gawad, C., Koh, W. & Quake, S. R. Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics. Proceedings of the National Academy of Sciences 111, 17947–17952 (2014).
35. Martinez, P. et al. Dynamic clonal equilibrium and predetermined cancer risk in Barrett’s oesophagus. Nature Communications 7, 12158 (2016).
36. Forero-Castro, M. et al. Genome-wide DNA copy number analysis of acute lymphoblastic leukemia identifies new genetic markers associated with clinical outcome. PLoS ONE 11, e0148972 (2016).
37. Forero, R. M., Hernández, M. & Hernández-Rivas, J. M. Genetics of acute lymphoblastic leukemia. In Guenova, M. & Balatzenko, G. (eds.) Leukemia, 1–37 (InTech, 2013).