  • Letter

Single-cell 5hmC sequencing reveals chromosome-wide cell-to-cell variability and enables lineage reconstruction

Abstract

The epigenetic DNA modification 5-hydroxymethylcytosine (5hmC) has crucial roles in development and gene regulation (refs. 1–7). Quantifying the abundance of this epigenetic mark at the single-cell level could enable us to understand its roles. We present a single-cell, genome-wide and strand-specific 5hmC sequencing technology, based on 5hmC glucosylation and glucosylation-dependent digestion of DNA, that reveals pronounced cell-to-cell variability in the abundance of 5hmC on the two DNA strands of a given chromosome. We develop a mathematical model that reproduces the strand bias and use this model to make two predictions. First, the variation in strand bias should decrease when 5hmC turnover increases. Second, the strand bias of two sister cells should be strongly anti-correlated. We validate these predictions experimentally, and use our model to reconstruct lineages of two- and four-cell mouse embryos, showing that single-cell 5hmC sequencing can be used as a lineage reconstruction tool.

Figure 1: Schematic of the scAba-seq method.
Figure 2: The levels of 5hmC reflect strong bias between the two strands of DNA of a chromosome.
Figure 3: Stochastic model explains chromosome-wide strand bias.
Figure 4: 5hmC strand bias enables lineage reconstruction.

Accession codes

Primary accessions

Gene Expression Omnibus

References

  1. Kriaucionis, S. & Heintz, N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 324, 929–930 (2009).

  2. Tahiliani, M. et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324, 930–935 (2009).

  3. Shen, L. et al. Genome-wide analysis reveals TET- and TDG-dependent 5-methylcytosine oxidation dynamics. Cell 153, 692–706 (2013).

  4. Ficz, G. et al. Dynamic regulation of 5-hydroxymethylcytosine in mouse ES cells and during differentiation. Nature 473, 398–402 (2011).

  5. Booth, M.J. et al. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science 336, 934–937 (2012).

  6. Yu, M. et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell 149, 1368–1380 (2012).

  7. Sun, Z. et al. High-resolution enzymatic mapping of genomic 5-hydroxymethylcytosine in mouse embryonic stem cells. Cell Rep. 3, 567–576 (2013).

  8. Shapiro, E., Biezuner, T. & Linnarsson, S. Single-cell sequencing-based technologies will revolutionize whole-organism science. Nat. Rev. Genet. 14, 618–630 (2013).

  9. Kolodziejczyk, A.A., Kim, J.K., Svensson, V., Marioni, J.C. & Teichmann, S.A. The technology and biology of single-cell RNA sequencing. Mol. Cell 58, 610–620 (2015).

  10. Shalek, A.K. et al. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498, 236–240 (2013).

  11. Treutlein, B. et al. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509, 371–375 (2014).

  12. Jaitin, D.A. et al. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 343, 776–779 (2014).

  13. Zeisel, A. et al. Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).

  14. Raj, A. & van Oudenaarden, A. Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216–226 (2008).

  15. Guo, H. et al. Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res. 23, 2126–2135 (2013).

  16. Smallwood, S.A. et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat. Methods 11, 817–820 (2014).

  17. Farlik, M. et al. Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics. Cell Rep. 10, 1386–1397 (2015).

  18. Lee, H.J., Hore, T.A. & Reik, W. Reprogramming the methylome: erasing memory and creating diversity. Cell Stem Cell 14, 710–719 (2014).

  19. Horton, J.R. et al. Structure of 5-hydroxymethylcytosine-specific restriction enzyme, AbaSI, in complex with DNA. Nucleic Acids Res. 42, 7947–7959 (2014).

  20. Valinluck, V. & Sowers, L.C. Endogenous cytosine damage products alter the site selectivity of human DNA maintenance methyltransferase DNMT1. Cancer Res. 67, 946–950 (2007).

  21. Inoue, A. & Zhang, Y. Replication-dependent loss of 5-hydroxymethylcytosine in mouse preimplantation embryos. Science 334, 194 (2011).

  22. Zhao, L. et al. The dynamics of DNA methylation fidelity during mouse embryonic stem cell self-renewal and differentiation. Genome Res. 24, 1296–1307 (2014).

  23. Blaschke, K. et al. Vitamin C induces Tet-dependent DNA demethylation and a blastocyst-like state in ES cells. Nature 500, 222–226 (2013).

  24. Huh, Y.H., Cohen, J. & Sherley, J.L. Higher 5-hydroxymethylcytosine identifies immortal DNA strand chromosomes in asymmetrically self-renewing distributed stem cells. Proc. Natl. Acad. Sci. USA 110, 16862–16867 (2013).

  25. Spruijt, C.G. et al. Dynamic readers for 5-(hydroxy)methylcytosine and its oxidized derivatives. Cell 152, 1146–1159 (2013).

  26. Iurlaro, M. et al. A screen for hydroxymethylcytosine and formylcytosine binding proteins suggests functions in transcription and chromatin regulation. Genome Biol. 14, R119 (2013).

  27. Shukla, A., Sehgal, M. & Singh, T.R. Hydroxymethylation and its potential implication in DNA repair system: A review and future perspectives. Gene 564, 109–118 (2015).

  28. Falconer, E. et al. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution. Nat. Methods 9, 1107–1112 (2012).

  29. Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012).

Acknowledgements

We thank L. Kaaij, M. Bienko, M. Staps, M. Welling, W. Reik and members of the van Oudenaarden laboratory for constructive feedback. We would also like to thank S. van der Elst and R. van der Linden for their assistance with flow sorting and M. Verheul and E. de Bruijn at the Utrecht DNA Sequencing Facility for assistance with Illumina sequencing. This work was supported by NWO (VICI award) and ERC (ERC-AdG 294325-GeneNoiseControl) grants.

Author information

Contributions

D.M., S.S.D. and A.v.O. designed the study. D.M., S.S.D. and J.-C.B. performed experiments. D.M., S.S.D. and A.v.O. performed analysis. S.S.D. and A.v.O. performed modeling. N.C. provided key inputs for initial technology development. D.M., S.S.D. and A.v.O. wrote the manuscript.

Corresponding author

Correspondence to Alexander van Oudenaarden.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Distributions of sequencing depth for scAba-Seq libraries.

Histograms depicting the depth of sequencing for each of the eight libraries processed in this study are shown. Fold oversequencing is defined as the number of times each unique read (or 5hmC site) is sequenced (also see Supplementary Table 1).
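
As a rough illustration of the definition above, fold oversequencing can be tallied by counting how often each unique read occurs. This is a minimal sketch: the function names and the representation of a read as a (chromosome, position, strand) tuple are assumptions made for illustration and are not taken from the authors' pipeline.

```python
from collections import Counter

def fold_oversequencing(reads):
    """Map each unique read to the number of times it was sequenced."""
    return Counter(reads)

def oversequencing_histogram(reads):
    """Map each fold value to the number of unique reads with that fold."""
    return Counter(fold_oversequencing(reads).values())

# Toy data: each read is a hypothetical (chromosome, position, strand) tuple.
reads = [
    ("chr1", 100, "+"), ("chr1", 100, "+"), ("chr1", 100, "+"),
    ("chr2", 250, "-"),
    ("chrX", 40, "+"), ("chrX", 40, "+"),
]

hist = oversequencing_histogram(reads)
# Average fold oversequencing: total reads divided by unique reads.
mean_fold = len(reads) / len(set(reads))
print(dict(hist), mean_fold)
```

A library-level histogram of the kind described in the caption would be built from `hist`, with one such histogram per library.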

Supplementary Figure 2 Distribution of 50 nucleotides upstream and downstream of the detected 5hmC sites.

The unequal nucleotide contribution at position 0 corresponds to the 5hmCG site (99% cytosine); the signal 9–11 bp upstream of the 5hmCG site corresponds to the ligation junction; and the signal 20 bp upstream of the 5hmCG site (35% cytosine) corresponds to the symmetrical cytosine reported by Sun, Z. et al. Cell Reports (2013).

Supplementary Figure 3 Relative amount of spike-in molecules detected as a function of the absolute abundance of the spike-in molecules.

A spike-in molecule is considered ‘detected’ if at least one read of that particular spike-in sequence is present in the sequencing library. Each data point is an average over several spike-in sequences with similar abundance. The data are fitted well by a model where a single spike-in molecule is detected with a probability p = 0.05 (red solid curve).
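
One simple way to read the detection model above is sketched here. It assumes that each molecule of a spike-in sequence is detected independently with probability p, so a sequence present in n copies is detected with probability 1 − (1 − p)^n. The caption gives p = 0.05; the independence assumption and the function name are ours, and the authors' exact fitting procedure is not described in this section.

```python
def detection_probability(n_molecules, p=0.05):
    """Probability that at least one of n molecules is detected,
    assuming independent detection of each molecule with probability p."""
    return 1.0 - (1.0 - p) ** n_molecules

# Expected fraction of spike-in sequences detected at several abundances.
for n in (1, 10, 50, 100):
    print(n, round(detection_probability(n), 3))
```

Under this assumption the detected fraction rises steeply with abundance and saturates near 1 once a sequence is present in many tens of copies.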

Supplementary Figure 4 Distribution of 5hmCG sites within different genomic elements for E14tg2a cells.

a, Genomic 5hmCG distribution for 10,000 bulk E14tg2a cells processed with Aba-seq. b, Average genomic 5hmCG distribution for 480 single E14tg2a cells processed with scAba-seq.

Supplementary Figure 5 Correlation between scAba-seq and bulk 5hmC measurements in E14tg2a mouse embryonic stem cells.

a, For 10-kb bins, the number of unique 5hmC sites averaged over all single cells correlated strongly with a bulk Aba-seq experiment (Pearson r = 0.89). b, Pearson correlation between scAba-seq and established bulk 5hmC measurements, Aba-seq (black) and TAB-seq (red), for a range of bin sizes.
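
The comparison described above can be sketched as follows: bin the 5hmC site positions, average the per-bin counts over single cells, and compute the Pearson correlation against the binned bulk counts. The positions below are toy values and the helper names are hypothetical; only the 10-kb bin size comes from the caption.

```python
import math
from collections import Counter

def bin_counts(positions, bin_size, n_bins):
    """Count sites per fixed-width genomic bin."""
    counts = Counter(pos // bin_size for pos in positions)
    return [counts.get(i, 0) for i in range(n_bins)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

BIN_SIZE = 10_000  # 10-kb bins, as in panel a
N_BINS = 4

# Toy 5hmC site positions for three single cells and one bulk sample.
single_cells = [
    [1200, 15000, 15500],
    [900, 16000, 31000],
    [14000, 15200, 32000],
]
bulk = [500, 1500, 12000, 13000, 14000, 15000, 16000, 33000, 34000]

per_cell = [bin_counts(c, BIN_SIZE, N_BINS) for c in single_cells]
# Average the per-bin counts over all single cells.
sc_mean = [sum(col) / len(per_cell) for col in zip(*per_cell)]
bulk_binned = bin_counts(bulk, BIN_SIZE, N_BINS)

r = pearson_r(sc_mean, bulk_binned)
print(sc_mean, bulk_binned, r)
```

Repeating the calculation with different values of `BIN_SIZE` would give a curve of the kind shown in panel b.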

Supplementary Figure 6 Theoretical strand bias distributions estimated from Model I.

a-f, The panels show strand bias distributions for a chromosome with one copy computed for different values of k (= k1 + k2) using Model I. The value of k is indicated above each panel. The figure shows that for low values of k the strand bias distribution does not have any values around f = 0.5, whereas high values of k give a unimodal strand bias distribution. Thus, Model I fails to capture the experimental data.

Supplementary Figure 7 Schematic describing Model II.

The schematic shows the intrinsic noise model (Model II) where the reversible reaction describing 5hmC dynamics is modeled stochastically. The figure shows a hypothetical simulation where 5hmC dynamics of the old and new strands are shown in black and red, respectively. The gray region depicts a time-window of one cell cycle during which strand bias is measured.

Supplementary Figure 8 Behavior of the variance of the strand bias distribution for Model II is different from the experimental data.

a, The figure shows the same data as that in Fig. 2b for chromosome X. The variance of the strand bias distribution after downsampling the number of detected 5hmC sites in chromosome X shows that beyond approximately 50 sites, the variance is independent of the number of sites. b, Model II predicts that the variance of the strand bias distribution is inversely proportional to λ, the average number of sites that can theoretically be quantified on a strand of a chromosome. Thus, Model II fails to explain the experimental data.
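
The contrast between the two panels can be illustrated with a toy simulation. This is not Model II itself, whose details are given in the Supplementary Notes. It is a sketch under our own assumptions: strands are assigned by binomial sampling, and cells either share a strand bias of 0.5 or carry a hypothetical chromosome-wide bias of 0.2 or 0.8. With no cell-to-cell bias, the variance of the observed strand bias falls roughly as 1/N with the number of sampled sites N; with a chromosome-wide bias, the variance plateaus.

```python
import random
import statistics

def downsampled_bias_variance(true_biases, n_sites, rng):
    """Variance across cells of the observed plus-strand fraction when
    n_sites sites are sampled per cell (binomial sampling assumed)."""
    observed = []
    for f in true_biases:
        plus = sum(1 for _ in range(n_sites) if rng.random() < f)
        observed.append(plus / n_sites)
    return statistics.pvariance(observed)

rng = random.Random(0)
n_cells = 400
# Hypothetical populations: no bias vs. a chromosome-wide bias per cell.
unbiased = [0.5] * n_cells
biased = [rng.choice([0.2, 0.8]) for _ in range(n_cells)]

results = {}
for n_sites in (10, 50, 200):
    results[n_sites] = (
        downsampled_bias_variance(unbiased, n_sites, rng),
        downsampled_bias_variance(biased, n_sites, rng),
    )
    print(n_sites, results[n_sites])
```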

Supplementary Figure 9 Schematic describing Model III.

The schematic describes an extrinsic noise model where k1 and k2 are normally distributed random variables with both strands having the same rate constants within one cell cycle. 5hmC dynamics of the old and new strands in this hypothetical simulation are shown in black and red, respectively. Strand bias is measured within a time-window of one cell cycle, which is shown in gray.

Supplementary Figure 10 Theoretical strand bias distributions estimated from Model III.

a-f, The figures show theoretical strand bias distributions for a chromosome with one copy computed using Model III. The value of k used for each of the theoretical simulations is shown above the panel. For small values of k, the model predicts no strand bias around f = 0.5 while large values of k predict a unimodal strand bias distribution. Thus, Model III fails to explain the experimental data.

Supplementary Figure 11 χ² values comparing experimental data from the autosomes to Model IV.

The figure shows χ² values for different mean values of k1 and k2 (μ1 and μ2, respectively). For two-dimensional visualization, the coefficients of variation of the distributions for k1 and k2 were fixed at 0.25.

Supplementary Figure 12 Rates of 5hmC turnover for individual chromosomes in E14tg2a cells.

The 5hmC turnover rates for individual chromosomes are estimated using Model IV. The figure shows bootstrapped estimates for the mean and standard deviation of k1+k2. All chromosomes show similar rates of 5hmC turnover. “A” indicates the 5hmC turnover rate for all the autosomes combined.

Supplementary Figure 13 5hmC strand bias between sister cells at the 2-cell stage of embryogenesis.

Bar plot shows 5hmC strand bias for chromosomes 1-7 for sister cells 1-2 and 3-4 derived from 2-cell mouse embryos. Strand bias towards the plus strand or the minus strand within 10-Mb bins is indicated in red and black, respectively. Putative sister chromatid exchange (SCE) events are marked by arrows.
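
A per-bin strand bias of the kind plotted here could be computed along the following lines. The sketch assumes that strand bias is defined as the fraction of detected 5hmC sites on the plus strand, so that 0.5 means no bias; that definition, the function name and the toy positions are assumptions, and only the 10-Mb bin size comes from the caption.

```python
from collections import defaultdict

def strand_bias_per_bin(sites, bin_size=10_000_000):
    """sites: iterable of (position, strand) with strand '+' or '-'.
    Returns {bin_index: plus_sites / (plus_sites + minus_sites)}."""
    plus = defaultdict(int)
    total = defaultdict(int)
    for pos, strand in sites:
        b = pos // bin_size
        total[b] += 1
        if strand == "+":
            plus[b] += 1
    # Bins without any detected site are omitted rather than assigned a value.
    return {b: plus[b] / total[b] for b in sorted(total)}

# Toy 5hmC sites on one chromosome of one cell.
sites = [
    (1_000_000, "+"), (2_000_000, "+"), (3_000_000, "-"),
    (12_000_000, "-"), (15_000_000, "-"),
    (25_000_000, "+"),
]
bias = strand_bias_per_bin(sites)
print(bias)
```

A bin with a value above 0.5 would be drawn as biased towards the plus strand, and a bin below 0.5 as biased towards the minus strand.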

Supplementary Figure 14 5hmC strand bias can be used to identify sister chromatid exchange events.

Schematic showing that a sister chromatid exchange (SCE) event in the G2 phase of the cell cycle in a mother cell can be detected in the two daughter cells. n and m indicate the number of sites that are hydroxymethylated on the old (black) and new (dashed red) strand, respectively. For sister cells at the 2-cell stage of mouse embryogenesis, the sum of strand bias at any location in the genome equals 1, even in the presence of SCE events. As sister cells share the location at which SCE events occur, these events can be used to identify sister cells at the 4-cell stage of mouse embryogenesis (Fig. 4f).
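
The property stated above can be checked with a small numerical sketch. It reflects our reading of the schematic rather than the authors' code: strand bias is taken as the plus-strand fraction, one sister carries the old strand (n sites per bin) as its plus strand and the other carries it as its minus strand, and an SCE swaps the two arrangements in every bin distal to the exchange point. The function name and the values of n and m are hypothetical.

```python
def sister_biases(n, m, n_bins, sce_bin=None):
    """Per-bin plus-strand fraction for two sister cells.

    n, m    : 5hmC sites per bin on the old and new strand (toy values)
    sce_bin : index of the first bin distal to an SCE, or None for no SCE
    """
    bias_a, bias_b = [], []
    for i in range(n_bins):
        swapped = sce_bin is not None and i >= sce_bin
        # Sister A: old strand is the plus strand, unless swapped by the SCE.
        plus_a, minus_a = (m, n) if swapped else (n, m)
        # Sister B always carries the complementary arrangement.
        plus_b, minus_b = minus_a, plus_a
        bias_a.append(plus_a / (plus_a + minus_a))
        bias_b.append(plus_b / (plus_b + minus_b))
    return bias_a, bias_b

a, b = sister_biases(n=30, m=10, n_bins=6, sce_bin=4)
sums = [x + y for x, y in zip(a, b)]
print(a, b, sums)
```

In this toy case both sisters switch bias at the same bin, which is the shared breakpoint that allows sister cells to be paired, while the per-bin sum stays at 1.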

Supplementary Figure 15 Theoretical basis for inferring the lineage of 4-cell mouse embryos.

Schematic showing that the sum of strand biases for four cells that derive from the same oocyte at any location in the genome is equal to 2. n, m and q are the number of 5hmC sites per unit length of the genome for the black, dashed red and dashed green strand, respectively.
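
The sum rule can be written out explicitly under one reading of the schematic. Assume that strand bias f is the fraction of 5hmC sites on the plus strand, that the two cells of the 2-cell embryo carry (plus, minus) densities (n, m) and (m, n), and that every strand synthesized in the following division has density q. These assignments are our assumptions for illustration; the caption itself only defines n, m and q.

```latex
% Four cells of the 4-cell embryo, written as (plus, minus) strand densities:
%   from cell 1: (n, q) and (q, m);  from cell 2: (m, q) and (q, n)
\begin{align}
\sum_{i=1}^{4} f_i
  &= \frac{n}{n+q} + \frac{q}{q+m} + \frac{m}{m+q} + \frac{q}{q+n} \\
  &= \left(\frac{n}{n+q} + \frac{q}{n+q}\right)
   + \left(\frac{m}{m+q} + \frac{q}{m+q}\right) \\
  &= 2 .
\end{align}
```

The terms pair up by shared denominator, so the result does not depend on the values of n, m and q.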

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–15 and Supplementary Notes 1 and 2 (PDF 1989 kb)

Supplementary Table 1

Sequencing statistics (XLS 63 kb)

Supplementary Table 2

scAba-Seq adapter sequences (XLS 27 kb)

About this article

Cite this article

Mooijman, D., Dey, S., Boisset, JC. et al. Single-cell 5hmC sequencing reveals chromosome-wide cell-to-cell variability and enables lineage reconstruction. Nat Biotechnol 34, 852–856 (2016). https://doi.org/10.1038/nbt.3598

  • DOI: https://doi.org/10.1038/nbt.3598
