Efficient de novo assembly of single-cell bacterial genomes from short-read data sets

Abstract

Whole genome amplification by the multiple displacement amplification (MDA) method allows sequencing of DNA from single cells of bacteria that cannot be cultured. Assembling a genome is challenging, however, because MDA generates highly nonuniform coverage of the genome. Here we describe an algorithm tailored for short-read data from single cells that improves assembly through the use of a progressively increasing coverage cutoff. Assembly of reads from single Escherichia coli and Staphylococcus aureus cells captures >91% of genes within contigs, approaching the 95% captured from an assembly based on many E. coli cells. We apply this method to assemble a genome from a single cell of an uncultivated SAR324 clade of Deltaproteobacteria, a cosmopolitan bacterial lineage in the global ocean. Metabolic reconstruction suggests that SAR324 is aerobic, motile and chemotaxic. Our approach enables acquisition of genome assemblies for individual uncultivated bacteria using only short reads, providing cell-specific genetic information absent from metagenomic studies.
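The idea of a progressively increasing coverage cutoff can be illustrated with a toy model. The sketch below is not the Velvet-SC implementation; it assumes a simplified contig graph in which each contig has only a length and an average coverage, and the names `Contig`, `merge_linear_paths` and `progressive_cutoff` are hypothetical. At each threshold it removes contigs whose coverage falls below the cutoff, merges the non-branching paths that result, and recomputes coverage as a length-weighted average, so a low-coverage contig attached to a high-coverage neighbor can survive a later, higher cutoff.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    length: int       # contig length in bases (toy model: k-mer overlap ignored)
    coverage: float   # average read coverage over the contig


def merge_linear_paths(contigs, edges):
    """Merge u -> v whenever u has one successor and v has one predecessor.

    The merged contig keeps the id of u; its coverage is the
    length-weighted average of the two parts.
    """
    merged = True
    while merged:
        merged = False
        for u, v in sorted(edges):
            outs = [e for e in edges if e[0] == u]
            ins = [e for e in edges if e[1] == v]
            if u != v and len(outs) == 1 and len(ins) == 1:
                a, b = contigs[u], contigs[v]
                total = a.length + b.length
                cov = (a.length * a.coverage + b.length * b.coverage) / total
                contigs[u] = Contig(total, cov)
                del contigs[v]
                # Drop the merged edge; edges leaving v now leave u.
                edges = {(u if x == v else x, y)
                         for (x, y) in edges if (x, y) != (u, v)}
                merged = True
                break
    return contigs, edges


def progressive_cutoff(contigs, edges, max_cutoff):
    """Raise the cutoff from 1 to max_cutoff, pruning and merging each round."""
    contigs, edges = dict(contigs), set(edges)
    contigs, edges = merge_linear_paths(contigs, edges)
    for cutoff in range(1, max_cutoff + 1):
        low = {name for name, c in contigs.items() if c.coverage < cutoff}
        for name in low:
            del contigs[name]
        edges = {(x, y) for (x, y) in edges if x not in low and y not in low}
        contigs, edges = merge_linear_paths(contigs, edges)
    return contigs, edges


# Made-up example: A is well covered, B is a poorly amplified true
# continuation of A, and E is a short erroneous branch.
contigs = {"A": Contig(1000, 50.0), "B": Contig(100, 3.0), "E": Contig(30, 1.0)}
edges = {("A", "B"), ("A", "E")}

result, _ = progressive_cutoff(contigs, edges, max_cutoff=4)
fixed = {name for name, c in contigs.items() if c.coverage >= 4}
```

In this made-up example a single fixed cutoff of 4 keeps only `A` and discards `B`, whereas the progressive scheme removes `E` first, merges `A` with `B`, and keeps the merged 1,100-base contig because its recomputed coverage is well above every later threshold.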

Figure 1: Assembling single-cell reads using Velvet-SC.
Figure 2: Comparison of contigs generated by Velvet versus EULER+Velvet-SC for single-cell E. coli lane 1.
Figure 3: A 16S maximum likelihood tree of Deltaproteobacterial 16S sequences including SAR324_MDA (red).

Acknowledgements

This work was partially supported by grants to R.S.L. from the National Human Genome Research Institute (NIH-2 R01 HG003647) and the Alfred P. Sloan Foundation (Sloan Foundation-2007-10-19), and by a grant to P.A.P. and G.T. from the US National Institutes of Health (NIH grant 3P41RR024851-02S1). We thank M. Kim (J. Craig Venter Institute) for bioinformatics support.

Author information

Contributions

All authors analyzed data. H.C. and G.T. wrote software. M.N., J.L.Y.-G., M.-J.L. and L.J.F. performed wet lab experiments. Illumina sequencing was performed at Illumina Cambridge Ltd. O.S.-T. analyzed sequencing data at Illumina. H.C., J.L.Y.-G., G.T., C.L.D., M.-J.L., L.J.F., N.A.G., P.A.P. and R.S.L. wrote the manuscript. H.C., G.T., M.-J.L., C.L.D., J.H.B., D.B.R. and N.A.G. created figures and tables. R.S.L. and M.-J.L. supervised the JCVI group. P.A.P. and G.T. supervised the UCSD group. N.A.G. and D.J.E. supervised the Illumina group. G.P.S. initiated the Illumina-JCVI collaboration.

Corresponding author

Correspondence to Roger S Lasken.

Ethics declarations

Competing interests

L.J.F., N.A.G., O.S.-T., G.P.S. and D.J.E. are employees of Illumina, the commercial source of Illumina sequencing, which is evaluated in this manuscript.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 1–5, Supplementary Methods, Supplementary Data 3 and Supplementary Figures 1–13 (PDF 2029 kb)

Supplementary Data 1

Velvet-SC source code (TGZ 4047 kb)

Supplementary Data 2

EULER-SR Error correction source code (TGZ 129 kb)

Cite this article

Chitsaz, H., Yee-Greenbaum, J., Tesler, G. et al. Efficient de novo assembly of single-cell bacterial genomes from short-read data sets. Nat Biotechnol 29, 915–921 (2011). https://doi.org/10.1038/nbt.1966

  • DOI: https://doi.org/10.1038/nbt.1966
