New Results

Accurate viral genome reconstruction and host assignment with proximity-ligation sequencing

Gherman Uritskiy, Maximillian Press, Christine Sun, Guillermo Domínguez Huerta, Ahmed A. Zayed, Andrew Wiser, Jonas Grove, Benjamin Auch, Stephen M. Eacker, Shawn Sullivan, Derek M. Bickhart, Timothy P. L. Smith, Matthew B. Sullivan, Ivan Liachko
doi: https://doi.org/10.1101/2021.06.14.448389
Gherman Uritskiy
¹Phase Genomics, Seattle, WA 98109, USA
Maximillian Press
¹Phase Genomics, Seattle, WA 98109, USA
²Inscripta, Boulder, CO 80301, USA
Christine Sun
³Department of Microbiology, Center of Microbiome Science, and EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
Guillermo Domínguez Huerta
³Department of Microbiology, Center of Microbiome Science, and EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
Ahmed A. Zayed
³Department of Microbiology, Center of Microbiome Science, and EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
Andrew Wiser
¹Phase Genomics, Seattle, WA 98109, USA
Jonas Grove
¹Phase Genomics, Seattle, WA 98109, USA
Benjamin Auch
¹Phase Genomics, Seattle, WA 98109, USA
Stephen M. Eacker
¹Phase Genomics, Seattle, WA 98109, USA
Shawn Sullivan
¹Phase Genomics, Seattle, WA 98109, USA
Derek M. Bickhart
⁴USDA Dairy Forage Research Center, Madison, WI 53593, USA
Timothy P. L. Smith
⁵USDA-ARS U.S. Meat Animal Research Center, Clay Center, NE 68933, USA
Matthew B. Sullivan
³Department of Microbiology, Center of Microbiome Science, and EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
⁶Department of Civil, Environmental and Geodetic Engineering, Ohio State University, Columbus, OH 43210, USA
Ivan Liachko
¹Phase Genomics, Seattle, WA 98109, USA
  • For correspondence: ivan@phasegenomics.com

Abstract

Viruses play crucial roles in the ecology of microbial communities, yet they remain relatively understudied in their native environments. Despite many advancements in high-throughput whole-genome sequencing (WGS), sequence assembly, and annotation of viruses, the reconstruction of full-length viral genomes directly from metagenomic sequencing is possible only for the most abundant phages and requires long-read sequencing technologies. Additionally, the prediction of their cellular hosts remains difficult from conventional metagenomic sequencing alone. To address these gaps in the field and to accelerate the study of viruses directly in their native microbiomes, we developed an end-to-end bioinformatics platform for viral genome reconstruction and host attribution from metagenomic data using proximity-ligation sequencing (i.e., Hi-C). We demonstrate the capabilities of the platform by recovering and characterizing the metavirome of a variety of metagenomes, including a fecal microbiome that has also been sequenced with accurate long reads, allowing for the assessment and benchmarking of the new methods. The platform can accurately extract numerous near-complete viral genomes even from highly fragmented short-read assemblies and can reliably predict their cellular hosts with minimal false positives. To our knowledge, this is the first software for performing these tasks. Being significantly cheaper than long-read sequencing of comparable depth, the incorporation of proximity-ligation sequencing in microbiome research shows promise to greatly accelerate future advancements in the field.

Competing Interest Statement

GU, MP, SE, AW, JG, BA, SS, and IL are past or present employees of Phase Genomics. MP is an employee of Inscripta. All other authors have no competing interests.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted June 14, 2021.