New Results

PanGraph: scalable bacterial pan-genome graph construction

Nicholas Noll, Marco Molari, Richard A. Neher
doi: https://doi.org/10.1101/2022.02.24.481757
Nicholas Noll
1Kavli Institute for Theoretical Physics, University of California, Santa Barbara
Marco Molari
2Swiss Institute of Bioinformatics, Basel, Switzerland
3Biozentrum, University of Basel, Basel, Switzerland
Richard A. Neher
2Swiss Institute of Bioinformatics, Basel, Switzerland
3Biozentrum, University of Basel, Basel, Switzerland
  • For correspondence: richard.neher@unibas.ch

Abstract

The genomic diversity of microbes is commonly parameterized as population genetic polymorphisms relative to a reference genome of a well-characterized, but arbitrary, isolate. Reference genomes contain only a fraction of the microbial pangenome, the set of genes observed across all isolates of a given species, and are thus blind to both the dynamics of the accessory genome and variation in gene order and copy number. With the widespread use of long-read sequencing, the number of high-quality, complete genome assemblies has increased dramatically. Traditional computational approaches to whole-genome analysis either scale poorly or treat genomes as dissociated bags of genes, and thus are not suited for this new era. Here, we present PanGraph, a Julia-based library and command line interface for aligning whole genomes into a graph, wherein each genome is represented as an undirected path along vertices, which in turn encapsulate homologous multiple sequence alignments. The resultant data structure succinctly summarizes population-level nucleotide and structural polymorphisms and can be exported into several common formats for either downstream analysis or immediate visualization.
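
To make the data structure described in the abstract concrete, the following is a minimal Python sketch of a graph in which each genome is a path along vertices that hold aligned homologous sequence. It is not PanGraph's API or file format: the names `Block`, `Path`, `reconstruct`, and `shared_blocks` are hypothetical, the alignment in each vertex is reduced to a consensus plus per-genome substitutions, and recording a strand for each step of a path is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical vertex: one homologous segment shared by some genomes."""
    block_id: str
    consensus: str
    # Per-genome substitutions against the consensus: genome name -> {position: base}.
    variants: dict = field(default_factory=dict)


@dataclass
class Path:
    """Hypothetical genome: an ordered walk over blocks, each step with a strand."""
    name: str
    steps: list  # list of (block_id, strand), strand is "+" or "-"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def reconstruct(path, blocks):
    """Rebuild a genome sequence by walking its path through the blocks."""
    parts = []
    for block_id, strand in path.steps:
        block = blocks[block_id]
        bases = list(block.consensus)
        # Apply this genome's substitutions to the block consensus.
        for pos, base in block.variants.get(path.name, {}).items():
            bases[pos] = base
        seq = "".join(bases)
        parts.append(seq if strand == "+" else reverse_complement(seq))
    return "".join(parts)


def shared_blocks(paths):
    """Return the sorted ids of blocks that occur in every path."""
    if not paths:
        return []
    return sorted(set.intersection(*[{b for b, _ in p.steps} for p in paths]))


# Toy example (invented data): g2 lacks block B, carries one substitution
# in block A, and traverses block C on the reverse strand.
blocks = {
    "A": Block("A", "ACGT", {"g2": {1: "T"}}),
    "B": Block("B", "GGA"),
    "C": Block("C", "TTC"),
}
paths = [
    Path("g1", [("A", "+"), ("B", "+"), ("C", "+")]),
    Path("g2", [("A", "+"), ("C", "-")]),
]

g1_seq = reconstruct(paths[0], blocks)
g2_seq = reconstruct(paths[1], blocks)
core = shared_blocks(paths)
```

In this toy example the presence or absence of block B and the inverted traversal of block C stand in for structural polymorphism, while the substitution stored in block A stands in for nucleotide polymorphism; both are read off the same structure.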

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://neherlab.github.io/pangraph/

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted February 24, 2022.

Subject Area

  • Bioinformatics