bioRxiv
New Results

SweetOrigins: Extracting Evolutionary Information from Glycans

Daniel Bojar, Rani K. Powers, Diogo M. Camacho, James J. Collins
doi: https://doi.org/10.1101/2020.04.08.031948
Daniel Bojar (1, 2)
Rani K. Powers (1, 2)
Diogo M. Camacho (1)
James J. Collins (1, 2, 3)

1 Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA
2 Department of Biological Engineering and Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
3 Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA

For correspondence: jimjc@mit.edu, diogo.camacho@wyss.harvard.edu

Abstract

Glycans, the most diverse class of biopolymers and crucial for many biological processes, are shaped by evolutionary pressures stemming in particular from host-pathogen interactions. While this makes glycans essential for understanding and targeting host-pathogen interactions, their considerable diversity and a lack of methods have hitherto stymied progress in leveraging their predictive potential. Here, we use a curated dataset of 12,674 glycans from 1,726 species to develop and apply machine learning methods that extract evolutionary information from glycans. Our deep learning-based language model, SweetOrigins, provides evolution-informed glycan representations that we use to discover and investigate motifs used for molecular mimicry-mediated immune evasion by commensals and pathogens. Novel glycan alignment methods enable us to identify and contextualize virulence-determining motifs in the capsular polysaccharides of Staphylococcus aureus and Acinetobacter baumannii. Further, we show that glycan-based phylogenetic trees contain most of the information present in traditional 16S rRNA-based phylogenies and improve the differentiation of genetically closely related but phenotypically divergent species, such as Bacillus cereus and Bacillus anthracis. Leveraging the evolutionary information inherent in glycans with machine learning methodology is poised to provide further, critically needed insights into host-pathogen interactions, sequence-to-function relationships, and the major influence of glycans on phenotypic plasticity.

Footnotes

  • https://github.com/midas-wyss/sweetorigins

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted April 09, 2020.
SweetOrigins: Extracting Evolutionary Information from Glycans
Daniel Bojar, Rani K. Powers, Diogo M. Camacho, James J. Collins
bioRxiv 2020.04.08.031948; doi: https://doi.org/10.1101/2020.04.08.031948