bioRxiv
New Results

OrthoSNAP: a tree splitting and pruning algorithm for retrieving single-copy orthologs from gene family trees

Jacob L. Steenwyk, Dayna C. Goltz, Thomas J. Buida III, Yuanning Li, Xing-Xing Shen, Antonis Rokas
doi: https://doi.org/10.1101/2021.10.30.466607
Author affiliations

  • Jacob L. Steenwyk (1)
  • Dayna C. Goltz (2)
  • Thomas J. Buida III (3)
  • Yuanning Li (1)
  • Xing-Xing Shen (4)
  • Antonis Rokas (1)

1 Vanderbilt University, Department of Biological Sciences, VU Station B #35-1634, Nashville, TN 37235, United States of America
2 2312 Elliston Place #510, Nashville, TN 37203, United States of America
3 9 City Place #312, Nashville, TN 37209, United States of America
4 Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China

For correspondence: jacob.steenwyk@vanderbilt.edu, antonis.rokas@vanderbilt.edu

Abstract

Molecular evolution studies, such as phylogenomic studies and genome-wide surveys of selection, often rely on gene families of single-copy orthologs (SC-OGs). Large gene families with multiple homologs in one or more species (a pattern observed in several important gene families, such as transporters and transcription factors) are often ignored because identifying and retrieving the SC-OGs nested within them is challenging. To address this issue and increase the number of markers used in molecular evolution studies, we developed OrthoSNAP, a software tool that uses a phylogenetic framework to simultaneously split gene families into SC-OGs and prune species-specific inparalogs. We refer to SC-OGs identified by OrthoSNAP as SNAP-OGs because they are identified using a splitting and pruning procedure analogous to snapping branches on a tree. From 415,129 orthologous groups of genes inferred across seven eukaryotic phylogenomic datasets, we identified 9,821 SC-OGs; applying OrthoSNAP to the remaining 405,308 orthologous groups of genes, we identified an additional 10,704 SNAP-OGs. Comparison of SNAP-OGs and SC-OGs revealed that their phylogenetic information content was similar, even in complex datasets that contain a whole genome duplication, complex patterns of duplication and loss, transcriptome data where each gene typically has multiple transcripts, and contentious branches in the tree of life. OrthoSNAP is useful for increasing the number of markers used in molecular evolution data matrices, a critical step for robustly inferring and exploring the tree of life.
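
To make the idea of splitting and pruning concrete, the following is a minimal sketch on a toy gene family tree. It is not the OrthoSNAP implementation. The nested-tuple tree encoding, the `species|gene` leaf labels, the rule of keeping the longest sequence among species-specific inparalogs, and the `min_taxa` threshold are all assumptions made for illustration.

```python
# Toy sketch of "prune inparalogs, then split into single-copy subtrees".
# Assumptions (not from the abstract): trees are rooted nested tuples,
# leaves are "species|gene" strings, and the longest sequence is kept
# as the representative of a species-specific inparalog clade.

def species_of(leaf):
    return leaf.split("|")[0]

def leaves(node):
    """Return all leaf labels under a node, left to right."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node:
        out.extend(leaves(child))
    return out

def prune_inparalogs(node, seq_len):
    """Collapse any clade whose leaves all come from one species."""
    if isinstance(node, str):
        return node
    tips = leaves(node)
    if len({species_of(t) for t in tips}) == 1:
        return max(tips, key=lambda t: seq_len[t])  # hypothetical rule
    return tuple(prune_inparalogs(child, seq_len) for child in node)

def is_single_copy(node):
    sp = [species_of(t) for t in leaves(node)]
    return len(sp) == len(set(sp))

def split_single_copy(node, min_taxa):
    """Return maximal single-copy subtrees with at least min_taxa leaves."""
    if is_single_copy(node):
        return [node] if len(leaves(node)) >= min_taxa else []
    out = []
    for child in node:
        out.extend(split_single_copy(child, min_taxa))
    return out

# A family with one ancient duplication and one recent duplication in species A.
tree = (
    ((("A|a1", "A|a2"), "B|b1"), "C|c1"),
    (("A|a3", "B|b2"), "C|c2"),
)
seq_len = {"A|a1": 300, "A|a2": 320, "B|b1": 310, "C|c1": 305,
           "A|a3": 290, "B|b2": 295, "C|c2": 300}

pruned = prune_inparalogs(tree, seq_len)
groups = split_single_copy(pruned, min_taxa=3)
for g in groups:
    print(sorted(leaves(g)))
```

In this toy case the recent duplicates `A|a1` and `A|a2` are collapsed to one representative, and the ancient duplication yields two single-copy subtrees, each containing species A, B, and C.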

Competing Interest Statement

Antonis Rokas is a scientific consultant for LifeMine Therapeutics, Inc. J.L.S. is a scientific consultant for Latch AI Inc.

Footnotes

  • The manuscript has been updated in response to comments by three reviewers. We have modified our algorithm (by adding additional features) and now report the results of analyses on additional phylogenomic datasets.

  • https://github.com/JLSteenwyk/orthosnap

  • https://doi.org/10.6084/m9.figshare.16875904

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted July 27, 2022.

Subject Area

  • Evolutionary Biology