bioRxiv
New Results

Beyond sequence similarity: cross-phyla protein annotation by structural prediction and alignment

Fabian Ruperti, Nikolaos Papadopoulos, Jacob Musser, Milot Mirdita, Martin Steinegger, Detlev Arendt
doi: https://doi.org/10.1101/2022.07.05.498892
Fabian Ruperti
1Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
2Collaboration for joint Ph.D. degree between EMBL and Heidelberg University, Faculty of Biosciences, 69117 Heidelberg, Germany
Nikolaos Papadopoulos
1Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
3Department of Evolutionary Biology, University of Vienna, 1030 Vienna, Austria
Jacob Musser
1Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
Milot Mirdita
4School of Biological Sciences, Seoul National University, Seoul, South Korea
Martin Steinegger
4School of Biological Sciences, Seoul National University, Seoul, South Korea
Detlev Arendt
1Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
5Centre for Organismal Studies (COS), University of Heidelberg, 69120 Heidelberg, Germany
  • For correspondence: arendt@embl.de

Abstract

Background Annotating protein function is a major goal in molecular biology, yet experimentally determined knowledge is often limited to a few model organisms. In non-model species, sequence-based prediction of gene orthology can be used to infer function; however, this approach loses predictive power over longer evolutionary distances. Here we propose a pipeline for the functional annotation of proteins using structural similarity, exploiting the fact that protein structures are directly linked to function and can be more conserved than protein sequences.

Results We propose a pipeline of openly available tools for the functional annotation of proteins via structural similarity (MorF: MorphologFinder) and use it to annotate the complete proteome of a sponge. Sponges are highly relevant for inferring the early history of animals, yet their proteomes remain sparsely annotated. MorF accurately predicts the functions of proteins with known homology in >90% of cases, and annotates an additional 50% of the proteome beyond standard sequence-based methods. Using MorF, we uncover new functions for sponge cell types, including extensive FGF, TGF and Ephrin signalling in sponge epithelia, and redox metabolism and control in myopeptidocytes. Notably, we also annotate genes specific to the enigmatic sponge mesocytes and propose that these cells function to digest cell walls.

Conclusions Our work demonstrates that structural similarity is a powerful approach that complements and extends sequence similarity searches to identify homologous proteins over long evolutionary distances. We anticipate that it will boost discovery in numerous -omics datasets, especially for non-model organisms.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • ☯ These authors contributed equally to this work.

  • Added "morpholog" concept to denote structurally similar proteins; renamed pipeline to MorF; demonstrated that MorF is capable of detecting remote homologs; demonstrated that morphologs often retain function even when homology is not detectable via sequence; evaluated MorF consistency in function prediction; added sequence profile comparison; weakened HGT claim; added outgroup comparison (choanoflagellates).

  • https://git.embl.de/grp-arendt/MorF/

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted January 30, 2023.

Subject Area

  • Evolutionary Biology