bioRxiv
New Results

Comparative analysis of machine learning algorithms on the microbial strain-specific AMP prediction

Boris Vishnepolsky, Maya Grigolava, Grigol Managadze, Andrei Gabrielian, Alex Rosenthal, Darrell E. Hurt, Michael Tartakovsky, Malak Pirtskhalava
doi: https://doi.org/10.1101/2022.01.28.478081
1 Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia (Boris Vishnepolsky, Maya Grigolava, Grigol Managadze, Malak Pirtskhalava)
2 Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA (Andrei Gabrielian, Alex Rosenthal, Darrell E. Hurt, Michael Tartakovsky)
For correspondence: b.vishnepolsky@lifescience.org.ge, m.pirtskhalava@lifescience.org.ge

ABSTRACT

The evolution of drug-resistant pathogenic microbial species is a major global health concern. Naturally occurring antimicrobial peptides (AMPs) are considered promising candidates for addressing antibiotic resistance. A variety of computational methods have been developed to accurately predict AMPs. Most of these methods are not microbial strain-specific (MSS): they can predict whether a given peptide is active against some microbe, but cannot accurately predict whether that peptide would be active against a particular microbial strain. Because data on most microbial strains are insufficient, only a few MSS predictive models have been developed so far. To overcome this problem, we developed a novel approach that improves MSS predictive models (MSSPM) by combining properties computed from AMP sequences with genome characteristics computed for the target microbial strains. The new models can predict AMP activity against microbial strains for which no peptide test data are available. We tested various types of feature engineering and different machine learning (ML) algorithms to compare the predictive abilities of the resulting models. Among the ML algorithms, Random Forest and AdaBoost performed best. Using genome characteristics as additional features significantly increased the performance of all models, on average by 7%, relative to models relying on AMP sequence-based properties only. Our novel MSS AMP predictor is freely accessible as part of the DBAASP database resource at https://dbaasp.org/tools?page=genome-prediction
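The idea of pairing peptide-derived properties with genome-derived characteristics can be illustrated with a minimal sketch. The abstract does not list the actual descriptors, so the features below (peptide length, net charge, hydrophobic fraction, genome GC content) and all function names are hypothetical choices made only for illustration; they are not the authors' feature set.

```python
from typing import List

# Hypothetical descriptor definitions; the abstract does not name the real ones.
HYDROPHOBIC = set("AILMFVWC")
POSITIVE = set("KR")
NEGATIVE = set("DE")


def peptide_features(seq: str) -> List[float]:
    """Compute simple sequence-based properties for one peptide."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty peptide sequence")
    # Net charge approximated as (K + R) minus (D + E) residue counts.
    net_charge = sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)
    hydrophobic_fraction = sum(aa in HYDROPHOBIC for aa in seq) / n
    return [float(n), float(net_charge), hydrophobic_fraction]


def genome_features(genome: str) -> List[float]:
    """Compute a simple genome-based characteristic for one target strain."""
    genome = genome.upper()
    if not genome:
        raise ValueError("empty genome sequence")
    gc_content = sum(base in "GC" for base in genome) / len(genome)
    return [gc_content]


def combined_features(peptide: str, genome: str) -> List[float]:
    """Concatenate peptide and genome descriptors into one model input row."""
    return peptide_features(peptide) + genome_features(genome)


# Toy inputs: a made-up peptide and a made-up genome fragment.
row = combined_features("KWKLFKKI", "ATGCGCGCTA")
```

Because the strain enters the model only through its genome descriptors, a row of this kind can in principle be built for a strain with no peptide assay data, which is the property the abstract highlights. Rows like `row` would then be passed to a classifier such as Random Forest or AdaBoost.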

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted January 28, 2022.
bioRxiv 2022.01.28.478081; doi: https://doi.org/10.1101/2022.01.28.478081
Subject Area

  • Bioinformatics