bioRxiv
New Results

DGINN, an automated and highly-flexible pipeline for the Detection of Genetic INNovations on protein-coding genes

Lea Picard, Quentin Ganivet, Omran Allatif, Andrea Cimarelli, Laurent Guéguen, Lucie Etienne
doi: https://doi.org/10.1101/2020.02.25.964155
Lea Picard
1CIRI – Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, Ecole Normale Supérieure de Lyon, Univ Lyon, F-69007, Lyon, France
2LBBE – Laboratoire de Biologie et Biometrie Evolutive, CNRS UMR 5558, Université Claude Bernard Lyon 1, Villeurbanne, France
Quentin Ganivet
2LBBE – Laboratoire de Biologie et Biometrie Evolutive, CNRS UMR 5558, Université Claude Bernard Lyon 1, Villeurbanne, France
Omran Allatif
1CIRI – Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, Ecole Normale Supérieure de Lyon, Univ Lyon, F-69007, Lyon, France
Andrea Cimarelli
1CIRI – Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, Ecole Normale Supérieure de Lyon, Univ Lyon, F-69007, Lyon, France
Laurent Guéguen
2LBBE – Laboratoire de Biologie et Biometrie Evolutive, CNRS UMR 5558, Université Claude Bernard Lyon 1, Villeurbanne, France
3Swedish Collegium for Advanced Study, Uppsala, Sweden
  • For correspondence: lucie.etienne@ens-lyon.fr laurent.gueguen@univ-lyon1.fr
Lucie Etienne
1CIRI – Centre International de Recherche en Infectiologie, Inserm U1111, Université Claude Bernard Lyon 1, CNRS UMR5308, Ecole Normale Supérieure de Lyon, Univ Lyon, F-69007, Lyon, France
  • For correspondence: lucie.etienne@ens-lyon.fr laurent.gueguen@univ-lyon1.fr

Abstract

Adaptive evolution has shaped major biological processes. Finding the protein-coding genes and the sites that have been subjected to adaptation during evolutionary time is a major endeavor. However, very few methods fully automate the identification of positively selected genes, and widespread sources of genetic innovation such as gene duplication and recombination are absent from most pipelines. Here, we developed DGINN, a highly flexible and public pipeline to Detect Genetic INNovations and adaptive evolution in protein-coding genes. Starting from a gene’s sequence, DGINN automates all steps of the evolutionary analyses necessary to detect these innovations, including the search for homologues in databases, assignment of orthology groups, identification of duplication and recombination events, and detection of positive selection using five different methods, which increases precision and allows genes to be ranked when a large panel is analyzed. DGINN was validated on nineteen genes with previously characterized evolutionary histories in primates, including some engaged in host-pathogen arms races. The results obtained with DGINN confirm and also expand results from the literature, establishing DGINN as an efficient tool to automatically detect genetic innovations and adaptive evolution in diverse datasets, from the user’s gene of interest to a large gene list in any species range.
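
The idea of ranking genes by agreement between several positive-selection methods can be illustrated with a minimal sketch. This is not DGINN's code or interface: the function name `rank_genes`, the gene labels, the method labels `m1` to `m5`, and the rule of counting supporting methods are all assumptions made for illustration, since the abstract does not name the five methods or specify how the ranking is computed.

```python
from typing import Dict, List, Tuple


def rank_genes(results: Dict[str, Dict[str, bool]]) -> List[Tuple[str, int]]:
    """Rank genes by how many methods report positive selection.

    `results` maps a gene name to a dict of {method label: call}, where a
    call is True if that method reports positive selection for the gene.
    This counting rule is a hypothetical illustration, not DGINN's scoring.
    """
    # Count the methods that support positive selection for each gene.
    support = {gene: sum(calls.values()) for gene, calls in results.items()}
    # Sort by descending support, then by gene name for a deterministic order.
    return sorted(support.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Made-up calls for hypothetical genes; these are not results from the preprint.
    example = {
        "geneA": {"m1": True, "m2": True, "m3": True, "m4": False, "m5": True},
        "geneB": {"m1": False, "m2": False, "m3": True, "m4": False, "m5": False},
        "geneC": {"m1": True, "m2": True, "m3": True, "m4": True, "m5": True},
    }
    for gene, n_methods in rank_genes(example):
        print(f"{gene}\t{n_methods}")
```

Requiring agreement between methods trades sensitivity for precision: a gene supported by only one method ranks low, while a gene supported by all of them ranks first.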

Footnotes

  • https://github.com/leapicard/DGINN

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 26, 2020.

Subject Area

  • Evolutionary Biology