bioRxiv
New Results

BRAKER2: Automatic Eukaryotic Genome Annotation with GeneMark-EP+ and AUGUSTUS Supported by a Protein Database

Tomáš Brůna, Katharina J. Hoff, Alexandre Lomsadze, Mario Stanke, Mark Borodovsky
doi: https://doi.org/10.1101/2020.08.10.245134
Tomáš Brůna
1 School of Biological Science, Georgia Tech, Atlanta, GA 30332, USA
Katharina J. Hoff
2 Institute of Mathematics and Computer Science, University of Greifswald, 17489 Greifswald, Germany
3 Center for Functional Genomics of Microbes, University of Greifswald, 17489 Greifswald, Germany
Alexandre Lomsadze
4 Wallace H Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, GA 30332, USA
Mario Stanke
2 Institute of Mathematics and Computer Science, University of Greifswald, 17489 Greifswald, Germany
3 Center for Functional Genomics of Microbes, University of Greifswald, 17489 Greifswald, Germany
Mark Borodovsky
4 Wallace H Coulter Department of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, GA 30332, USA
5 School of Computational Science and Engineering, Georgia Tech, Atlanta, GA 30332, USA
  • For correspondence: borodovsky@gatech.edu

Abstract

Full automation of gene prediction has become an important bioinformatics task since the advent of next-generation sequencing. The eukaryotic genome annotation pipeline BRAKER1 combined self-training GeneMark-ET with AUGUSTUS to predict gene coordinates with the support of transcriptomic data. Here, we introduce BRAKER2, a pipeline in which GeneMark-EP+ and AUGUSTUS are supported externally by cross-species protein sequences aligned to the genome. One of the challenges addressed in developing the new pipeline was the generation of reliable hints to the locations of protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. Under equal conditions, the gene prediction accuracy of BRAKER2 was shown to be higher than that of MAKER2, another genome annotation pipeline. Moreover, BRAKER2 could achieve better gene prediction accuracy than BRAKER1 supported by a large volume of transcript data, provided that the evolutionary distances to the reference species in the protein database were rather small. Overall, our tests demonstrated that the fully automatic BRAKER2 is a fast and accurate method for structural annotation of novel eukaryotic genomes.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • * joint first authors

  • ^ joint last authors

  • https://github.com/Gaius-Augustus/BRAKER

  • https://github.com/gatech-genemark/BRAKER2-exp

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted August 11, 2020.

Subject Area

  • Bioinformatics