DeepRibo: a neural network for precise gene annotation of prokaryotes by combining ribosome profiling signal and binding site patterns

Nucleic Acids Res. 2019 Apr 8;47(6):e36. doi: 10.1093/nar/gkz061.

Abstract

Annotation of gene expression in prokaryotes is frequently corrected because of small variations in the annotated gene regions observed between different (sub)species. It has become apparent that traditional sequence alignment algorithms, used for the curation of genomes, cannot capture the full complexity of the genomic landscape. We present DeepRibo, a novel neural network that uses features extracted from ribosome profiling data and binding site sequence patterns and proves to be a precise tool for the delineation and annotation of expressed genes in prokaryotes. The neural network combines recurrent memory cells and convolutional layers, integrating the information gained from both the high-throughput ribosome profiling data and the sequence of the ribosome binding and translation initiation region into one model. DeepRibo is designed as a single model, trained on a variety of ribosome profiling experiments, that identifies open reading frames in prokaryotes without a priori knowledge of the translational landscape. Extensive validation, using the model trained on various sets of data, sequence similarity across multiple species, and proteins verified by mass spectrometry and Edman degradation, highlights the effectiveness of DeepRibo.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Binding Sites
  • Computational Biology / methods
  • Datasets as Topic
  • High-Throughput Screening Assays / methods
  • Molecular Sequence Annotation / methods*
  • Neural Networks, Computer
  • Open Reading Frames
  • Prokaryotic Cells / chemistry
  • Prokaryotic Cells / metabolism*
  • Protein Biosynthesis / physiology*
  • Protein Processing, Post-Translational
  • Ribosomes / metabolism*
  • Sequence Alignment / methods
  • Signal Transduction