RepeatsDB 2.0: improved annotation, classification, search and visualization of repeat protein structures

Nucleic Acids Res. 2017 Jan 4;45(D1):D308-D312. doi: 10.1093/nar/gkw1136. Epub 2016 Nov 29.

Abstract

RepeatsDB 2.0 (URL: http://repeatsdb.bio.unipd.it/) is an update of the database of annotated tandem repeat protein structures. Repeat proteins are a widespread class of non-globular proteins carrying heterogeneous functions involved in several diseases. Here we provide a new version of RepeatsDB with an improved classification schema including high-quality annotations for ∼5400 protein structures. RepeatsDB 2.0 features information on start and end positions for the repeat regions and units for all entries. The extensive growth of repeat unit characterization was made possible by applying the novel ReUPred annotation method over the entire Protein Data Bank, with data quality guaranteed by an extensive manual validation for >60% of the entries. The updated web interface includes a new search engine for complex queries and a fully re-designed entry page for a better overview of structural data. It is now possible to compare unit positions, together with secondary structure, fold information and Pfam domains. Moreover, a new classification level has been introduced on top of the existing scheme as an independent layer for sequence similarity relationships at 40%, 60% and 90% identity.
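
As an illustration of the annotation model described above (regions and units with start and end positions, plus an independent similarity layer at 40%, 60% and 90% identity), the following is a minimal sketch in Python. The names `RepeatRegion`, `percent_identity` and `levels_shared` are hypothetical and are not part of RepeatsDB. The abstract does not state how RepeatsDB computes identity or assigns entries to similarity levels, so the identity measure here (matches over gap-free aligned columns) is an assumption made only for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Identity thresholds named in the abstract (percent).
IDENTITY_LEVELS = (40, 60, 90)


@dataclass
class RepeatRegion:
    """Hypothetical record: one repeat region and its units.

    Positions are residue indices, inclusive on both ends.
    """
    start: int
    end: int
    units: List[Tuple[int, int]] = field(default_factory=list)

    def validate(self) -> bool:
        """Return True if units are ordered, non-overlapping and inside the region."""
        prev_end = self.start - 1
        for u_start, u_end in self.units:
            if u_start <= prev_end or u_end < u_start or u_end > self.end:
                return False
            prev_end = u_end
        return True


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences ('-' marks a gap).

    Assumption: identity is counted over columns where neither sequence has a gap.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def levels_shared(identity: float) -> List[int]:
    """Return the thresholds (40, 60, 90) that a pairwise identity meets."""
    return [t for t in IDENTITY_LEVELS if identity >= t]
```

For example, a region spanning residues 10 to 100 with units `(10, 40)`, `(41, 70)` and `(71, 100)` passes `validate()`, and two aligned sequences at 75% identity meet the 40% and 60% levels but not the 90% level.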

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Databases, Protein* / statistics & numerical data
  • Humans
  • Proteins / classification
  • Repetitive Sequences, Amino Acid*
  • Software
  • Structure-Activity Relationship

Substances

  • Proteins