Protein sequence comparison and fold recognition: progress and good-practice benchmarking

Curr Opin Struct Biol. 2011 Jun;21(3):404-11. doi: 10.1016/j.sbi.2011.03.005. Epub 2011 Mar 31.

Abstract

Protein sequence comparison methods have grown increasingly sensitive during the last decade and can often identify distantly related proteins that shared a common ancestor some 3 billion years ago. Although cellular function is not conserved over such long timescales, molecular functions and structures of protein domains often are. In combination with a domain-centered approach to function and structure prediction, modern remote homology detection methods have a great and largely underexploited potential for elucidating protein functions and evolution. Advances during the last few years include nonlinear scoring functions combining various sequence features, the use of sequence context information, and powerful new software packages. Since progress depends on realistically assessing new and existing methods, and published benchmarks are often hard to compare, we propose 10 rules of good-practice benchmarking.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Amino Acid Sequence
  • Computational Biology
  • Databases, Protein
  • Proteins / chemistry*
  • Proteins / genetics*
  • Proteins / metabolism
  • Sensitivity and Specificity
  • Sequence Alignment* / standards
  • Sequence Homology, Amino Acid
  • Software

Substances

  • Proteins