Application of multiple sequence alignment profiles to improve protein secondary structure prediction

J A Cuff; G J Barton

doi:10.1002/1097-0134(20000815)40:3<502::aid-prot170>3.0.co;2-q

Proteins. 2000 Aug 15;40(3):502-11. doi: 10.1002/1097-0134(20000815)40:3<502::aid-prot170>3.0.co;2-q.

Authors

J A Cuff¹, G J Barton

Affiliation

¹ Laboratory of Molecular Biophysics, Oxford, United Kingdom.

PMID: 10861942

Abstract

Training a neural network secondary structure prediction algorithm with different types of multiple sequence alignment profiles derived from the same sequences is shown to give accuracies ranging from 70.5% to 76.4%. The best accuracy of 76.4% (standard deviation 8.4%) is 3.1% (Q(3)) and 4.4% (SOV2) better than that of the PHD algorithm run on the same set of 406 sequence-non-redundant proteins, which were not used to train either method. Residues predicted by the new method with a confidence value of 5 or greater have an average Q(3) accuracy of 84% and cover 68% of the residues. Relative solvent accessibility, based on a two-state model, is predicted at 76.2%, 79.8%, and 86.6% accuracy for 25%, 5%, and 0% accessibility, respectively. The sources of the improvements obtained from training with different representations of the same alignment data are described in detail. The new Jnet prediction method resulting from this study is available in the Jpred secondary structure prediction server and as a stand-alone computer program from http://barton.ebi.ac.uk/. Proteins 2000;40:502-511.
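The Q(3) and coverage figures quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the usual definition of Q(3) as the percentage of residues whose predicted three-state label (H, E, or C) matches the observed label, and an integer confidence value per residue. The function names `q3` and `q3_at_confidence` and the toy sequences are hypothetical, chosen for illustration only; this is not the Jnet implementation.

```python
from typing import Sequence, Tuple


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def q3_at_confidence(
    predicted: str,
    observed: str,
    confidence: Sequence[int],
    threshold: int = 5,
) -> Tuple[float, float]:
    """Return (Q3 over residues with confidence >= threshold, percent coverage)."""
    if not (len(predicted) == len(observed) == len(confidence)):
        raise ValueError("all inputs must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Keep only the residues predicted with sufficient confidence.
    kept = [
        (p, o)
        for p, o, c in zip(predicted, observed, confidence)
        if c >= threshold
    ]
    if not kept:
        return float("nan"), 0.0
    coverage = 100.0 * len(kept) / len(observed)
    accuracy = 100.0 * sum(p == o for p, o in kept) / len(kept)
    return accuracy, coverage


# Toy data (hypothetical): H = helix, E = strand, C = coil.
predicted = "HHHHCCEEEC"
observed = "HHHCCCEEEE"
confidence = [9, 8, 7, 2, 6, 9, 8, 7, 5, 1]

overall = q3(predicted, observed)
accuracy, coverage = q3_at_confidence(predicted, observed, confidence)
print(f"Q3 = {overall:.1f}%")
print(f"Q3 at confidence >= 5: {accuracy:.1f}% over {coverage:.1f}% of residues")
```

In this toy case the two mispredicted residues also carry the lowest confidence values, so restricting to confidence 5 or greater raises accuracy at the cost of coverage, which is the same trade-off the abstract reports (84% accuracy over 68% of residues).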

MeSH terms

Algorithms
Amino Acid Sequence
Databases, Factual
Molecular Sequence Data
Neural Networks, Computer
Protein Structure, Secondary*
Reproducibility of Results
Sequence Alignment / methods*
Sequence Analysis, Protein / methods*
Software
Solvents

Substances

Solvents