Inferring consensus structure from nucleic acid sequences

D K Chiu; T Kolodziejczak

doi:10.1093/bioinformatics/7.3.347

Comput Appl Biosci. 1991 Jul;7(3):347-52. doi: 10.1093/bioinformatics/7.3.347.

Authors

D K Chiu¹, T Kolodziejczak

Affiliation

¹ Department of Computing and Information Science, University of Guelph, Ontario, Canada.

PMID: 1913217

Abstract

This paper presents an unsupervised inference method for determining higher-order structure from sequence data. The method is general, but here it is applied to nucleic acid sequences to determine the secondary (2-D) and tertiary (3-D) structure of the macromolecule. The method evaluates position-position interdependence in the sequences using an information measure known as expected mutual information. The expected mutual information is calculated for each pair of positions, and the chi-square test is used to screen for statistically significant position pairs. In calculating the expected mutual information, an unbiased probability estimator is used to overcome the problem of zero observations at conserved sites. A selection criterion based on known structural constraints is then applied to the most strongly interdependent position pairs, yielding the pairs most indicative of secondary and tertiary interactions. The method has been tested on tRNA and 5S rRNA sequences with very good results.
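
The pairwise screening step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and several choices are assumptions made for illustration: the function names, the four-letter RNA alphabet, plain relative frequencies with the 0 log 0 = 0 convention in place of the paper's unbiased estimator (which the abstract does not specify), and the standard large-sample relation G = 2N * MI (MI in nats), compared against the chi-square critical value for 9 degrees of freedom at the 0.05 level. The structural selection criterion is not reproduced.

```python
import math
from collections import Counter
from itertools import combinations

ALPHABET = "ACGU"  # assumed alphabet; gaps and ambiguity codes are not handled
CHI2_CRIT_9DF_05 = 16.919  # chi-square critical value, (4-1)*(4-1) = 9 d.f., alpha = 0.05


def mutual_information(col_i, col_j):
    """Mutual information (in nats) between two aligned columns.

    Uses plain relative frequencies; zero-count cells are skipped
    (0 log 0 = 0). This stands in for the paper's estimator.
    """
    n = len(col_i)
    joint = Counter(zip(col_i, col_j))
    marg_i = Counter(col_i)
    marg_j = Counter(col_j)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = marg_i[a] / n
        p_b = marg_j[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def significant_pairs(seqs, crit=CHI2_CRIT_9DF_05):
    """Return (i, j, MI) for position pairs whose G statistic exceeds crit,
    sorted by decreasing MI. Positions are 0-based."""
    n = len(seqs)
    length = len(seqs[0])
    # Transpose the alignment so that each entry is one column.
    cols = ["".join(s[i] for s in seqs) for i in range(length)]
    hits = []
    for i, j in combinations(range(length), 2):
        mi = mutual_information(cols[i], cols[j])
        g = 2.0 * n * mi  # asymptotically chi-square under independence
        if g > crit:
            hits.append((i, j, mi))
    return sorted(hits, key=lambda t: -t[2])


if __name__ == "__main__":
    # Toy alignment: position 3 is always the Watson-Crick complement of
    # position 0, position 1 is conserved, position 2 varies independently.
    comp = {"A": "U", "C": "G", "G": "C", "U": "A"}
    toy = [a + "G" + c + comp[a] for a in ALPHABET for c in ALPHABET]
    for i, j, mi in significant_pairs(toy):
        print(i, j, round(mi, 3))
```

On the toy alignment only the covarying pair (0, 3) passes the screen; the conserved position and the independently varying position have zero mutual information with every other position.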

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Amino Acid Sequence*
Microcomputers
Probability
Protein Conformation*
RNA, Ribosomal / analysis
RNA, Transfer / analysis

Substances

RNA, Ribosomal
RNA, Transfer