A nucleotide substitution model with nearest-neighbour interactions

Gerton Lunter; Jotun Hein

doi:10.1093/bioinformatics/bth901

Bioinformatics. 2004 Aug 4;20 Suppl 1:i216-23. doi: 10.1093/bioinformatics/bth901.

Authors

Gerton Lunter¹, Jotun Hein

Affiliation

¹ Bioinformatics group, Department of Statistics, University of Oxford, Oxford, UK. lunter@stats.ox.ac.uk

PMID: 15262802
DOI: 10.1093/bioinformatics/bth901

Abstract

Motivation: It is well known that neighbouring nucleotides in DNA sequences do not mutate independently of each other. In this paper, we introduce a context-dependent substitution model and derive an algorithm to calculate the likelihood of sequences evolving under this model. We use this algorithm to estimate neighbour-dependent substitution rates, as well as rates for dinucleotide substitutions, using a Bayesian sampling procedure. The model is irreversible, which gives time a direction and allows the position of the root between a pair of sequences to be inferred without using out-groups.
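
To make the idea of neighbour dependence concrete, the following minimal Python sketch computes substitution rates at one site as a function of its flanking bases. It is not the authors' model or parameterisation: the function names `site_rates` and `total_rate`, the uniform `base_rate`, and the single multiplier `cpg_factor` are all assumptions chosen for illustration, and only the CpG to TpG/CpA context is modelled.

```python
BASES = "ACGT"

def site_rates(seq, i, base_rate=1.0, cpg_factor=17.0):
    """Return {new_base: rate} for substituting seq[i], given its neighbours.

    Hypothetical toy rates: every substitution has rate `base_rate`, except
    the two CpG-context substitutions, which are scaled by `cpg_factor`.
    """
    left = seq[i - 1] if i > 0 else None
    right = seq[i + 1] if i + 1 < len(seq) else None
    rates = {}
    for new_base in BASES:
        if new_base == seq[i]:
            continue
        rate = base_rate
        # CpG -> TpG: a C followed by G mutates to T.
        if seq[i] == "C" and right == "G" and new_base == "T":
            rate *= cpg_factor
        # CpG -> CpA: a G preceded by C mutates to A.
        if seq[i] == "G" and left == "C" and new_base == "A":
            rate *= cpg_factor
        rates[new_base] = rate
    return rates

def total_rate(seq, **kwargs):
    """Sum of all single-site substitution rates over the whole sequence."""
    return sum(
        sum(site_rates(seq, i, **kwargs).values()) for i in range(len(seq))
    )
```

Because the rate at a site depends on what its neighbours currently are, sites cannot be treated as independent when computing a likelihood, which is why a dedicated algorithm is needed. For instance, `site_rates("ACGT", 1)` gives the C a much higher rate of becoming T than `site_rates("ACAT", 1)` does.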

Results: We applied the model to aligned human-mouse non-coding data. Clear neighbour dependencies were observed, including 17- to 18-fold higher CpG to TpG/CpA rates compared with other substitutions. Root inference positioned the root halfway between the mouse and human tips, suggesting approximately clock-like behaviour of the irreversible part of the substitution process.
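
Root inference of this kind relies on the substitution process being irreversible. As a rough illustration of what irreversibility means, the sketch below tests a single-site 4x4 rate matrix with Kolmogorov's criterion: a chain is reversible only if, around every cycle of states, the product of rates in one direction equals the product in the other. This is a generic textbook check, not the paper's procedure; the function name `is_reversible` and both example matrices are hypothetical, and all off-diagonal rates are assumed to be positive.

```python
from itertools import permutations
from math import isclose, prod

def is_reversible(Q):
    """Check Kolmogorov's criterion on every simple cycle of length >= 3.

    Q is a square list of lists of rates; only off-diagonal entries are
    read, and they are assumed to be strictly positive.
    """
    n = len(Q)
    for k in range(3, n + 1):
        for cycle in permutations(range(n), k):
            forward = prod(Q[cycle[j]][cycle[(j + 1) % k]] for j in range(k))
            backward = prod(Q[cycle[(j + 1) % k]][cycle[j]] for j in range(k))
            if not isclose(forward, backward, rel_tol=1e-9):
                return False
    return True

# Hypothetical example 1: all substitutions equally likely (order A, C, G, T).
uniform = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]

# Hypothetical example 2: same, but A -> C is twice as fast as C -> A.
skewed = [row[:] for row in uniform]
skewed[0][1] = 2.0
```

Here `is_reversible(uniform)` holds while `is_reversible(skewed)` does not. Under a reversible process the likelihood of a pair of sequences does not depend on where the root is placed along the path between them, so only an irreversible process carries information about the root position.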

MeSH terms

Algorithms*
Animals
Base Pair Mismatch / genetics*
Chromosome Mapping / methods*
Computer Simulation
DNA / genetics*
DNA Mutational Analysis
Genetic Variation / genetics
Humans
Mice
Models, Genetic*
Nucleotides / genetics*
Sequence Analysis, DNA / methods*

Substances

Nucleotides
DNA