Perturbative formulation of general continuous-time Markov model of sequence evolution via insertions/deletions, Part II: Perturbation analyses
Ezawa, Kiyoshi
Graur, Dan
Landan, Giddy
2015/01/01
Background: Insertions and deletions (indels) account for more nucleotide differences between two related DNA sequences than substitutions do. It is therefore imperative to develop a stochastic evolutionary model that enables us to reliably calculate the probability of sequence evolution through indel processes. In a separate paper (Ezawa, Graur and Landan 2015a), we established the theoretical basis of our ab initio perturbative formulation of a genuine evolutionary model, specifically a continuous-time Markov model of the evolution of an entire sequence via insertions and deletions. We also showed that, under some conditions, the ab initio probability of an alignment can be factorized into the product of an overall factor and contributions from regions (or local alignments) separated by gapless columns.
Results: This paper describes how our ab initio perturbative formulation can be used concretely to approximately calculate the probabilities of all types of local pairwise alignments (PWAs) and some typical types of local multiple sequence alignments (MSAs). For each local alignment type, we calculated the fewest-indel contribution and the next-fewest-indel contribution to its probability, and we compared them under various conditions. We also derived a system of integral equations that can be numerically solved to give "exact solutions" for some common types of local PWAs, and we compared these "exact solutions" with the fewest-indel contributions. The results indicated that even the fewest-indel terms alone can quite accurately approximate the probabilities of local alignments, as long as the segments and the branches in the tree are of modest lengths. Moreover, in light of our formulation, we examined the parameter regions where other indel models can safely approximate the correct evolutionary probabilities. The analyses also suggested some modifications that these models need in order to improve the accuracy of their probability estimates.
Conclusions: At least under modest conditions, our ab initio perturbative formulation can quite accurately calculate alignment probabilities under biologically realistic indel models. It also provides a sound reference point against which other indel models can be compared. [This paper and three other papers (Ezawa, Graur and Landan 2015a,b,c) describe a series of our efforts to develop, apply, and extend the ab initio perturbative formulation of a general continuous-time Markov model of indels.]
