PT - JOURNAL ARTICLE
AU - Aarti Venkat
AU - Matthew W. Hahn
AU - Joseph W. Thornton
TI - Multinucleotide mutations cause false inferences of lineage-specific positive selection
AID - 10.1101/165969
DP - 2018 Jan 01
TA - bioRxiv
PG - 165969
4099 - http://biorxiv.org/content/early/2018/04/24/165969.short
4100 - http://biorxiv.org/content/early/2018/04/24/165969.full
AB - Phylogenetic tests of adaptive evolution, which infer positive selection from an excess of nonsynonymous changes, assume that nucleotide substitutions occur singly and independently. But recent research has shown that multiple errors at adjacent sites often occur in single events during DNA replication. These multinucleotide mutations (MNMs) are overwhelmingly likely to be nonsynonymous. We therefore evaluated whether phylogenetic tests of adaptive evolution, such as the widely used branch-site test, might misinterpret sequence patterns produced by MNMs as false support for positive selection. We explored two genome-wide datasets comprising thousands of coding alignments – one from mammals and one from flies – and found that codons with multiple differences (CMDs) account for virtually all the support for lineage-specific positive selection inferred by the branch-site test. Simulations under genome-wide, empirically derived conditions without positive selection show that realistic rates of MNMs cause a strong and systematic bias in the branch-site and related tests; the bias is sufficient to produce false positive inferences approximately as often as the branch-site test infers positive selection from the empirical data. Our analysis indicates that genes may often be inferred to be under positive selection simply because they stochastically accumulated one or a few MNMs.
Because these tests do not reliably distinguish sequence patterns produced by authentic positive selection from those caused by neutral fixation of MNMs, many published inferences of adaptive evolution using these techniques may be artifacts of model violation caused by unincorporated neutral mutational processes. We develop an alternative model that incorporates MNMs and may help reduce this bias.
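
The notion of a codon with multiple differences (CMD) used in the abstract can be illustrated with a minimal Python sketch. It assumes a simple pairwise comparison of two aligned, in-frame coding sequences (for example, an inferred ancestral sequence and its descendant); the function name, the skipping of codons that contain gaps or ambiguous bases, and the example sequences are illustrative assumptions, not the authors' pipeline.

```python
def codons_with_multiple_differences(seq_a, seq_b):
    """Return (codon_index, codon_a, codon_b, n_diffs) for every codon
    that differs at two or more nucleotide positions.

    Assumption: both sequences are aligned, in frame, and equal in length.
    Codons containing a gap ('-') or ambiguous base ('N') are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")

    cmds = []
    for start in range(0, len(seq_a), 3):
        codon_a = seq_a[start:start + 3].upper()
        codon_b = seq_b[start:start + 3].upper()
        if any(ch in "-N" for ch in codon_a + codon_b):
            continue  # skip codons that cannot be compared site by site
        n_diffs = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diffs >= 2:
            cmds.append((start // 3, codon_a, codon_b, n_diffs))
    return cmds


# Hypothetical example: the third codon (index 2) changes AAA -> AGG,
# two differences within one codon, so it counts as a CMD.
ancestral = "ATGGCTAAATTT"
derived = "ATGGCCAGGTTT"
cmd_list = codons_with_multiple_differences(ancestral, derived)
```

A codon flagged this way may reflect a single MNM or two independent single-nucleotide substitutions; the comparison alone cannot tell these apart, which is the ambiguity the abstract identifies.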