PT - JOURNAL ARTICLE
AU - Guy Ling
AU - Danielle Miller
AU - Rasmus Nielsen
AU - Adi Stern
TI - A Bayesian framework for inferring the influence of sequence context on single base modifications
AID - 10.1101/571646
DP - 2019 Jan 01
TA - bioRxiv
PG - 571646
4099 - http://biorxiv.org/content/early/2019/03/09/571646.short
4100 - http://biorxiv.org/content/early/2019/03/09/571646.full
AB - The probability of single base modifications (mutations and DNA/RNA modifications) is expected to be highly influenced by the flanking nucleotides that surround them, known as the sequence context. This phenomenon may be mainly attributed to the enzyme that modifies or mutates the genetic material, since most enzymes tend to have specific sequence contexts that dictate their activity. Thus, identification of context effects may lead to the discovery of additional editing sites or unknown enzymatic factors. Here, we develop a statistical model that allows for the detection and evaluation of the effects of different sequence contexts on mutation rates from deep population sequencing data. This task is computationally challenging, as the complexity of the model increases exponentially as the context size increases. We established our novel Bayesian method based on sparse model selection methods, with the leading assumption that the number of actual sequence contexts that directly influence mutation rates is minuscule compared to the number of possible sequence contexts. We show that our method is highly accurate on simulated data using pentanucleotide contexts, even when accounting for noisy data. We next analyze empirical population sequencing data from polioviruses and detect a significant enrichment in sequence contexts associated with deamination by the cellular deaminases ADAR 1/2.
In the current era, where next-generation sequencing data is highly abundant, our approach can be used on any population sequencing data to reveal context-dependent base alterations, and may assist in the discovery of novel mutable sites or editing sites.