ABSTRACT
Understanding mutation rate helps us build evolutionary models and make sense of genetic variation. Recent work indicates that the frequencies of specific mutation types have been elevated in Europe, and that many subtler signatures of global polymorphism variation may remain unidentified. Here, we present an analysis of the 1000 Genomes Project (phase 3) that suggests additional putative signatures of mutation rate variation across populations and examines the extent to which they are shaped by local sequence context. First, we compiled a list of the polymorphism types that varied most significantly in a cross-continental statistical test. Clustering these polymorphisms, we observed four sets of substitution types that showed similar trends in relative mutation rate across populations, and we describe the patterns of these mutational clusters among continental groups. For most of these signatures, we found that a single flanking base pair of sequence context was sufficient to determine the majority of the enrichment or depletion of a mutation type. However, local genetic context up to 2-3 base pairs away contributes additional variability and helps to interpret a previously noted enrichment of certain polymorphism types in some East Asian groups. Characterizing mutation rate variation in this way can help us construct more accurate evolutionary models and better understand the mechanisms that underlie genetic change.