PT - JOURNAL ARTICLE
AU - Thomas Smith
AU - Adam Eyre-Walker
TI - Large scale variation in the rate of de novo mutation, base composition, divergence and diversity in humans
AID - 10.1101/110452
DP - 2017 Jan 01
TA - bioRxiv
PG - 110452
4099 - http://biorxiv.org/content/early/2017/03/07/110452.short
4100 - http://biorxiv.org/content/early/2017/03/07/110452.full
AB - It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. It is now possible to directly investigate this question using >40,000 de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We show that there is variation in the mutation rate at the 100KB and 1MB scale that cannot be explained by variation at smaller scales; however, the level of this variation is modest. Different types of mutation show similar levels of variation and appear to vary in concert, and in a manner such that they are not predicted to generate variation in base composition across the genome. Regressing the rate of DNM against a range of genomic features suggests that nucleosome occupancy is the most important correlate, but that GC content, recombination rate, replication time and various histone methylation signals also correlate significantly. In total, the model explains ~75% of the explainable variance, suggesting that it will be useful for predicting large scale variation in the mutation rate. As expected, the rate of divergence between species and the level of diversity within humans are correlated with the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence were due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed.
Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered. Our results have important implications for understanding large scale variation in base composition and the use of divergence and diversity data to study variation in the mutation rate. Author summary: Using a dataset of 40,000 de novo mutations, we show that there is large-scale variation in the mutation rate at the 100KB and 1MB scale. We show that different types of mutation vary in concert and in a way that is not expected to generate variation in base composition; hence mutation bias is not responsible for the large-scale variation in base composition that is observed across human chromosomes. The variation in the mutation rate appears to depend on the density of nucleosomes, DNA replication and DNA repair, and a simple model can explain over 70% of the variation in the density of mutations. As expected, large-scale variation in the rate of divergence between species and in the level of variation within species across the genome is correlated with the rate of mutation, but the correlations are not as strong as they could be. We show that biased gene conversion is responsible for weakening the correlations. Finally, we show that the correlation between the rate of mutation in humans and the divergence between humans and other species weakens as the species become more divergent.