Abstract
In plants, mammals and insects, some genes are methylated in the CG dinucleotide context, a phenomenon called gene body methylation (gbM). It has been controversial whether this phenomenon has any functional role. Here, we took advantage of the availability of 876 leaf methylomes in Arabidopsis thaliana to characterize the population frequency of methylation at the gene level and to estimate the site-frequency spectrum of allelic states (epialleles). Using a population genetics model specifically designed for epigenetic data, we found that genes with ancestral gene body methylation are under significant selection to remain methylated. Conversely, all genes taken together were inferred to be under selection to be unmethylated. The estimated selection coefficients were small, similar to the magnitude of selection acting on codon usage. We also estimated that A. thaliana is losing gene body methylation three-fold more rapidly than gaining it, which could be due to a recent reduction in the efficacy of selection after a switch to selfing. Finally, we investigated the potential function of gene body methylation through its link with gene expression level. Across genes with polymorphic methylation states, the expression of gene body methylated alleles was consistently and significantly higher than that of unmethylated alleles. Although it is difficult to disentangle genetic from epigenetic effects, our work suggests that gbM has a small but measurable effect on fitness, perhaps due to its association with a phenotype like gene expression.