PT - JOURNAL ARTICLE
AU - Enrique J. Schwarzkopf
AU - Juan C. Motamayor
AU - Omar E. Cornejo
TI - Genetic differentiation and intrinsic genomic features explain variation in recombination hotspots among cocoa tree populations
AID - 10.1101/482299
DP - 2019 Jan 01
TA - bioRxiv
PG - 482299
4099 - http://biorxiv.org/content/early/2019/04/20/482299.short
4100 - http://biorxiv.org/content/early/2019/04/20/482299.full
AB - Our study investigates the possible drivers of recombination hotspots in Theobroma cacao using ten genetically differentiated populations. This constitutes the first time that recombination rates from more than two populations of the same species have been compared, providing a novel view of recombination at the population-divergence time-scale. For each population, a fine-scale recombination map was generated under the coalescent using a standard method based on linkage disequilibrium (LD). The maps revealed higher recombination rates in a domesticated population and a population that has undergone a recent bottleneck. We address whether the pattern of recombination rate variation along the chromosome is sensitive to the uncertainty in the per-site estimates. We find that uncertainty, as assessed from the Markov chain Monte Carlo iterations, is orders of magnitude smaller than the scale of variation of the recombination rates genome-wide. We inferred hotspots of recombination for each population and found that the genomic locations of these hotspots correlate with genetic differentiation between populations (FST). We developed novel randomization approaches to generate appropriate null models for understanding the association between hotspots of recombination and both DNA sequence motifs and genomic features. Hotspot regions contained fewer known retroelement sequences than expected, and were overrepresented near transcription start and termination sites.
Our findings indicate that recombination hotspots are evolving in a way that is consistent with genetic differentiation, but are also preferentially driven to regions of the genome that are upstream or downstream of coding regions.