Genome-wide sampling suggests island divergence accompanied by cryptic epigenetic plasticity in Canada lynx

ABSTRACT Determining the molecular signatures of adaptive differentiation is a fundamental component of evolutionary biology. A key challenge remains for identifying such signatures in wild organisms, particularly between populations of highly mobile species that undergo substantial gene flow. The Canada lynx (Lynx canadensis) is one species where mainland populations appear largely undifferentiated at traditional genetic markers, despite inhabiting diverse environments and displaying phenotypic variation. Here, we used high-throughput sequencing to investigate both neutral genetic structure and epigenetic differentiation across the distributional range of Canada lynx. Using a customized bioinformatics pipeline we scored both neutral SNPs and methylated nucleotides across the lynx genome. Newfoundland lynx were identified as the most differentiated population at neutral genetic markers, with diffusion approximations of allele frequencies indicating that divergence from the panmictic mainland occurred at the end of the last glaciation, with minimal contemporary admixture. In contrast, epigenetic structure revealed hidden levels of differentiation across the range coincident with environmental determinants including winter conditions, particularly in the peripheral Newfoundland and Alaskan populations. Several biological pathways related to morphology were differentially methylated between populations, with Newfoundland being disproportionately methylated for genes that could explain the observed island dwarfism. Our results indicate that epigenetic modifications, specifically DNA methylation, are powerful markers to investigate population differentiation and functional plasticity in wild and non-model systems. SIGNIFICANCE Populations experiencing high rates of gene flow often appear undifferentiated at neutral genetic markers, despite often extensive environmental and phenotypic variation. We examined genome-wide genetic differentiation and DNA methylation between three interconnected regions and one insular population of Canada lynx (Lynx canadensis) to determine if epigenetic modifications characterized climatic associations and functional molecular plasticity. Demographic approximations indicated divergence of Newfoundland during the last glaciation, while cryptic epigenetic structure identified putatively functional differentiation that might explain island dwarfism. Our study suggests that DNA methylation is a useful marker for differentiating wild populations, particularly when faced with functional plasticity and low genetic differentiation.


SIGNIFICANCE
Populations experiencing high rates of gene flow often appear undifferentiated at neutral genetic markers, despite often extensive environmental and phenotypic variation. We examined genomewide genetic differentiation and DNA methylation between three interconnected regions and one insular population of Canada lynx (Lynx canadensis) to determine if epigenetic modifications characterized climatic associations and functional molecular plasticity. Demographic approximations indicated divergence of Newfoundland during the last glaciation, while cryptic epigenetic structure identified putatively functional differentiation that might explain island dwarfism. Our study suggests that DNA methylation is a useful marker for differentiating wild populations, particularly when faced with functional plasticity and low genetic differentiation.
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.

INTRODUCTION
Investigations into adaptive differentiation have often relied on either quantifying phenotypic variation or contrasting genetic polymorphisms identified in an organism's DNA (1,2).
Environmental conditions can be a powerful driver of population differentiation and phenotypic plasticity, where relationships are traditionally ascertained by correlating allele frequencies to environmental and morphological variation (3). Detecting adaptive differentiation in species that experience high rates of gene flow, however, is challenging due to the homogenization of genomic regions that are neutral or under weak selection (4). An aspect of molecular variation that is undetected by standard genetic sequencing involves direct modifications to the structure of DNA. Epigenetic modifications like DNA methylation are influenced by environmental conditions, directly affect gene expression, and may even be indicative of early divergence due to local adaptation (5)(6)(7).
DNA methylation has been implicated in mediating ecologically relevant traits due to varied environmental conditions (8), primarily due to its regulatory role in transcription by modifying chromatin structure, repressing transcription factors, or recruiting protein complexes that block transcriptional machinery, especially around CpG islands (9)(10)(11). CpG islands are dense clusters of cytosine-guanine dinucleotides and frequently occur near the transcription start site of genes and have a functional relationship with gene expression (12)(13)(14). Consequently, DNA methylation, particularly around genes that impart an ecologically relevant benefit, could be used to characterize populations that exhibit low levels of neutral genetic differentiation despite extensive differences in resources and conditions. . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; Here, we assessed whether environmental variation, geographic distance, or insularity were determinants of DNA methylation structure in a free-ranging carnivore. Our study species, the Canada lynx (Lynx canadensis), is a mid-sized felid that is highly mobile and whose neutral genetic variation (i.e., microsatellites) exhibits low levels of genetic differentiation across the mainland, with divergent island populations (15,16). Although some hypotheses exist regarding the colonization of Newfoundland (15), no formal demographic analyses have been investigated using genome-wide markers. Despite a low degree of overall genetic structure, we hypothesize that two potential mechanisms might drive epigenetic divergence in Canada lynx. First, allele frequencies are correlated to climatic gradients in both population time-series and fine-scale genetic analyses (15,17,18), suggesting climate might influence patterns of methylation in lynx due to its semi-plastic nature. Second, Canada lynx show a subtle cline in body size with larger individuals in Alaska (19) to smaller individuals in insular populations including Newfoundland and Cape Breton Island (20). Body size changes in island populations appear to be consistent with the 'Island Rule' (20), where insular mammals are smaller in size compared to their mainland counterparts (21-24). If functional genes related to body size are repressed in geographically isolated populations, then epigenetic modifications might underlie the molecular plasticity that is associated with these differences (25). Based on these two mechanisms (climate and island isolation), we tested two predictions: i) climatic conditions are the driving force behind spatial epigenetic structuring (26); and ii) Canada lynx on the island of Newfoundland exhibit differential methylation compared to mainland populations that explains the observed trends in phenotypic divergence (27).

RESULTS
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; We collected 95 Canada lynx epidermal tissue samples from four populations (n = 23-24 per sampling area) across North America, including one insular population in Newfoundland ( Fig. 1; Table S1. The sampled populations have a wide geographical spread, with an average minimum distance between populations (Québec and Newfoundland) of 1,158 km and a maximum distance (Alaska and Newfoundland) of 5,520 km. Habitats around these populations present a dynamic range of environmental conditions, ranging from 32 to 432 mm of winter precipitation and a mean annual temperature range (28) of -6.3 to 4.7° C. To determine associations between these environmental conditions and genome-wide patterns of DNA methylation, we created a reduced representation bisulfite sequencing (RRBS) library (full protocol in Supplemental). Paired-end sequencing on a HiSeq2500 generated a total of 210,773,612 filtered and demultiplexed reads that were aligned to the domestic cat genome (85.0% average mapping success; Table S1 Felis catus; NCBI: felcat9.0) and variants were called using specially designed software for bisulfite converted reads (29). The cat genome was chosen due to our interests in functional annotations and the multiple revisions this genome has underwent resulting in a chromosome-scale assembly (NCBI: felcat9.0; 3GCA_000181335.4). We assessed bisulfite conversion efficiency by including non-methylated lambda phage DNA in the sequencing lane, and mapped our reads to the human (Homo sapiens) and lambda phage genomes to rule out contamination (2.7% and <0.1% success, respectively). We used a Bayesian framework to call genetic variants from bisulfite converted reads; these SNPs were used to characterize historical demography and neutral genetic structure. The distribution of methylated sites and SNPs are shown in Fig. 2; our multiplexed RRBS library with 95 individuals sampled an average of 2.6 million unique CpG positions per individual ( = 262,000; = 9,677,599; Table S2). Furthermore, we quantified the temporal effects of methylation, the ramifications of missing data, model . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; parameterization, and the effects of dataset subdivisions based on feature type with sensitivity analyses, with no implications on overall inferences ( Fig. S1 -S5).

Population demography and subtle neutral genetic differentiation
We estimated the time of divergence ( ) and the best fit historical demographic model for Newfoundland using an AIC model selection framework using the SNP dataset. Demographic scenarios (Table S3) were determined using the site-frequency spectrum (SFS) implemented in the program ∂a∂i (30). The best fit model for our joint SFS (Fig. S6) between the mainland and Newfoundland suggests a split during the last glacial maximum ( = 26.5 ± 6.5 , Table   S4) with early asymmetric admixture from the mainland into the ancestral Newfoundland population (Fig. 3). This model indicated that Newfoundland likely experienced a founder effect with no contemporary admixture with the mainland, while overall diversity between populations was still relatively low (Ө = 3.1). An additional one-dimensional SFS solely for Newfoundland corroborated our identification of a demographic bottleneck (Fig. S7). We then performed an analysis of molecular variance (AMOVA) to assess relative population differentiation, where extensive variation was seen within samples ( = 24.8) relative to overall variation between populations ( = 0.46). Relative differences among all populations ( = 0.02) was again diminutive compared to the variation determined between Newfoundland and the mainland ( = 0.09). Bayescan identified no highly differentiated ( ) loci under selection (Fig. S8), and a general pattern consistent with isolation by distance was observed ( − = 43.6; 2 = 0.31; ≤ 0.001). A principal coordinates analysis computed from a Euclidean distance matrix summarizing SNP variation detected relatively congruous populations (Fig. 4), where the first axis largely described variation between the mainland and Newfoundland ( 1 = 30.3%), . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; while the second axis detected the subtler differentiation between mainland populations ( 2 = 6.1%). We used multivariate distance-based redundancy analyses (db-RDAs) and step-wise model selection to quantify the determinants of observed variation in DNA methylation, which identified three significant variables: geographic distance, winter conditions and a binary variable representing insularity for the Newfoundland population ( − = 16.6; 2 = 0.35; all variables ≤ 0.001, Table S5). Tree cover ( = 0.77) and a randomly-generated numerical variable to assess the effect of noise ( = 0.71) added no explanatory power to the model (Table S5). Collinearity was low between all retained variables ( = 2.09 − 3.77; Table S6). We examined the explanatory power of each variable independently using partial db-RDAs (Fig. 5), which identified the most variation solely explained by geographic distance . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.

DMRs over morphological genes could explain island dwarfism
We conducted a de novo annotation of CpG islands throughout the cat genome ( = 28,127) using hidden Markov models, which contained an average GC content of 59.2% with a posterior probability of observed-to-expected GC content (CpGo/e) of 1.13. Our 705 5,000-bp windows of methylation were then divided based on overlap with a CpG island or gene body, which were used to identify differential methylation between populations using beta regressions with a Bonferroni-corrected significance threshold of 3.19 × 10 −5 . A total of 42 differentially methylated genes were identified across all populations, including 4 long non-coding RNAs.
Overrepresented functional pathways for differentially methylated genes were identified using a Panther-Gene Ontology (GO) analysis, which identified cellular component morphogenesis  (Table S7). Three specific genes identified with the ontology analysis warranted closer examination due to their known direct regulation of body mass and morphology, as identified in previous laboratory experiments. These genes (HDAC9, TMOD2, and ZEB1) were hypermethylated in Newfoundland compared to the mainland, with a difference in methylation of 39.6, 29.4, and 37.6%, respectively ( Fig. 6).

DISCUSSION
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.

Newfoundland as a glacial refuge for Canada lynx
The historical demography of insular Canada lynx has been the focus of speculation, with estimates ranging from intra-glacial colonization to introductions within the last century (15).
Our demographic analyses with SNP data suggest the former, where colonization occurred sometime during the last-glacial period; for example an ice bridge is believed to have connected the mainland to Newfoundland and portions of the island were still thought to be ice-free (31).
While our population divergence time estimates are scaled by mutation rate and generation time, all reasonable metrics support a historic colonization (> 1,000 years). Our analyses also indicated that Newfoundland underwent a demographic bottleneck after colonization, consistent with microsatellite data (16), and that genetic diversity remains low compared to mainland populations. Row et al. (15) showed a minimal barrier effect of the Rocky Mountains to Canada lynx, which is also supported with our SNP data. Despite minimal range-wide nuclear divergence in lynx, it remains possible that distinct environmental variation might be driving unique patterns of functionally important differentiation at the range margins, including Alaska and Newfoundland.

DNA methylation detects biologically relevant cryptic population structure
We have shown that despite largely undifferentiated populations across the mainland at the genetic level, epigenetic structure across the range is substantial and is suggestive of functional epigenetic plasticity in the face of genetic homogeneity. Despite the lack of genetic structure between mainland populations (Rueness et al. 2003, this study), we observed distinct patterns of DNA methylation that differentiate Alaska from mid-and east-continental populations. While the segregation of Newfoundland from the mainland was largely reflected across both epigenetic and genetic datasets, closer examination of differentially methylated regions in Newfoundland . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; appear to explain the phenotypic trends observed in this insular population. Specifically, the differentiated DNA methylation patterns in the Newfoundland population suggest a molecular pathway for the morphological differences observed between mainland and island populations The genetic contribution to epigenetic variation cannot be ignored (37), but cannot be irrefutably resolved without an economically-prohibitive amount of sequencing, including whole-genome and whole-methylome assays (6). Our genome-wide sampling of a non-model organism and localization of SNPs completely outside of DMRs indicates that these genetic data largely describe neutral genetic structure without confounding our signals from methylation data, while our epigenetic data describe molecular plasticity with potential functional ramifications.
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; Importantly, our demographic inferences suggesting a relatively recent evolutionary establishment of Newfoundland Canada lynx could implicate DNA methylation as a marker to examine rapid evolutionary change. Evolutionary theory in model organisms has predicted that epimutations and methylome evolution often precede genomic changes (38), which has been further substantiated with empirical evidence examining epigenetic changes over structural genomic variants in the absence of genetic variation (39).
As evidence accumulates regarding the mechanisms underlying epigenetic inheritance (40) and the adaptive benefits of epialleles (41), DNA methylation has the potential to be worked into a framework of adaptive divergence (42, 43) and ultimately into the ecological speciation model (4). It is possible that hypermethylation of locally deleterious genes could provide a precursory mechanism to adaptive divergence over linked SNPs. In contrast, if phenotypically-plastic epigenetic marks are advantageous and heritable, then selection might act directly on the repression of these genomic locations, even in the absence of underlying genetic divergence, Overall, our results indicate that DNA methylation can detect cryptic population structure in genetically homogenous species while providing insight on the functional pathways that shape phenotypic divergence between populations. Epigenetic modifications offer great utility for understanding the mechanisms intertwining ecology and evolution (50,51), and we offer among . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; the first empirical results demonstrating the historical and functional inferences that can be gained from comparative methylomics, even in non-model organisms. PicoGreen® ds-DNA assay using manufacturer's instructions (Thermo Fisher Scientific). We adapted an existing reduced representation bisulfite sequencing library preparation workflow designed for multiplexed high-throughput sequencing (52). Genomic DNA (400 ng) for 95

Sample Acquisition and Reduced-Representation
Canada lynx samples was digested with NsiI and AseI restriction enzymes overnight and subsequently ligated with methylated adapters (Table S8). An individual sample of completely non-methylated lambda phage genomic DNA (200 ng; Sigma-Aldrich -D3654) with a unique barcode was included to assess bisulfite conversion efficiency (Table S9). Barcoded samples were then combined into eight pools to ensure consistent reaction environments for the entire library using a QIAquick PCR Purification Kit (Qiagen, Valencia, CA, USA) following manufacturer's instructions and 10 μL of 3M NaAc was added to neutralize pH. We then . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.

Bioinformaticsquality checks and SNP calling
We assessed sequencing success as well as removed adapter and low-quality reads via FastQC (54) and Cutadapt (55) implemented in TrimGalore! v0.4.4 (56). Individuals were demultiplexed using python scripts (52). Paired-ends reads for Canada lynx samples were initially aligned to . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.  (64). This dissimilarity matrix was then summarized in a principal coordinates analysis (PCoA) using the dudi.pco function in adegenet (65). Missing data were imputed by mean allele at a population level. Pairwise was calculated using VCFtools (62) and any highly fixed . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; SNPs putatively under selection were identified with BayeScan v2.0 with three runs using different prior odds ratios to test for sensitivity (66) (Fig. S8). AMOVA between populations was computed using the poppr.amova function within poppr (Excoffier et al. 1992, Kamvar et al. 2014, Fig. S10). We determined if genetic structuring was consistent with patterns of isolation by distance using a db-RDA with the first 3 axes of a PCoA on SNP data as response variables with the first axis of a PCoA on a Euclidean distance matrix of latitude and longitude as an explanatory variable.

Demographic Inferences and Simulations
Historical demography was assessed using the program ∂a∂i (30). ∂a∂i performs numerous optimizations to assess fit between various demographic scenarios and the expected allele frequency spectrum according to a multi-population diffusion equation, which can then be repeated numerous times to find ideal optimization parameters and assess goodness of fit. VCF files were converted to a ∂a∂i appropriate SNP format using custom python scripts. Each locus contained a maximum of one SNP and thus assumptions of linkage disequilibrium for the SFS were satisfied.
To determine the best fit demographic model, two rounds of cross-validated model selection were conducted with 10 and 20 replicates, respectively, iterated maximally five times, using a grid size of 100, 110, and 120, and projections that maximized sample size against the number of segregating size (73 for mainland, 8 for Newfoundland), as outlined in Portik et. al (69). The most likely demographic model (lowest AIC) was then optimized to determine the best values of , 1 , 2 , 1 , , and using a similar process outlined above except specifying explicit upper and lower bounds, using three-fold cross validation, and repeating the entire process five times, as detailed in Portik et al. (69). Model fit was assessed by comparing empirical model fit to 100 . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; SFS simulations with two-fold cross validation (Fig. S6), as outlined in Barratt et al. (70). This process was again repeated to create a 1D SFS solely for Newfoundland to assess the presence of a historical bottleneck (Fig. S7). Split time was determined by first using the equation , (Ө =   4 ), which after solving for was used to determine divergence with the equation ). To assess sensitivity (Table S4)

Bioinformatics -DNA methylation
We identified methylated and non-methylated positions by first filtering BAM files for incomplete bisulfite conversion based on reads containing more than three methylated positions in a CHH or CHG context. In the remaining reads, methylated positions in a CpG context with a sequencing depth of at least five were extracted with Bismark (29), truncating the last two bases of the forward mate-paired reads (R1) and the first two bases of the reverse mate-paired (R2) reads. Methylation polymorphisms in areas of overlap between read pairs were extracted only once. After confirming that our non-directional library contained roughly equal reads for all possible amplified DNA strands, we proceeded to analysis.
We first generated a custom CpG island annotation track using hidden Markov models based on CpGo/e implemented in makeCGI (73). Only islands with a calculated posterior probability greater than 99.5% were retained for analysis, based on CpGo/e. Mapped and extracted methylated sites were then imported into Seqmonk (74) using the generic text importer, and raw . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; data were qualitatively visualized against the annotated domestic cat genome (felCat9.0). We analyzed DNA methylation over CpG islands and gene bodies by creating 5,000-bp running windows directly over and 25,000-bp upstream of gene bodies, combined with windows directly over CpG islands. Remaining, unannotated regions of the genome were then also divided into 5,000-bp windows, and added to the analysis. Each window was assigned a methylated percentage score based on the overall ratio of methylated to non-methylated bases within the feature. A window size of 5,000-bp was chosen based on maximizing a number of windows with all 95 individuals, and is similar to sizes seen in other studies (75). We filtered this window-set for regions that had at least one CpG and equal representation from each population due to low coverage of some individuals, although many windows contained more CpGs (Table S10). No overlap between SNPs and DMPs were identified (Fig. 2).

Sensitivity analyses
To determine the implications of missing data and PCoA axis retention thresholds, we performed two sensitivity analyses. The first analysis examined the effects of missing data by repeating all analyses using a subset of the top 10 individuals per population (N = 40), and again with the top 6 individuals per population ( = 24), across both SNP and DNA methylation datasets (Fig. S1 -S5). Overall trends in explanatory effects (R 2 ) and qualitative inferences (PCoA clustering) were investigated and no change in inferences were determined. We examined the implications of arbitrary PCoA axis retention by repeating all analyses, but instead using different cumulative variation explained thresholds as response variables. We performed a number of db-RDAs using all axes explaining 30%, 50%, 75%, and 95% cumulative variation as response variables, and results were qualitatively similar regardless of axis retention (Fig. S4). We also assessed if feature-specific patterns of DNA methylation were different between populations. We did this by . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; treating our CpG island and gene body dataset separately from our unannotated region dataset, and repeating all analyses (Fig. S3 -S5). We also assessed where our DNA methylation dataset fell on the spectrum of missing at random (MAR) or missing not a random (MNAR) per locus between individuals (Fig. S11). Although a number of individuals in Newfoundland show low per locus sampling, this is likely due to stochastic inequal amplification or miscalibration of equimolar pooling during super-pooling of the final RRBS library, as roughly 12 sequential individuals (the number of individuals in a pool) suffered from low coverage, while remaining individuals show similar sampling to other populations (Fig. S11). We repeated ordinations after removal of these individuals, which showed no impact on observed patterns, suggesting that missing data is not the cause of the unique epigenetic structuring of Newfoundland (Fig. S11).

Quantifying environmental associations
To determine if patterns of DNA methylation could be explained by macro-scale winter conditions, geographic distance, or insular divergence, we performed a distanced-based redundancy analysis on the summarized DNA methylation data. Meaningful axes explaining > 30% of the cumulative variation in the data were used as response variables in a distance-based redundancy analysis (db-RDA) conducted in vegan (76), using variables that putatively describe the environmental determinants of population structure in Canada lynx. Our covariates included a binary variable of insularity, which identified the Newfoundland population against mainland populations and was used to describe the largely impassable barrier of the Strait of Belle Isle between Newfoundland and mainland Labrador (77). A variable of geographic distance was included which was simply the first axis of a principal coordinates analysis (PCoA) on a Euclidean distance matrix of latitude and longitude ( 1 = 99.7% of the variation, Fig. S12).
In addition to the geographic variables, we included a biotic variable of percent tree cover (78), a . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; randomly generated numerical variable to assess the effect of noise, and a climate variable describing winter conditions. To create the climate variable, we performed a PCA on climate data to prevent multi-collinearity, which reduced annual temperature ranges, winter precipitation, and minimum coldest temperature to a single PCA axis (28) ( 1 = 85.6% of the variation, Fig.   S13). Linearity was confirmed between response and explanatory variables, and multicollinearity between explanatory variables was assessed using the VIF and any variables > 4 were removed (Table S6). Step-wise model selection using the function ordistep within vegan (76) was performed to isolate the best overall model using a QR decomposition technique based on p-values (Table S5). To isolate the individual explanatory power of each variable, we performed partial distanced-based redundancy analyses (p-dbRDAs) on the variables that were identified as significant in the full db-RDA.

Differentially methylated regions and gene ontology
We identified differentially methylated regions with functional biological correlates by performing beta-regressions on windows over CpG islands and gene bodies for all 95 individuals with percent methylation as the response variable and population as the explanatory variable.
Beta regressions are appropriate for proportion or percentage data (79), and we set an alpha threshold at conservative levels seen in similar studies (80) ( < 0.001). Direct overlap between our differentially methylated regions and the felcat9.0 gene annotations were extracted using Seqmonk (74). Samples were then categorized as 'Mainland' or 'Newfoundland' and population averages and standard error of the mean were calculated for visualization. We then identified specific enrichment for biological functions of differentially methylated genes using the Panther-GO Gene Ontology Consortium database (http://geneontology.org/page/go-enrichment-analysis) with significance thresholds seen in other studies (80).
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; 18. Row  was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.  . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.  was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.  . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.  Canada lynx, where filtered reads were mapped to the 19 chromosomes of the domestic cat.
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.  . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; Fig. 4. Principal coordinate ordination plots of genetic (left) and epigenetic (right) variation between Canada lynx populations indicate cryptic structure at epigenetic markers, with individuals as points and populations delineated by colour. All molecular data was summarized with a pair-wise Euclidean dissimilarity matrix. Methylation data was summarized with 5,000-bp running windows across the genome (n = 705) and SNPs were called from bisulfite converted reads (n = 85).
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.
The copyright holder for this preprint (which . http://dx.doi.org/10.1101/316711 doi: bioRxiv preprint first posted online May. 7, 2018; . CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.  Panther gene-specific function is identified to the right of the bar plots. All differentially methylated regions were located directly over gene bodies.
. CC-BY 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.