ABSTRACT
Genetic variation in life-history timing allows populations to synchronize with seasonal cycles but little is known about the molecular mechanisms that produce differences in circannual rhythm in nature. Changes in diapause timing in the European corn borer moth (Ostrinia nubilalis) have facilitated rapid response to shifts in winter length encountered during range expansion and from climate change, with some populations emerging from diapause earlier to produce an additional generation per year. We identify genomic variation associated with changes in the time spent in winter diapause and show evidence that the circadian clock genes period (per) and pigment dispersing factor receptor (Pdfr) interact to underlie this adaptive polymorphism in circannual rhythm. Per and Pdfr are located within two epistatic QTL, strongly differ in allele frequency among individuals that pupate earlier or later, have the highest linkage disequilibrium among gene pairs in the QTL regions despite separation by > 4 megabases, and possess amino-acid changes likely to affect function. One per mutation in linkage disequilibrium with Pdfr creates a novel putative clock-cycle binding site found exclusively in populations that pupate later. We find associated changes in free-running daily circadian rhythm, with longer daily rhythms in individuals that end diapause early. These results support a modular connection between circadian and circannual timers and provide testable hypotheses about the physiological role of the circadian clock in seasonal synchrony. Winter length is expected to continually shorten from climate warming and we predict these gene candidates will be targets of selection for future adaptation and population persistence.
INTRODUCTION
Many species display tremendous flexibility in the annual timing of physiological, morphological, and behavioral transitions that enable survival in seasonal environments. The capacity to adjust the timing of circannual rhythms and track local seasonal cycles can facilitate expansion into new geographic areas with different seasonal environments (1, 2). Dramatic shifts in seasonal timing seen across plants and animals in recent decades (3) may also enable adaptation (4) and persistence (5, 6) during rapid, anthropogenic alterations of the environment. When shifts in seasonal timing additionally change the number of generations per year (7-9), populations may grow faster and even tolerate a faster rate of sustained environmental change (Figure S1) (10). Nevertheless, if environments change too rapidly, timing mismatches with seasonality can occur (11, 12), resulting in loss of fitness and population decline (5, 6, 12). The capacity to adjust seasonal timing and track changes in seasonal cycles, as well as our ability to evaluate risks to biodiversity, depends in part on the proximate causes of variation in circannual rhythm (10, 13). However, relatively little is known about the molecular basis of this diversity (11, 14, 15).
Insects are highly variable in the seasonal timing of transitions into and out of diapause, a stress-tolerant physiological state that enables coping with seasonal challenges (1, 16-19). For temperate species, which generally use seasonal changes in photoperiod and temperature to synchronize diapause with winter stress, the timing of diapause transitions in spring and autumn varies widely within and among species (15, 20) and therefore provides an excellent opportunity for analysis of the genetic control of circannual timing. We used natural variation in timing of spring transitions from larval diapause to active development of the European corn borer moth (Ostrinia nubilalis) to understand the genetic basis of this seasonal adaptation. In corn borers, the duration of developmental arrest in the spring (i.e., diapause termination timing, (18)) generally tracks winter length. Shifts in diapause timing have rapidly evolved across latitudinal gradients in winter after range expansion from Europe to North America in ∼1910 (21, 22). The length of winter decreases with decreasing latitude in North America. Consequently, the optimal time to exit diapause is advanced to earlier in the year and the number of days required to end larval diapause evolved into a positive correlation with latitude (ranging from 17.5 to 49 days across 9.28°N latitude; Figure S2) (22). Changes in diapause timing similarly track shorter winters associated with climate warming, with populations at the same latitude showing a ∼50% average reduction in time needed to transition out of diapause since the 1950s in some locations (22, 23). Although broad-scale changes in climate may be leading to directional change in diapause timing across space and over time, polymorphism is common in natural populations (24, 25) and may be maintained (26) because timing shifts alleviate competition for limiting resources (such as host plants) or may be favored by year-to-year fluctuations in seasons (27).
In the mid-Atlantic region of the United States, earlier springtime pupation (∼20 days) reduces generation time, thereby enabling production of two generations per year (bivoltine) rather than one generation per year (univoltine; Figure S3) (28, 29). Geographic co-occurrence of earlier- and later-pupating individuals (∼40 days) leads to asynchronous adult mating flights (June versus July) and allochronic reproductive isolation (Figure S4) (29). Thus, in corn borers, range expansion and population growth, enhanced tolerance of environmental change, and speciation may all be byproducts of natural selection on diapause timing during seasonal adaptation (29, 30).
Despite decades of work on Ostrinia (25, 26, 31-33), no causative loci for natural variation in diapause termination timing have been identified definitively. We have therefore taken an unbiased, whole-genome approach to identify loci underlying this variation. Previous work found sex-linked inheritance (31, 32) and evidence for a single quantitative trait locus (QTL) on the Z (sex) chromosome (25). However, a putative inversion encompassing 39% of the Z chromosome was discovered and the QTL was locked into a non-recombining region along with hundreds of genes, many of which were differentially expressed during diapause break (33-35). Subsequent work demonstrated that the recombination suppressor is polymorphic in field populations (35) and we therefore performed QTL mapping in pedigrees with putatively collinear Z chromosomes. An advantage of this approach is that it will exclusively identify the genetic architecture of diapause timing, but a challenge is that individual genes or mutations associated with trait differences cannot be easily identified. Therefore, we obtained the higher resolution needed using population genomic sequencing data derived from phenotyped, field-caught moths. As population genomic analysis will identify mutations within individual genes underlying changes in measured traits, as well as loci controlling unmeasured traits subject to correlated selection in nature (such as temperature tolerance), we view the approaches as parts of a complementary, two-pronged forward-genetics strategy to characterize the genetics of natural variation in circannual rhythm.
RESULTS
QTLs for diapause timing
Our first forward genetic approach used QTL mapping of diapause termination timing. In corn borers, the main environmental trigger to end diapause is photoperiod (36, 37). The time required to end diapause and return to active development was quantified as the time to pupation under diapause-breaking photoperiod and temperature (referred to as post-diapause development (PDD) time). Variation in PDD time was measured in backcross pedigrees of females from a two-generation (bivoltine), early-emerging, short-PDD population collected in East Aurora, NY in 2011 (EA, PDD time < 19 days) and males from a one-generation (univoltine), later-emerging, long-PDD laboratory colony originally derived from Bouckville, NY in 2004 (BV, PDD time ≥ 39 days) (25, 28, 34, 35, 38). Diapause in 5th instar backcross larvae was induced by a winter-like short-day 12 hour (h) photoperiod. Subsequently, PDD time was measured as the number of calendar days required for diapausing larvae to pupate after transfer to a summer-like long-day (16 h) photoperiod. PDD time varied from 11 to 63 days. Using 167 autosomal and 18 Z-linked molecular markers, we found that only the Z chromosome was significantly associated with PDD time (N = 67 offspring, LOD = 9.67, P < 0.001).
We refined the Z chromosome QTL map by including 226 offspring and 35 markers, which resulted in the prediction of two adjacent, interacting QTL (Figure 1a). Model fit was improved by inclusion of QTL1 (F2 = 22.89, P < 0.001), QTL2 (F2 = 9.09, P < 0.001), and their interaction (F1 = 7.80, P = 0.005; Table S1; Figure 1c,e), indicating that natural variation for diapause termination time in the analyzed populations is regulated by at least two genes. Together these QTL and their interaction explained 35.3% of phenotypic variance. QTL1 was located at 29.5 cM (BCI = 24.5-31 cM) (Figure 1b) and QTL2 was located at 34 cM (BCI = 31.5-36 cM) (Figure 1c,d). QTL1 was estimated to be ∼3.1 Mb in size (on the 21 Mb Z chromosome), containing ∼48 genes with annotations in the draft European corn borer moth genome (GenBank BioProject: PRJNA534504; Accession SWFO00000000; Supplemental Material; Table S2). QTL2 was estimated to be ∼3.7 Mb in size with ∼42 annotated genes. The QTL1 allele originating from the parental strain expressing shorter PDD time (EA) was epistatic to allelic changes at QTL2 and masked its effect (two-way ANOVA, F1,218 = 8.76, P = 0.003; Figure S5). In contrast, an allele at QTL1 originating from the parental strain expressing longer PDD time (BV) resulted in an unusually long PDD time when paired with a QTL2 allele from a short-PDD time parent (EA) (mean PDD time ± SD: QTL1BV/QTL2EA = 51 ± 11.92; QTL1BV/QTL2BV = 39.52 ± 8.05).
a) Z chromosome consensus linkage map, adapted from Kozak et al. (2017) (35); b) plot of QTL1 estimated using scanone, with Bayesian credible interval (BCI) shaded; c) plot of the LOD score for a model with an epistatic interaction compared to a model including only a single QTL, blue line indicates the 1.5 LOD interval contour and the location of QTL2; d) plot of QTL2 estimated using scanone on individuals with slow allele at QTL1 (BCI shaded). N = 226 offspring for a-c; N = 101 for d.
Genetic variation in natural populations
We used genome sequencing of wild populations as a second forward genetic approach to identify segregating chromosomal regions affecting diapause timing. We tested for associations with PDD time in pooled sequencing (pool-seq) data derived from field collections of 5 populations (Table S3). In addition to a pool of individuals from the EA population (N = 34; mean PDD time = 9.8 ± 1.2 days) and a separate field-collected pool from the BV population (N = 20; mean PDD time = 42.9 ± 7.9 days), we sequenced pools from Penn Yan, NY (PY, N = 26; univoltine, long; mean PDD time = 52.9 ± 5.3 days), Geneva, NY (GEN, N = 25; univoltine, long; mean PDD time = 45.3 ± 5.4 days), and Landisville, PA (LA, N = 39; bivoltine, short; PDD time < 19 days). Paired-end 150 bp Illumina reads were aligned to the draft European corn borer moth genome ordered and oriented into 31 chromosomes (61% of 454.7 Mb assigned locations genome-wide; 20.9 Mb of the Z chromosome ordered; Table S4). Average coverage was 25X (range = 12-40X).
Overall genetic differentiation was low among populations as estimated from mean pairwise FST across 1 kb windows (mean autosomal FST = 0.05; Z chromosome FST = 0.06) with the highest values of FST between short and long populations observed on the Z chromosome in the QTL regions (Figure 2, Figure S6). To identify gene regions associated with PDD time while directly accounting for population demography we used a Bayesian framework. BayPASS 2.1 (39) was used to estimate a covariance matrix that represents an approximation of the unknown demographic history and test associations of single nucleotide polymorphisms (SNPs) with PDD time while accounting for any covariance. BayPASS was run separately for Z chromosome and autosomal loci, as our expectations about the demographic history and number of haploid chromosomes in each pool differed between sex chromosomes and autosomes (see methods). Significantly associated alleles were defined as containing SNPs with a Bayes Factor (BF) > 20 deciban units (dB), correlation coefficient (r) ≥ 0.5, and strength of association (β) with a posterior distribution that had a probability < 0.01 of β = 0 (“empirical Bayesian P-value” eBPis > 2) (39, 40). The autosomal analysis identified 7 SNPs in predicted genes with BF > 20 dB, but all of these had eBPis < 0.5. On the Z chromosome, 16 of 8,435 SNPs in predicted genes had BF > 20 dB and showed a strong association with PDD time (r ≥ 0.57). However, only four Z-linked SNPs (0.05%) in predicted genes had eBPis > 2. Congruent with our mapping results, all SNPs with eBPis > 2 fell inside the two interacting QTL regions (Table 1a; Figure 2c) and three were within two genes known to interact in the same pathway: the circadian clock genes period (per) and pigment-dispersing factor receptor (Pdfr).
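The three association thresholds described above can be illustrated with a short Python sketch. This is not the BayPASS implementation. The function names are hypothetical, the deciban conversion is the standard 10 × log10(BF), and the eBPis formula assumes an approximately normal posterior for β, so that eBPis is the −log10 of the two-sided tail probability of β = 0. Taking the absolute value of r is also an assumption, made so that the sign of the correlation does not depend on which allele is counted.

```python
import math

def bayes_factor_to_deciban(bf):
    """Convert a raw Bayes factor to deciban units (10 * log10 BF)."""
    return 10.0 * math.log10(bf)

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ebpis(beta_mean, beta_sd):
    """Empirical Bayesian P-value on the -log10 scale.

    Assumption: the posterior of beta is roughly normal, so the two-sided
    tail probability of beta = 0 is 1 - 2 * |0.5 - Phi(mean / sd)|.
    """
    tail = 1.0 - 2.0 * abs(0.5 - normal_cdf(beta_mean / beta_sd))
    if tail <= 0.0:
        return math.inf  # tail probability underflowed to zero
    return -math.log10(tail)

def is_associated(bf_db, r, ebp, bf_min=20.0, r_min=0.5, ebp_min=2.0):
    """Apply the three thresholds from the text: BF > 20 dB, r >= 0.5, eBPis > 2.

    Assumption: |r| is used so that allele polarity does not matter.
    """
    return bf_db > bf_min and abs(r) >= r_min and ebp > ebp_min
```

Under these assumptions, eBPis > 2 corresponds to a tail probability below 0.01, and a SNP with BF > 20 dB but eBPis < 0.5 (as reported for the autosomal SNPs) would not pass `is_associated`.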
One SNP was within QTL1 in per (Figure 3a; Table S5), a core circadian clock gene (41), and two SNPs within QTL2 were in Pdfr, the gene encoding the receptor for the main circadian neurotransmitter PDF (42, 43). The remaining SNP with eBPis > 2 was within terribly reduced optic lobes (trol) (Figure 3b) which encodes an extracellular matrix protein and is not known to interact with per (44). Two additional intergenic SNPs with eBPis > 2 were between Pdfr and trol (Figure 3b). No additional outlier loci were detected when we analyzed all genome scaffolds (including those lacking an assigned chromosomal location) as if they were on the Z chromosome, indicating that per and Pdfr have the strongest association with PDD time across the entire sequenced genome (Figure S7-S8). Per and Pdfr also displayed extreme values of FST (> 0.5) and significance (q < 10−10) in Cochran-Mantel-Haenszel (CMH) outlier tests (45) (see supplemental results; Table S6; Figure S9).
a) SNPs identified by BayPASS in pool-seq data including the demographic corrected measure of population differentiation (XtX), and measures of association with PDD time: Pearson r, beta, Bayes Factor (BF, measured in dB), eBPis, and maximum linkage disequilibrium with other SNPs under the QTL. Outliers in the individual resequencing data case/control analysis (P values and FDR corrected q-values shown) with long PDD allele (minor allele) and minor allele frequency (MAF) for short and long PDD samples for b) E-box altering SNP and c) amino-acid (AA) altering SNPs. All case-control results listed in Tables S9-S10. Position in base pairs (BP) on the scaffold shown.
a) Genome-wide plot of mean FST between short and long populations gene regions for chromosomes 1(Z)-31, with unscaffolded chromosomes assigned to chromosome 32; N = 57,842 1 kb windows. b) Genome-wide plot of BayPASS empirical Bayesian P-values (eBPis) for PDD association (N = 293,590 SNPs in CDS); eBPis > 2 (equivalent to P < 0.01) indicated by red line. c) Plot across the Z chromosome. SNPs with strongest evidence of association denoted by triangles (Bayes Factor > 20 dB) and labeled in red (eBPis > 2); no evidence denoted by black circles (BF < 20 dB); location of QTL1 and QTL2 BCI shown; N = 8,435 SNPs within genes.
a) eBPis plotted for 2 Mb interval around per, showing per (red) location and other flanking gene intervals (blue); entire region is within QTL1 BCI. Intergenic SNPs included (N=1,441). b) eBPis plotted for 2 Mb interval around Pdfr (bright red) location, trol (dark red) and other gene intervals (blue); all within QTL2 BCI except kon and CG10467 (N=1,423 SNPs). eBPis > 2 labeled in red. BF > 20 dB denoted by triangles. Full gene descriptions listed in Table S5.
Linkage mapping indicated two QTL located ∼4.5 cM (∼ 5 Mb) apart contribute to the evolution of PDD time. In wild populations, alleles in QTL1 and QTL2 associated with PDD time should be in high linkage disequilibrium (LD) due to their joint effect on the phenotype. To measure LD, we resequenced the genomes of individual moths from all 5 populations with long (N = 18) and short (N = 25) PDD times using 150 bp paired-end Illumina sequencing at 14X coverage (mean coverage = 14.22 ± 4.55). We calculated r2 between SNPs in 627 genes on different genome scaffolds of the Z chromosome ≤ 10 Mb apart for a total of 41,193 biallelic SNPs with MAF ≥ 0.25 (since 18/43 individuals had long PDD, only SNPs with a high minor allele frequency represent potential candidate mutations underlying PDD time). LD was high for genes < 2 Mb apart (maximum LD = 0.97, 99.9% quantile = 0.77), but decayed over larger physical distances (2-10 Mb: maximum LD = 0.77, 99.9% quantile = 0.56).
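As an illustration of the LD statistic used above, the following Python sketch computes r2 as the squared Pearson correlation between allele counts at two SNPs, together with the minor allele frequency filter. The text does not state how r2 was estimated, so the genotype-correlation estimator, the function names, and the fixed ploidy are assumptions. In particular, the sketch ignores the fact that Z-linked loci are hemizygous in female moths.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors.

    x and y hold the count of one allele per individual (e.g. 0, 1, 2)
    at two SNPs, in the same individual order.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length vectors with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return float("nan")  # a monomorphic SNP has no defined correlation
    return (sxy * sxy) / (sxx * syy)

def minor_allele_frequency(allele_counts, ploidy=2):
    """Frequency of the rarer allele, assuming a fixed ploidy (hypothetical default)."""
    freq = sum(allele_counts) / (ploidy * len(allele_counts))
    return min(freq, 1.0 - freq)

def passes_maf_filter(allele_counts, maf_min=0.25):
    """Keep only SNPs with MAF >= 0.25, the filter described in the text."""
    return minor_allele_frequency(allele_counts) >= maf_min
```

Two SNPs whose alleles always co-occur give r2 = 1, and two SNPs that segregate independently in the sample give r2 near 0.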
We identified LD outliers by calculating a 95% confidence interval for the 99.9% quantile of all gene pairs within a 1 Mb window (N = 10,000 bootstrap replicates). Among gene pairs located within or between the two QTL regions (≥ 2 Mb apart and ≤ 7 Mb), there were 12 outliers (Figure S10; Table S7). Of these, the most extreme LD outlier was between per and Pdfr (Figure 4; Figure S11) and specifically, maximum LD occurred between 9 SNPs in the 5’UTR intron of Pdfr and 3 SNPs in the 5’UTR intron of per (r2 = 0.75). In both genes, introns within the 5’ UTR contain E-box cis-regulatory enhancer elements where the circadian transcription factors CLOCK (CLK) and CYCLE (CYC) bind (46, 47). In Drosophila melanogaster, Pdfr contains one CLK-CYC binding site in the 5’UTR intron and per contains three in the 5’UTR intron and one E-box upstream of the promoter (46-47). The long-PDD corn borer allele for one of the high LD SNPs created a novel E-box element (CACGTG) in the 5’UTR and this allele was completely absent in the short-PDD populations (Table 1b). Additionally, per and Pdfr were present in other outlier gene pairs (Pdfr with the genes magu and CG6752; per with genes flanking Pdfr: trol, meigo, and CG32809), and one pair contained the circadian gene clk with 1-Cys peroxiredoxin (Prx6005). We also performed LD analysis on the outlier SNPs identified by BayPASS (eBPis > 2) with all other SNPs (>1 Mb apart; MAF ≥ 0.25) and found the 2 outlier SNPs in Pdfr had the highest LD with SNPs in per (r2 = 0.73) and the single SNP in per had the highest LD with SNPs in Pdfr (r2 = 0.61). The SNP in trol had the highest LD with autophagy-related 9 (Atg9; r2 = 0.53).
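The bootstrap step described at the start of this paragraph can be sketched in Python as follows. This is a minimal illustration, not the analysis code. It uses a percentile bootstrap with linear-interpolation quantiles, and it treats a gene pair as an outlier when its maximum r2 exceeds the upper confidence bound. The resampling scheme, the outlier rule, and all function names are assumptions.

```python
import random

def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def bootstrap_quantile_ci(values, q=0.999, n_boot=10000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the q-th quantile.

    Resamples the LD values with replacement n_boot times, recomputes the
    quantile each time, and returns the central `level` interval.
    """
    rng = random.Random(seed)
    n = len(values)
    stats = [quantile(rng.choices(values, k=n), q) for _ in range(n_boot)]
    alpha = (1.0 - level) / 2.0
    return quantile(stats, alpha), quantile(stats, 1.0 - alpha)

def is_ld_outlier(r2, ci_upper):
    """Assumed outlier rule: r2 above the upper bound of the bootstrap CI."""
    return r2 > ci_upper
```

In use, `values` would hold the maximum r2 of every gene pair in a given distance window, and each candidate pair would be compared against the upper bound returned for that window.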
Histogram of maximum linkage disequilibrium (LD, measured by r2). Red bar indicates per-Pdfr, the genes with the maximum LD observed for any genes in the two intervals over 1 Mb apart (N = 1,678 pairs).
Our genome-wide Bayesian and linkage disequilibrium outlier analyses suggest that per and Pdfr represent the best gene candidates in QTL1 and QTL2, respectively. We analyzed variation in the resequenced individuals using a case/control association analysis in plink (48) to detect other mutations within per and Pdfr that might have a phenotypic effect, such as changes in amino acids, changes in splice junctions leading to splice variants, and large structural variants that disrupt exons or cis-regulatory regions (other than E-boxes) (Table S9-S10). Using homology with per in other insects, we identified protein domains (27 exons) in the corn borer ortholog. Three nonsynonymous SNPs were significantly associated with PDD time and all were located in per exon 23 (Table 1c; outlined in red in Figure 5a). Both proline/threonine (P/T) and serine/proline (S/P) polymorphisms are in a 33 amino-acid region of per that is deleted in an artificially selected line of flesh fly (Sarcophaga bullata) showing enhanced diapause and the S/P polymorphism is 3 residues away from a 9 amino-acid insertion in a selected flesh fly line with decreased diapause (Figure 5a) (49). No consistent associations were found among splice variants, PDD time, and polymorphisms at splice sites (see Supplemental Information), nor were any large structural variants detected within the gene.
Candidate gene models including 5’ UTR, 3’ UTR, exons (black bars), with protein domains (purple triangles) labeled. Gray portions of 5’UTR are putative 5’UTR introns (i.e., sequences not present in RNA transcripts). Locations of polymorphisms that showed significant association (q < 0.01) in individual sequencing data indicated by light pink triangles, those that change amino acid sequence denoted by red triangles. Below, amino acid sequence for exons with differences in sequence between ECB short- and long-PDD populations aligned with selected species of flies (blue), butterflies (purple) and moths (black). a) Gene model for per in ECB, including the upstream E-box enhancer element (green diamond) and novel E-box in 5’UTR (green triangle). Domains for TIMELESS binding (PAS), DOUBLETIME binding (DBT) and CLOCK-CYCLE inhibitory domain (CCID) indicated. Two amino acid changes (outlined in red) are in the same region in which Sarcophaga high diapausing mutants have a deletion (shaded in teal) and non-diapausing mutants have an insertion (teal triangle; (49)). The amino acid changes also flank a region containing several predicted casein kinase 2 (CK2) sites in ECB and a conserved serine phosphorylated by SHAGGY identified in Drosophila (outlined in blue; (125)); N = 78 polymorphisms. b) Gene model for Pdfr including PDF hormone binding domain (HMB), and transmembrane domain (7 alpha helices). Amino acid sequence shown for portion of the PDF hormone binding domain region annotated in Bombyx mori (outlined in teal) with differences between ECB slow and fast populations outlined in red; N = 166 polymorphisms.
Pdfr consisted of 12 exons (Figure 5b). In the predicted extracellular hormone binding domain for PDF (exon 2) there was one nonsynonymous SNP, coding for a methionine in short PDD individuals and a threonine in long PDD individuals (q = 1.12 × 10−8; Figure 5b; Table 1c). There were no splice junction polymorphisms or variants in Pdfr. Although its specific sequence is unknown, an enhancer is located ∼8.5 kb upstream of Pdfr in D. melanogaster (42). In corn borers, we found a ∼419 kb inversion associated with PDD time (q = 6.48 × 10−9) with one breakpoint 7.05 kb upstream from Pdfr in this putative enhancer region. The second breakpoint was predicted to occur 162 kb after trol.
Circadian activity
Prior work has shown that circadian rhythm of locomotor activity in mice, fruit flies, and humans is the behavioral output of circadian clock genes (50) and that mutations in per and Pdfr result in altered circadian rhythms in D. melanogaster (43, 51). We evaluated evidence for a difference in free-running circadian rhythm under total darkness (DD) between adult moths with short and long PDD times. Male pupae entrained to 16:8 were transferred to DD shortly before eclosion. We found that endogenous period length (τ) differed by approximately 1.3 hours, with short-PDD males showing longer average circadian periods (τ = 22.7 ± 0.2 h, N = 24) than long-PDD males (τ = 21.4 ± 0.28 h, N = 22) (ANOVA, F1,44 = 13.79, P < 0.001; Figure 6; Supplemental Results).
Actograms showing the locomotor activity in complete darkness (DD) over 2 day windows (double-plotted) for up to 15 days used to estimate circadian period (τ) for a representative a) short-PDD male, b) long-PDD male. c) Boxplot of length of circadian period (in hours) for males with short and long PDD (median, first and third quartile shown), lines indicate 95% confidence interval. Long-PDD individuals have a significantly shorter period (P < 0.001); N = 46 adults.
DISCUSSION
Genetic changes in circadian clock genes associate with natural variation in the time needed to end winter diapause and return to active springtime development in the European corn borer. Clock genes per and Pdfr are located within two epistatic QTL, strongly differ in allele frequency among individuals that pupate earlier or later, have the highest linkage disequilibrium among gene pairs in the QTL regions, and possess amino-acid changes that may affect protein function. Per alleles containing an additional putative CLK-CYC binding site were also exclusively identified in populations that pupate later. While additional work is needed to understand how identified allelic variants affect gene function and to verify that there are no other genetic polymorphisms contributing to diversity in seasonal timing, our combined results suggest that allelic variation in per and Pdfr is causal to evolution of diapause timing when confronting rapid environmental changes associated with range expansion (Figure S2) (22) and human-induced climate warming (23).
The presence of epistatic QTL indicates that genes underlying PDD time are likely members of the same genetic pathway. Both per and Pdfr interact in circadian pacemaker neurons in insect brains, where they synchronize biological activity to daily cycles of night and day (Figure 7) (52-54). In the laboratory, mutations in per are known to alter the length of the circadian activity period (51) and null mutants lose rhythm completely in D. melanogaster (41). Likewise, Pdfr is integral to the function of circadian pacemaker neurons in insect brains, where they receive secreted PDF neuropeptides that coordinate, synchronize, and reset the clock neuron network to new light cycles (55-57). Loss of expression of Pdf or Pdfr in these neurons can result in shorter circadian activity period (τ), abnormal peaks of circadian activity, and an inability to entrain to longer photoperiods in D. melanogaster (55, 58-60). Although robust connections between circadian clock genes and seasonal phenotypes have been discovered in plants (61), evidence in insects has been primarily based on RNAi studies demonstrating that functional clock genes are essential for diapause (62-64). It is less clear whether allelic variation in these genes typically responds to selection in natural populations to drive changes in the seasonal timing of diapause transitions. For example, polymorphism in the timeless gene in D. melanogaster influences diapause capacity in the laboratory, but in nature, latitudinal variation of timeless does not match variation in diapause and observed patterns are opposite of those predicted (65). Instead, diapause differences are more strongly associated with non-circadian genes, such as couch potato (66). Similarly, in Wyeomyia smithii pitcher plant mosquitos diapause appears to be independent of the circadian pathway, with these traits evolving separately in selection lines (67). 
Our study provides the first evidence that per and Pdfr, core components of the molecular clock, are associated with the duration of developmental arrest for insects in the spring. Recent studies in two other insects show that per alleles are associated with polymorphism in the timing of autumnal initiation of diapause (critical photoperiod for entrance) (68, 69). Genetic changes at per may therefore provide the capacity to adjust diapause transition times across two different seasons, enabling insects to synchronize with both the end and beginning of winter.
a) Regulation of circadian clock genes in clock pacemaker neurons shown (adapted from Lepidopteran clock in (125)). Blue arrows indicate activation, red arrows are suppression, black dashes are heterodimer formation, green dashed arrows are stabilization. Candidate genes shown in pink. The heterodimer formed by Clock (CLK) and Cycle (CYC) upregulates Period (PER), Timeless (TIM), and Pigment dispersing factor receptor (PDFR) (47). When PER and TIM are bound to Cryptochrome2 (CRY2), they migrate into the nucleus and PER-CRY2 repress CLK-CYC (126). Cryptochrome1 (CRY1) degrades TIM in the presence of light. The neurotransmitter pigment dispersing factor (PDF) binds to its receptor (PDFR) and this activation stabilizes both TIM (82) and PER (52). CLK-CYC activates arylalkylamine N-acetyltransferase (aaNAT) which converts Serotonin (5HT) to Melatonin (MEL) (81). b) Under short day conditions, serotonin levels are high, preventing PTTH release and leading to diapause maintenance. c) Under long days, melatonin levels are high and PTTH is released, leading to activation of ecdysone release by the PG and diapause termination. Ecdysone release is also facilitated by activation of PDFR in the PG (84).
In 1936, Bünning hypothesized that mechanisms underlying circadian rhythmicity control circannual rhythmicity (70). Alternatively, the circadian clock and the seasonal timer could act as two modules with largely separate genes, although individual genes may have cross-module effects (71-73). We find that populations of European corn borer moth differing in PDD time also differ in their internal circadian oscillator, such that the population spending more time in diapause (long PDD time) shows an accelerated circadian period (shorter τ). A similar inverse relationship between circadian and circannual rhythm has been found in Scandinavian flies (Drosophila littoralis), where shorter circadian periods are associated with earlier diapause initiation (74), and in mustard plants (Boechera stricta), where shorter circadian periods are associated with delayed flowering (75). Combined with the fact that multiple interacting circadian clock genes (per, Pdfr) are implicated in photoperiodic diapause termination, patterns of circadian activity in the European corn borer moth suggest that allelic variation and interactions between per and Pdfr might affect seasonal timing by altering circadian clock function (modular pleiotropy), rather than by a direct effect on diapause that is unrelated to the circadian clock (gene pleiotropy). Further evidence for at least partial circadian control of diapause termination timing was found by Beck (76), who tested Bünning’s model in the European corn borer moth using the Nanda-Hamner protocol. He found a circadian resonance cycle (∼24 h peaks) between the period of the diapause inducing photoperiod and PDD time, supporting the hypothesis that diapause timing is mediated or controlled by a circadian based physiological system. Future work will be needed to understand how molecular mechanisms might directly link expression of the daily clock and the seasonal timer in this species.
Physiological experiments suggest several molecular mechanisms by which per and Pdfr could regulate the neuroendocrine switch underlying the transition from diapause to development. Larval termination of diapause in the European corn borer moth and many other Lepidoptera is triggered by release of the developmental hormone ecdysone from the prothoracic gland (PG) due to stimulation from prothoracicotropic hormone (PTTH) (77-79). Work in the Chinese oak silkmoth (Antheraea pernyi) suggests that PTTH release or synthesis is regulated by the circadian clock pathway via the indolamine metabolism pathway. Specifically, a key step may involve the enzyme arylalkylamine N-acetyltransferase (aaNAT) and its opposing interaction on levels of melatonin (MEL) and gated PTTH synthesis/release under long-day photoperiod, or on levels of serotonin and PTTH suppression under short-day photoperiod (Figure 7) (80, 81). In A. pernyi, aaNAT is synthesized in circadian clock neurons when levels of CLK-CYC are high. PER represses CLK-CYC activity and RNAi against per results in increased aaNAT transcription, increased MEL protein, and diapause termination (81). In D. melanogaster, CLK-CYC binds to Pdfr, putatively regulating its expression (47). Activation of PDFR by PDF binding increases protein kinase A (PKA), which stabilizes PER and TIMELESS (TIM), preventing degradation, and increasing circadian period by ∼2 h (52, 82). Thus, per and Pdfr alleles of the European corn borer moth may function differently under seasonal changes in photoperiod by interacting in pacemaker neurons to alter aaNAT production, influencing synthesis/release of PTTH and the timing of diapause termination. Some evidence of differential regulatory control of the circadian clock–indolamine pathways exists between short and long PDD populations, potentially due to changes at per and Pdfr.
We previously found that transcription in adult female heads of aaNAT, its putative regulator (cyc), and its downstream target (PTTH) is lower in strains with longer PDD times than in strains with shorter PDD times one hour before the light-dark transition under long-day photoperiod (83). If the novel E-box element we identified in per from long-PDD individuals leads to increased per expression and repression of cyc, it could hypothetically lower aaNAT, leading to a perception of days as shorter and delaying diapause termination. In addition to interaction of per and Pdfr in pacemaker neurons, a second route for epistasis and control of termination could occur by the release of ecdysone through an independent Pdfr cascade in the PG discovered in the silkmoth Bombyx mori (Figure 7) (84). Indeed, knockdowns of Pdf are sufficient to induce diapause under long photoperiods in mosquitoes (Culex pipiens), and ablation of PDF-positive neurons impairs the photoperiodic regulation of diapause in bean bugs (Riptortus pedestris) and blow flies (Protophormia terraenovae) (63, 85, 86).
Despite repeated observation of geographic variation in circannual rhythm within and among species, and widespread alterations of seasonal activity in response to climate change and range expansion (3, 15, 20, 30, 87), seasonal timing in nature has rarely been linked to causal mechanisms. This gap in knowledge is alarming given recent work suggesting that roughly half of lepidopteran species may currently be in decline (88) and given accumulating connections between seasonal timing flexibility and population persistence (89). Establishing the genomic determinants of circannual variation is essential for understanding the capacity of species to tolerate rapidly changing environments (encountered through species movement or changes in local climate), as well as to accurately predict their future evolutionary trajectories (across geographic space and through time) (10, 13). We have shown using multiple whole-genome approaches in the European corn borer that evolution to earlier spring termination of diapause and an associated added generation has a relatively simple genetic basis, likely involving two genes that also orchestrate circadian timekeeping. Earlier springtime activity can allow populations to track preferred seasonal environments and to produce more generations per year, both of which improve population tolerance of sustained environmental change in theoretical (10) and empirical studies (87, 89). The duration of insect diapause generally tracks winter length, which will decrease by a month or more over the next century according to most climate change models (90, 91). Therefore, intense selection on alleles at per and Pdfr in this species is likely to be an important component of continued adaptation, anticipated range expansion (92, 93), and long-term species persistence under rapidly changing seasonal environments.
Because the European corn borer is a major pest of corn and other crops in North America and Europe, the ecological and economic ramifications of these microevolutionary changes will be significant. To understand why certain pests like Ostrinia moths have the capacity to become greater threats under projected climates, and why certain beneficial species may require enhanced conservation management to prevent extinction, future efforts should more broadly address the mechanisms underlying circannual rhythm in nature.
METHODS
QTL mapping of termination time
Backcross F2 female offspring, F1 parents and F0 grandparents were genotyped for polymorphic SNPs segregating within families using multiplexed PCR amplicons (500 bp amplicons, 384 unique individual barcodes per lane) sequenced on an Illumina MiSeq at the Cornell University Sequencing Facility (primer sequences used are described in Kozak et al. (35)). Additional Z-linked and autosomal markers were genotyped using Sequenom Assays developed for polymorphic SNPs (Sequenom Assay Design Suite 1.0, Sequenom, San Diego, CA, USA) and run at the Iowa State University Center for Plant Genomics (ISU-CPG) as described in Coates et al. (94) and Levy et al. (26). Linkage maps were constructed for each family separately using a maximum recombination frequency of 0.35 and a minimum LOD of 3 in R 3.4 using rQTL and the estimate map function (95-97).
QTL mapping of PDD time was performed using the scanone and scantwo functions with standard interval mapping and an interval of 0.5 cM (similar results were obtained when using extended Haley-Knott regression) for autosomal and Z-linked markers in 1 family (F6, 67 offspring) and for the Z chromosome only using 5 families (F2,5,6,9,11; 226 offspring total). For the 5 families, a consensus Z map was constructed from the individual Z family maps using the LPmerge package in R (root mean squared = 10.89 and standard deviation = 7.31) (98). The 95% Bayesian credible intervals (BCI) for the QTL were estimated, and the significance of QTL was determined by F-test comparisons of models with and without the QTL and their interaction using the fitqtl function. For estimating the BCI for the epistatic QTL, mapping was repeated on 101 individuals with the slow genotype at QTL1.
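To make the LOD statistic behind such scans concrete, the following Python sketch computes a single-marker LOD score by comparing a model with one phenotype mean to a model with one mean per genotype class. It is a simplified marker-regression illustration, not the interval mapping performed in R; the function names and the toy data are hypothetical.

```python
from math import log10
from typing import Dict, List, Sequence


def residual_sum_of_squares(values: Sequence[float]) -> float:
    """Sum of squared deviations from the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def marker_lod(phenotypes: Sequence[float], genotypes: Sequence[str]) -> float:
    """LOD score at one fully genotyped marker (simplified marker regression).

    LOD = (n / 2) * log10(RSS_null / RSS_full), where the null model has a
    single mean and the full model has one mean per genotype class.
    """
    if len(phenotypes) != len(genotypes):
        raise ValueError("phenotypes and genotypes must have equal length")
    groups: Dict[str, List[float]] = {}
    for y, g in zip(phenotypes, genotypes):
        groups.setdefault(g, []).append(y)
    rss_null = residual_sum_of_squares(phenotypes)
    rss_full = sum(residual_sum_of_squares(v) for v in groups.values())
    if rss_full == 0:
        raise ValueError("full model fits perfectly; LOD is undefined")
    return (len(phenotypes) / 2) * log10(rss_null / rss_full)


# Toy data (hypothetical): PDD time in days and a two-class backcross genotype.
pdd_days = [10, 12, 40, 42]
marker = ["A", "A", "B", "B"]
lod = marker_lod(pdd_days, marker)
```

Interval mapping extends this idea to positions between markers by weighting genotype classes by their probabilities given flanking markers.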
Population genomic analyses
We sampled from 5 European corn borer (ECB) populations (see Table S3). Individuals were collected from the field as diapausing larvae and PDD time was characterized in the lab. ECB have two known pheromone strains (E and Z), and field-caught individuals were classified as Z strain by genotyping at a polymorphic Taq1α restriction endonuclease cleavage site in pgfar, the gene responsible for differences in pheromone components (99). PDD time was measured as the number of days for diapausing larvae to pupate after being placed in 16:8 LD and 26°C. Fast PDD individuals pupate < 19 days after exposure to these conditions while slow PDD individuals pupate after ≥ 39 days (24, 25, 29). For some long PDD individuals, we only had time to eclosion data (when adults emerged from puparium). The mean time ± SD from pupation to eclosion for BV was 9.9 ± 2.9 days (N = 107), so we conservatively estimated PDD time by subtracting 13 days from time to eclosion. Pennsylvania individuals were collected as adults in pheromone traps (100), and this population has been consistently phenotyped as fast PDD/bivoltine over a 15-year period (101, 102).
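The phenotyping rule described above can be written as a short Python sketch. The thresholds (< 19 days, ≥ 39 days) and the 13-day eclosion offset come from the text; the function names and the "intermediate" label are hypothetical, since the text defines only the fast and slow classes. Note that 13 days is close to the BV mean plus one SD (9.9 + 2.9 = 12.8), which may be why the subtraction is described as conservative.

```python
from typing import Optional

FAST_MAX_DAYS = 19         # fast PDD: pupation in fewer than 19 days
SLOW_MIN_DAYS = 39         # slow PDD: pupation after 39 days or more
ECLOSION_OFFSET_DAYS = 13  # subtracted when only time to eclosion is known


def pdd_time(days_to_pupation: Optional[float] = None,
             days_to_eclosion: Optional[float] = None) -> float:
    """Return PDD time in days, preferring the directly observed pupation time."""
    if days_to_pupation is not None:
        return days_to_pupation
    if days_to_eclosion is None:
        raise ValueError("need either days_to_pupation or days_to_eclosion")
    return days_to_eclosion - ECLOSION_OFFSET_DAYS


def classify_pdd(pdd_days: float) -> str:
    """Classify an individual as fast or slow PDD from its PDD time."""
    if pdd_days < FAST_MAX_DAYS:
        return "fast"
    if pdd_days >= SLOW_MIN_DAYS:
        return "slow"
    return "intermediate"  # hypothetical label for values between the two classes


# Example: an individual with eclosion data only, emerging after 55 days.
estimated = pdd_time(days_to_eclosion=55)
label = classify_pdd(estimated)
```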
For each population, samples were pooled using equal DNA quantities from each individual. DNA was extracted using the Qiagen DNeasy tissue protocol except tissues were not vortexed during isolation to preserve high molecular weight DNA. Pooled libraries were prepared using the Illumina TruSeq protocol (Illumina Inc., San Diego, CA). Libraries were sequenced on an Illumina HiSeq3000 at the Iowa State University DNA Facility using 150 bp paired-end sequencing and 2 libraries run per lane. Genomic reads were trimmed using Trimmomatic v.35 to remove Illumina adapters (TruSeq2 single-end or TruSeq3 paired-end), reads with a phred quality score (q) < 15 over a sliding window of 4, and reads < 36 bp long. Trimmed genomic data were aligned to the 454.7 Mb draft ECB genome (GenBank BioProject PRJNA534504; BioSample SAMN11491597; accession SWFO00000000), which consists of 8,843 scaffolds (N50 = 392.5 kb, largest scaffold = 3.32 Mb; BUSCO 3.0.2 (103) score = 93.1% complete out of 1066 genes from the arthropoda_odb9 gene set; Table S2). Prior to alignment, repetitive regions were masked by RepeatMasker (using Drosophila melanogaster TE library from repbase; http://www.repeatmasker.org/; accessed March 2017; (104)). Genomic scaffold chromosomal location was determined as described in the supplemental material. Alignment was done using Bowtie2 (105). Due to poor quality of some of the reverse mate libraries, the number of reads aligned was found to be higher when reads were aligned as single-end libraries (forward mate pairs, broken mate pairs). Filtration of low quality alignments and duplicates was performed using picard tools (http://picard.sourceforge.net).
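The read-trimming criteria above (sliding window of 4 bases, mean phred quality below 15, minimum length of 36 bp) can be illustrated with a minimal Python sketch. This is a simplified stand-in for the idea and not Trimmomatic's implementation, which differs in details and also handles adapters; the function names are hypothetical and Phred+33 quality encoding is assumed.

```python
from typing import List, Optional, Tuple


def phred33(quality_string: str) -> List[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(c) - 33 for c in quality_string]


def sliding_window_trim(seq: str, quals: List[int],
                        window: int = 4, min_mean_q: float = 15
                        ) -> Tuple[str, List[int]]:
    """Cut the read at the start of the first window whose mean quality
    falls below `min_mean_q`; return the read unchanged if none does."""
    for start in range(0, len(seq) - window + 1):
        win = quals[start:start + window]
        if sum(win) / window < min_mean_q:
            return seq[:start], quals[:start]
    return seq, quals


def trim_read(seq: str, quality_string: str, window: int = 4,
              min_mean_q: float = 15, min_len: int = 36
              ) -> Optional[Tuple[str, List[int]]]:
    """Quality-trim one read and drop it (return None) if it ends up
    shorter than `min_len` bases."""
    quals = phred33(quality_string)
    trimmed_seq, trimmed_quals = sliding_window_trim(seq, quals, window, min_mean_q)
    if len(trimmed_seq) < min_len:
        return None
    return trimmed_seq, trimmed_quals


# Toy read (hypothetical): 40 high-quality bases followed by 10 low-quality bases.
kept = trim_read("A" * 50, "I" * 40 + "#" * 10)
```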
Samtools was used to identify SNPs (106). Scripts from the Popoolation2 package (107, 108) were used to filter SNPs (removing SNPs near small indels, and those with rare minor alleles that did not appear twice in each population), calculate allele (read count) frequency of SNPs using a minimum coverage of 14 reads and a maximum coverage of 200, calculate FST over non-overlapping 1 kb windows (with >100 bp above minimum coverage in all populations), and perform CMH tests (see Supplemental Material). We used population read counts from Popoolation2 to test the association of alleles among our 5 populations while controlling for population demography using BayPASS 2.1 (39) and the standard (STD) model with PDD time (slow = 1, fast = −1) and d0yij = 6. Significantly associated alleles were defined as SNPs that had XtX above the 0.001% quantile of pseudo-observed data (POD) of simulated “neutral” loci (using simulate.baypass and mean read coverage for each population), BF > 20 dB (the difference between a model with and without PDD time included; with BF = 20 indicating “decisive” evidence in support of an association; (109)), and eBPis > 2 (which estimates how likely it is that the posterior distribution of β includes zero; equivalent to P < 0.01) (39, 40, 110). BayPASS was run separately for Z chromosome (14,724 SNPs; haploid pool sizes: EA = 50, GEN = 38, LA = 78, PY = 37, BV = 34) and autosomal loci (N = 577,412 SNPs; EA = 68, GEN = 50, LA = 78, PY = 52, BV = 40).
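The three-part significance rule described above can be summarized in a small Python sketch. It assumes that per-SNP XtX, BF (in dB), and eBPis values have already been produced, and that the XtX cutoff has already been derived from the POD simulation; the class, function names, and numeric values below are hypothetical and are not part of BayPASS.

```python
from dataclasses import dataclass
from math import log10
from typing import List


@dataclass
class SnpStats:
    snp_id: str
    xtx: float     # differentiation statistic
    bf_db: float   # Bayes factor in deciban units
    ebpis: float   # empirical Bayesian P-value on a -log10 scale


def bf_to_db(bayes_factor: float) -> float:
    """Convert a raw Bayes factor to deciban units: 10 * log10(BF)."""
    return 10 * log10(bayes_factor)


def ebpis_to_p(ebpis: float) -> float:
    """Approximate P-value equivalent of eBPis (eBPis = 2 gives 0.01)."""
    return 10 ** (-ebpis)


def is_associated(snp: SnpStats, xtx_threshold: float,
                  min_bf_db: float = 20.0, min_ebpis: float = 2.0) -> bool:
    """Apply all three criteria; every comparison is strict."""
    return (snp.xtx > xtx_threshold
            and snp.bf_db > min_bf_db
            and snp.ebpis > min_ebpis)


def filter_associated(snps: List[SnpStats], xtx_threshold: float) -> List[str]:
    """Return the IDs of SNPs that pass all three criteria."""
    return [s.snp_id for s in snps if is_associated(s, xtx_threshold)]


# Toy values (hypothetical); the XtX cutoff would come from the POD simulation.
snps = [SnpStats("snp1", 30.0, 25.0, 3.0), SnpStats("snp2", 30.0, 20.0, 3.0)]
hits = filter_associated(snps, xtx_threshold=20.0)
```

Requiring all three criteria jointly is stricter than any single cutoff, which fits the goal of retaining only alleles that are both unusually differentiated and associated with PDD time.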
Individual resequencing
To identify the specific polymorphisms associated with voltinism differences and calculate LD, individual re-sequencing was done for 18 slow PDD individuals (10 GEN, 4 BV, 4 PY) and 25 fast PDD individuals (14 EA, 11 LA). Individual libraries were prepared using the Illumina TruSeq protocol and were sequenced on an Illumina NextSeq using 150 bp paired-end sequencing at Cornell University. Trimmed genomic data were analyzed using the GATK best practices pipeline (111-113). Data were aligned to the draft reference ECB genome using BWA (106). Aligned reads were sorted and filtered using picard and samtools to remove duplicates and reads with a mapping quality score (Q) below 20. SNPs and small indels (< 50 bp) were called using GATK HaplotypeCaller (run in joint genotyping mode) after realigning around indels. Variants were filtered using recommended GATK hard filters (113). Larger structural variants (indels > 300 bp and inversions) were called from individual aligned bam files using information from split paired-end reads in Delly2 (114).
Linkage disequilibrium
LD was calculated after the phase of genotypes was imputed using Beagle 5.0 (115). Prior to LD calculation, phased genotypes were filtered to include only those located within genes and with MAF ≥ 0.25, and inter-scaffold r² was calculated in vcftools (116). We summarized r² over genes and performed bootstrapping analyses in R using the data.table, plyr, and boot packages (117-119). Plots were constructed using the ggplot2 and qqman packages (120, 121).
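As an illustration of the statistic being summarized, the following Python sketch computes r² between two biallelic loci from phased haplotypes and applies a minor allele frequency filter. It shows the standard definition of r² and is not the vcftools code path; the function names and toy haplotypes are hypothetical, with alleles coded 0/1.

```python
from typing import Sequence


def minor_allele_frequency(alleles: Sequence[int]) -> float:
    """Frequency of the rarer allele among 0/1-coded haplotypes."""
    p = sum(alleles) / len(alleles)
    return min(p, 1 - p)


def r_squared(locus_a: Sequence[int], locus_b: Sequence[int]) -> float:
    """r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), with D = pAB - pA * pB,
    computed over phased haplotypes listed in the same order at both loci."""
    if len(locus_a) != len(locus_b):
        raise ValueError("loci must be scored on the same haplotypes")
    n = len(locus_a)
    p_a = sum(locus_a) / n
    p_b = sum(locus_b) / n
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    d = p_ab - p_a * p_b
    return d * d / denom


def passes_maf(alleles: Sequence[int], min_maf: float = 0.25) -> bool:
    """MAF filter matching the threshold given in the text."""
    return minor_allele_frequency(alleles) >= min_maf


# Toy haplotypes (hypothetical): complete association between two loci.
site_1 = [0, 0, 1, 1]
site_2 = [0, 0, 1, 1]
ld = r_squared(site_1, site_2)
```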
We ran an association analysis on sequencing data from individual ECB samples. GATK allele calls for SNPs and small indels (< 50 bp) were combined with Delly2 variant calls of large indels and inversions using the CombineVariants function in GATK. We then analyzed the association of these polymorphisms with PDD time in PLINK 1.9 (48), with PDD time coded as a binary case/control phenotype (1 = fast, 2 = slow) and using Fisher’s exact test to detect significant differences in allele frequencies. P-values were FDR corrected genome-wide using the fdrtool package in R (122).
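The allelic test and multiple-testing step can be sketched in Python as follows. The two-sided Fisher's exact test on a 2×2 table of allele counts follows the standard hypergeometric definition. For the FDR step, the Benjamini-Hochberg adjustment is swapped in as a simple, widely known stand-in; it is not the procedure implemented in the R package cited above. Function names and counts are hypothetical.

```python
from math import comb
from typing import List, Sequence


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]],
    e.g. rows = fast/slow, columns = reference/alternate allele counts.
    Sums the probabilities of all tables as or less likely than observed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy allele counts (hypothetical): one allele fixed in each phenotype class.
p = fisher_exact_two_sided(3, 0, 0, 3)
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```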
Circadian activity
To measure the endogenous circadian clock, we used laboratory colonies from BV (slow-PDD) and a colony from a fast-PDD population collected near Geneva, NY, raised in the lab at 16:8 LD and 26°C (25, 28, 34, 35, 38). After pupation, male pupae were transferred to tubes within activity monitors in free-running conditions (total darkness; DD) at 26°C. Activity was measured using a Trikinetics activity monitor (model LAM25, Waltham, MA) from the first day of adult eclosion. Sixteen individuals of each type were measured in each of two replicates, for a total of 32 individuals per PDD type. Data were analyzed using custom MATLAB toolboxes (123).
Author contributions
G.M.K. and E.B.D. designed and performed research, analyzed data, and wrote the paper. B.S.C. contributed population genomic and genome sequencing. C.B.W. and S.C.K. performed primer design, DNA isolation, PDD phenotyping. S.M.B. performed primer design, amplicon and individual resequencing library preparation. R.G.H. assisted with research design.
Data deposition
GenBank (sequencing data)
ECB Genome: BioProject PRJNA534504; BioSample SAMN11491597; accession SWFO00000000
Pool-seq: BioProject PRJNA540655 (BV, GEN, EA, PY); BioProject PRJNA361472 (LA: SRX249882)
Indiv-seq: BioProject PRJNA540833
ACKNOWLEDGEMENTS
Gabriel Golczer and Erastus Thuo assisted with DNA isolations. Henry Kunerth and Ben Hamilton assisted with sample acquisition. We thank F. Rob Jackson, Mary Roberts, and Jasper and Rigel Hatch Dopman for help conducting circadian activity experiments. This research was funded by the National Science Foundation (DEB-1257251 to E.B.D.; DEB-1256688 to R.G.H.), Tufts University, a Tufts University Faculty Research Award (E.B.D.), and cooperative agreement 58-5030-7-066 between the United States Department of Agriculture, Agricultural Research Service (USDA-ARS), and Tufts University. Funding was also received from USDA-ARS Project CRIS-5030-22000-018-00D, USDA-ARS Project CRIS-3625-22000-017-00, and the Iowa Agriculture and Home Economics Experiment Station, Ames, IA, Project 3543. This article reports the results of research only, and any mention of products or services does not constitute an endorsement by USDA-ARS. USDA-ARS is an equal opportunity employer and provider. While performing this research, S.C.K. was partially funded by the Tufts University Summer Scholars program and C.B.W. by a National Science Foundation Graduate Research Fellowship (2011-116050).
Footnotes
4 Posthumous