Integrating top-down and bottom-up approaches to understand the genetic architecture of speciation across a monkeyflower hybrid zone

Understanding the phenotypic and genetic architecture of reproductive isolation is a longstanding goal of speciation research. In many systems, candidate barrier traits and loci have been identified, but causal connections between them are rarely made. In this study, we combine ‘top-down’ and ‘bottom-up’ approaches with demographic modeling toward an integrated understanding of speciation across a monkeyflower hybrid zone. Previous work in this system suggests that pollinator-mediated reproductive isolation is a primary barrier to gene flow between two divergent red- and yellow-flowered ecotypes of Mimulus aurantiacus. Several candidate floral traits contributing to pollinator isolation have been identified, including a difference in flower color, which is caused primarily by a single large-effect locus (MaMyb2). Other anonymous SNP loci, potentially contributing to pollinator isolation, also have been identified, but their causal relationships remain untested. Here, we performed demographic analyses, which indicate that this hybrid zone formed by secondary contact, but that subsequent gene flow was restricted in a large fraction of the genome by barrier loci. Using a cline-based genome scan (our bottom-up approach), we demonstrate that candidate barrier loci are broadly distributed across the genome, rather than mapping to one or a few ‘islands of speciation.’ A QTL analysis (our top-down approach) revealed most floral traits are highly polygenic, with little evidence that QTL co-localize, indicating that most traits are largely genetically independent. Finally, we find little convincing evidence for the overlap of QTL and candidate barrier loci, suggesting that some loci contribute to other forms of reproductive isolation. Our findings highlight the challenges of understanding the genetic architecture of reproductive isolation and reveal that barriers to gene flow aside from pollinator isolation may play an important role in this system.


43
Understanding the phenotypic and genetic architecture of reproductive isolation is a 44 major goal of modern speciation research [1][2][3]. Early studies took a 'top-down' approach by 45 using quantitative trait locus (QTL) mapping and other association methods to detect genomic 46 regions controlling barrier phenotypes or genetic incompatibilities [4][5][6]. More recently, 'bottom-47 up' approaches, such as genome scans of genomic differentiation (e.g., FST) or admixture (e.g., 48 fd), have identified candidate barrier loci in numerous systems, including those where isolation is 49 thought to result from ecologically-based divergent selection or intrinsic incompatibilities [7][8][9][10]. 50 Although both approaches have clear strengths, they also present significant challenges 51 [11]. Top-down methods require that traits involved in reproductive isolation have already been 52 identified, so our understanding of the genetic architecture of speciation can only ever be as 53 complete as our knowledge of the traits controlling reproductive isolation in the system. In 54 contrast, bottom-up approaches can provide a comprehensive view of the genomic landscape of 55 speciation without complete knowledge of the isolating traits (but see [3,12]). However, even 56 though candidate barrier loci can be identified, their causal relationship with previously 57 identified barrier traits usually remains unclear. This is because speciation usually involves many 58 different isolating barriers (e.g., pre-and post-zygotic, extrinsic and intrinsic) [13,14] that can 59 become coupled together through different aspects of the speciation process [15][16][17]. Although barrier that is less conspicuous or that has yet to be discovered. County. The size of the circles shows variation in the sample sizes, which range from 4 to 18 148 individuals, totaling 292 individuals. The dashed line indicates the center of the hybrid zone, 149 previously inferred from spatial variation in the frequency of alternative alleles at the MaMyb2 locus. 150 (bottom) Clines in allele frequency at the MaMyb2 locus (red circles) and the mean floral trait PC1 151 score (blue squares) across the one-dimensional transect. The solid and dashed lines are the ML 152 sigmoid cline models for MaMyb2 allele frequency and trait PC1 score, respectively. The gray 153 shaded rectangle represents the width of the hybrid zone. 154 155

156
RAD sequencing, read filtering, and SNP calling have QTL i assigned to them. We calculated the mean number of overlaps per QTL for 9,999 280 random datasets and estimated a p-value for the observed value as the number of permuted 281 datasets where n overlaps/n QTL was equal to or greater than the observed estimate + 1/number 282 of permutations +1.

284
Test for overlap of QTL and outlier windows 285 We also used a permutation test to determine if QTL regions were enriched for outliers 286 identified in our cline and FCT based genome scans. We first counted the observed number of 287 outlier windows within the empirical QTL intervals. This was performed for both cs and FCT 288 outliers, defined using two different cutoffs (top 1% and 5% of the empirical distributions). To 289 determine if these counts were significantly different from chance, we produced 9,999 datasets 290 where the genomic position of outlier windows was randomized, and we counted the number of 291 outliers falling inside the empirical QTL intervals. A p-value for the estimate was calculated as 292 described above.

295
Evidence for a history of heterogeneous gene flow 296 The results of our demographic modeling support a history of gene flow across the range 297 of the ecotypes. First, demographic models that included contemporary gene flow were far better 298 at recreating the observed JSFS than a model of divergence without gene flow (i.e., strict 299 isolation; -ΔAIC = 3938), or the best model of ancient migration, which included historical but 300 not contemporary gene flow (-ΔAIC = 1157) ( Fig. 2; Fig. S4). Second, models that included 301 heterogeneous migration across the genome (2m) were always strongly favored over the 302 equivalent models, where gene flow was modeled with a single rate (Fig. 2). Third, the SC and 303 PSC models, which included periods of allopatry and secondary contact, were strongly favored 304 over the IM model, where divergence occurred without a period of geographic isolation. The 305 best-fitting model was the SC2m model, indicating that divergence of the red and yellow 306 ecotypes included a period of allopatry followed by gene flow upon secondary contact (Fig. 2). molecular markers (Fig. 1). In addition, consistent with the increased variance we observed in 362 multiple phenotypic traits in hybrid populations [33], the standard deviation of ancestry scores is 363 higher in sample sites close to the cline center, thus providing genomic evidence for 364 hybridization (Fig. 3a). Our demographic analyses provide new evidence that this hybrid zone formed from 511 secondary contact after a long period of isolation. Indeed, both models that included periods of 512 geographic isolation (SC and PSC) were a better fit to the data than models of continuous gene 513 flow (IM). The parameter estimates for the preferred model (SC2m) indicate a relatively long 514 period of isolation, followed by a period of secondary contact that began roughly 1,800 515 generations ago. It is important to note that we fit relatively simple models to the data that 516 excluded changes in population size in the ancestral and daughter populations and variation in Ne 517 along the genome. Recent work has shown that failure to model key parameters can result in 518 incorrect inference under some circumstances [53]. Although more sophisticated modelling may 519 arrive at different conclusions in the future, these results point to a secondary origin of this 520 hybrid zone.

521
In terms of the main goal of our paper, the key result of the demographic analysis was 522 that all models with two rates of migration (2m models) fit the data better than those where 523 migration was modelled at a single rate, indicating a heterogeneous pattern of gene flow across 524 the genome. Moreover, the estimated parameters for the preferred model suggest that roughly 525 one third of loci (37%) have experienced migration at substantially reduced rate compared with 526 non-barrier loci. This result supports previous conclusions that candidate barrier traits and loci 527 are indeed impacted by natural selection [24], further motivating the need to connect the 528 phenotypic and genetic architecture of RI in more detail.

530
visible to selection, but tight linkage and pleiotropy enhance the coupling of different sets of 590 adaptive traits, meaning that they can remain associated despite gene flow [17]. This begs the 591 question: why do we see so few large effect loci and such little overlap among floral trait QTL in 592 the red and yellow ecotypes? One potential explanation is that divergence was initiated during a 593 period of geographic isolation-a hypothesis that is supported by our demographic analysis. If 594 secondary contact, but gene flow following contact is now heterogeneous across the genome due 669 to the effect of multiple barrier loci. A cline-based genome scan indicates that candidate barrier 670 loci are widespread across the genome, rather than being associated with one or a few 'islands' 671 of speciation. A QTL analysis of floral traits identified many QTL of small effect, with limited 672 co-localization among QTL for different traits. We found limited evidence that QTL and 673 candidate barrier loci overlap, suggesting that other barriers to gene flow aside from pollinator 674 isolation may contribute to speciation.

675
In addition to providing knowledge about this system, our study has important 676 implications for efforts to understand the phenotypic and genetic architecture of isolating  Table S1. Location information and samples sizes for the sample sites used in the study.

937
Population codes correspond with those in Figure 1.   Table S1. Location information and samples sizes for the sample sites used in the study.
Population codes correspond with those in Figure 1.  In addition to the size-related traits, we also measured anthocyanin and carotenoid pigment levels. Anthocyanins were extracted in 1% acidic methanol from a single disc collected from one of the top petals the first day each flower opened. Absorbance of extracts was measured with a spectrophotometer at 520 nm, as described previously (Streisfeld and Rausher 2009). Carotenoids were extracted in hexane following a similar protocol, and absorbance was measured at 450 nm. LG1 LG2 LG3 LG4 LG5 LG6 LG7 LG8 LG9 LG10

Corolla Width
Chromosome 1 2 3 4 5 6 7 8 9 10  Figure S9 continued.   Figure S10. QTL effect plots for each of the 26 QTL discovered in the analysis. Each plot shows the mean (circle) and standard deviation (crosses) for each trait plotted against the three genotypes. RR, both alleles inherited from a red grandparent; YY, both alleles inherited form a yellow grandparent; RY, Heterozygous. Red lines indicate that the effect occurs in the opposite direction of that inferred by the trait differences measured in the red and yellow grandparents (see Fig. S8).    This supplement outlines the rationale behind the calculation of the cline similarity score, which describes the relative shape and position of a cline, relative to some other cline of interest. In our case, we wanted to be able to view the variation in multiple parameters that describe different features of a geographic cline in an integrated way, so we could do things, like plot clinal variation along a genome.
Consider a simple, sigmoid cline, like the one shown in Figure A. This cline model, which is the most commonly used in empirical studies, is described by 4 fitted parameters: the cline centre (c), cline width (w), the mean trait value on the 'high' side of the cline (Qmax), and the mean trait value on the 'low' side of the cline (Qmin). The last 2 parameters can be reduced to a single parameter, ΔQ = Qmax -Qmin, which quantifies the total change in a trait across the transect. It is worth noting that we use the notation ΔQ to denote change in ancestry scores across the cline, rather than Δz, which is typically used for a quantitative trait (or Δp for allele frequency). Each parameter tells us something about the feature of the cline and is relevant to inferences that we might make about selection in a hybrid zone (provided that we are happy to make some assumptions). For example, the width of the cline is inversely proportional to the strength of selection acting on the trait; the centre tells us about the spatial pattern of selection across the hybrid zone; and the change in the trait value across the cline tells us about the strength of the difference in the trait across the cline. In a situation where selected and neutral loci are at equilibrium, the above cline would indicate that variation in the trait is maintained by selection. The problem is that inferences cannot easily be made from a single cline parameter. This is highlighted in Figure B, where several clines are shown, with certain parameters fixed and others varying. The left plot shows three clines with the same values of w and c, but substantially different values of Δz; The middle plot shows multiple clines with the same values of Δz and c, but with different values of w; the third shows clines with identical shapes (i.e., same Δz and w), but different values of c. In the above plots, we can easily evaluate each cline in terms of its fit to some hypothesis by visually inspecting them. However, when we have many clines, this is impractical and highly subjective. But these plots clearly show that we cannot infer much by looking at the distribution of a single parameter, especially in real data where all parameters will vary simultaneously.

The 'cline similarity' score
To help simplify the process of identifying clines of interest, we created an ad-hoc statistic that we call the 'cline similarity' or 'cs' score. The cs score summarises cline shape and relative position with a single number that can be calculated from the ML cline parameters. cs is calculated as: Cline similarity, cs =! ∆# $%& ' ( (*(|,/&|)) / (eq. 1).
Where ΔQ w and c, are defined as above and l is a scaling factor that will be explained further below. Eq. 1 can be broken down into two terms that determine the cs score in different ways. The first term is a shape score, that describes the shape of a cline, independent of the location of the cline centre: Shape score = ! ∆# $%& ' (eq. 2).
The second term is a centre penalisation score, which downgrades the shape score based on the value of the fitted cline centre relative to some point of reference (explained further below). These formulae and the decisions made to arrive at them will next be explained by walking through its calculation for the Mimulus dataset.

Application to the Mimulus dataset
For the Mimulus dataset, we fit clines to ancestry scores in 2,173 non-overlapping windows, each containing 100 SNPs. The details of how the fits were done are described in the main text, but the clines can be fitted in any way. We only need the parameters for the fitted model to calculate cs.
Starting with the shape score, we wanted a quantity that described variation in the shape of the cline. Ultimately, we wanted clines with a small w and large ΔQ to have high shape scores, and clines with very low ΔQ and high w to have a low shape scores. Figure C shows the joint distribution of ΔQ and w for all of the windows. The numbers in the plot correspond to the cline fits for 10 selected windows, which are shown on the right side of the plot. The orange star (cline 2), is not for a specific window, but rather is the fit to mean Q score for each location, averaging across all windows (i.e., it is a cline fitted to the mean of means for each location). The green star shows the location of the genome-wide cline in this space (i.e., the one shown in figure 3a of the main text, and figure D (ii)). * interest, because ΔQ is always very small. The ones in the top right are more interesting, because they show high ΔQ but are very wide; these clines could be generated by processes like isolation by distance, or the collapse of secondary clines when barriers to gene flow are porous.
Figure D(i) shows the same plot, but with the points coloured by the shape score (10 coloured bins, but the values are continuous), as given in Eq. 2. Here we define l, the scaling parameter, as 0.5t, where t is the length of the transect. l determines precisely how variation in the shape scores is spread out across the joint distribution of ΔQ and w. In our case, 0.5t, gives a distribution shape scores that suits our purposes, but other values might be more suitable for other study systems. By colouring the cline models for each window by their shape scores, as shown in Figure  D(ii), we can see that clines with 'hotter' colours do have shapes more similar to the genome wide cline (the dashed cyan line) than 'cooler' colours. However, it is also clear that some clines with high shape scores have centres that are displaced a very long way from the genome wide cline. Some of these have been coloured solid yellow to help highlight them.
Although these clines are interesting for a variety of reasons, they are not spatially associated with the centre of the hybrid zone that also coincides with the transition in floral trait differences. Ideally, we would like our summary statistic to also reflect where the cline is positioned in relation to this location. To do this, we penalise the shape score based on the position of the cline centre, relative to a position of interest, which needs to be given position 0. This could be a feature of the environment or a cline in a focal marker or trait. In our case, the position of interest is the centre of the genome-wide ancestry cline.
Figure E(i) shows how the value of the inferred centre informs the centre penalisation function outlined in Eq. 3. If the fitted cline centre coincides exactly with the point of interest (c = 0), the cs score for that cline is simply the shape score (i.e., cs = shape score). However, as the difference between fitted centre and point of interests increases (i.e., |0 -c|), then the shape score is downgraded according to the centre penalization function, resulting in a cs score lower than the shape score. Figure E. Penalisation of the shape score to give the final cs score. (i) The effect of the cline centre penalisation function on the shape score. Here, an arbitrary shape score of 1 is used for illustrative purposes. (ii) The cline models for all windows after the cline penalisation score, coloured by the overall cs score.
The results of this final transformation step can be seen in figure 3c of the main text, which is reproduced in Figure E(ii) for convenience. In that figure, the cs scores are scaled between 0 and 1, where 0 is the cline with the lowest cs score, and 1 is the cs score for the genome-wide cline.

Some afterthoughts (if others attempt something similar)
This was done 5 years ago and if I was doing this again from scratch, I would probably think about it a bit differently. First, I think it might be useful to try and incorporate the variance explained by the cline model into the cs score. i.e., if the data around the cline are extremely noisy (common for clines with a small ΔQ, e.g., cline 1 in Figure C ), we might conclude that these are less likely to be of interest. The variance explained was used primarily in Westram et al. 2017 and seemed to be good at identifying non-neutral clines in simulations.