Abstract
Major histocompatibility complex (MHC) genes encode proteins that play a central role in vertebrates’ adaptive immunity to parasites. MHC loci are among the most polymorphic in vertebrates’ genomes, inspiring many studies to identify evolutionary processes driving MHC polymorphism within populations, and divergence between populations. Leading hypotheses include balancing selection favoring rare alleles within populations, and spatially divergent selection. These hypotheses do not always produce diagnosably distinct predictions, causing many studies of MHC to yield inconsistent or ambiguous results. We suggest a novel strategy to distinguish balancing versus divergent selection on MHC, taking advantage of natural admixture between parapatric populations. With divergent selection, immigrant alleles will be more infected and less fit because they are susceptible to novel parasites in their new habitat. With balancing selection, locally-rare immigrant alleles will be more fit (less infected). We tested these contrasting predictions using threespine stickleback from three replicate pairs of parapatric lake and stream habitats. We found numerous positive and negative associations between particular MHC IIβ alleles and particular parasite taxa. A few allele-parasite comparisons supported balancing selection, while others supported divergent selection between habitats. However, there was no overall tendency for fish with immigrant MHC alleles to be more or less heavily infected. Instead, locally rare MHC alleles (not necessarily immigrants) were associated with heavier infections. Our results illustrate the complex relationship between MHC IIβ allelic variation and spatially varying multi-species parasite communities: different hypotheses may be concurrently true for different allele-parasite combinations.
Introduction
MHC class II loci, which aid in the recognition of extracellular parasites, are among the most polymorphic loci in vertebrates’ genomes (Figueroa et al. 1988). Evolutionary biologists have long sought to elucidate the evolutionary processes that maintain the exceptional diversity of MHC within and among populations. Most studies have focused on documenting parasite-mediated selection on MHC given its role in immunity. Parasite-derived proteins (antigens) are collected and fragmented by antigen presenting cells (Roche & Furuta 2015). MHC proteins bind to certain antigen sequences, and export these to the cell surface for presentation to T-cells, which may then initiate an immune response. MHC II β chains with different peptide binding region sequences enable recognition of different parasite antigens (Eizaguirre & Lenz 2010; Hedrick 2002). Accordingly, MHC polymorphism contributes to variation in resistance to parasites including pathogenic and symbiotic bacteria (Bolnick et al. 2014; Kubinak et al. 2015; Lohm et al. 2002), viruses (Thursz et al. 1995), protozoa (Hill et al. 1991; Sinigaglia et al. 1988; Wedekind et al. 2006), helminths (Paterson et al. 1998), fungi (Savage & Zamudio 2011), and even contagious cancers (Siddle et al. 2010). Despite these and many other studies, it remains unclear how MHC polymorphism is sustained. The leading hypotheses invoke balancing selection within populations, or divergent selection among populations, each of which has received mixed support (Bernatchez & Landry 2003; Piertney & Oliver 2006; Tobler et al. 2014; Yasukochi & Satta 2013).
Balancing selection occurs when rare alleles gain an inherent fitness advantage over common alleles, preventing their loss and maintaining allelic diversity (Takahata & Nei 1990; Takahata et al. 1992). Balancing selection can result from heterozygote advantage because individuals carrying more diverse MHC alleles recognize and resist more diverse parasites (Doherty & Zinkernagel 1975; Oliver et al. 2009), thanks to co-dominance (Lohm et al. 2002). Because rare alleles tend to occur in heterozygotes, they increase fitness and are protected from loss (Wegner et al. 2003). Alternatively, balancing selection can result from negative frequency-dependent selection. Parasites evolve strategies to exploit locally common host genotypes, such as evading detection by locally common MHC alleles (Slade & McCallum 1992). Because rare alleles do not provoke parasite counter-evolution, they may be more effective at detecting and protecting against local parasites (Muirhead 2001; Schierup et al. 2000).
Divergent natural selection (divergent selection) is also widely invoked to explain MHC diversity (Hedrick 2002; Hill et al. 1991; Meyer & Thomson 2001). Parasite communities often differ among host populations, favoring different MHC alleles in different locations and driving between-population divergence but undermining local polymorphism. Many studies have invoked divergent selection on MHC to explain allele frequency differences between populations with different parasites (e.g., Copley et al. 2007; Matthews et al. 2010; Pavey et al. 2013). However, many studies do not formally test the null hypothesis that MHC divergence is neutral and unrelated to parasitism (Miller et al. 2010). Those that do consider neutrality often find mixed results: MHC divergence is sometimes greater than, less than, or equal to divergence at neutral genetic markers (Lamaze et al. 2014; Mona et al. 2008; Schwensow et al. 2007; Sutton et al. 2011). An alternative approach to test divergent selection is to evaluate whether different MHC alleles confer protection in different populations, using spatial variation in MHC-parasite associations to argue for divergent selection (e.g., Eizaguirre et al. 2012a; Loiseau et al. 2009). Comparatively few studies have used experimental transplants or infections to test for local adaptation at MHC loci (Eizaguirre et al. 2012a, b; Evans et al. 2010), and some of these have yielded negative results (Rauch et al. 2006).
Unfortunately, divergent and balancing selection may be difficult to distinguish because in certain contexts they can result in similar patterns, as pointed out by Spurgin and Richardson (2010) and more recently by Tobler et al. (2014). Both heterozygote advantage and negative frequency-dependent selection can lead to fluctuating allele frequencies through time (Slade & McCallum 1992). If these allele frequency fluctuations are asynchronous across host populations (Gandon 2002), then populations will be genetically divergent. Experimental transplants between such populations may (transiently) yield signals that appear to support divergent selection, even though MHC divergence arose from balancing selection within populations. Still more problematic, balancing and divergent selection are not mutually exclusive phenomena. Balancing selection may act within populations (driven by some parasites), while other parasites generate divergent selection favoring differences between populations. Simultaneous balancing and divergent selection may obscure each force’s effect on within-population diversity and between-population divergence. Lastly, the majority of studies using MHC-parasite associations to test for selection have focused on one parasite species at a time. This inevitably yields an incomplete picture of the selective forces shaping diversity at MHC, especially because different parasites may drive different kinds of selection. Consequently, tests for balancing selection and divergent selection have yielded mixed evidence (Eizaguirre & Lenz 2010; Spurgin & Richardson 2010; Yasukochi & Satta 2013).
Balancing versus divergent selection in parapatry
In certain settings, balancing and divergent natural selection can lead to unique and thus testable outcomes. In particular, we suggest they can be distinguished in parapatric populations that actively exchange migrants but experience distinct parasite communities (Fig. 1), by estimating three parameters. δm measures how strongly an allele m is enriched in a focal habitat. θp measures how strongly a parasite taxon p is enriched in a focal habitat. βmp measures the association between allele m and parasite p; negative values imply that the presence of the allele coincides with lower parasite abundance (Fig. 1A).
We calculate an effect size and direction (βmp) for all pairwise associations between MHC allele m versus parasite taxon p, within a given habitat (lake or stream). Then, we test whether βmp depends on the extent to which allele m is lake- or stream-biased (δm), and the parasite is lake- or stream-biased (θp). We expect that βmp covaries with δm, but the sign of this trend is opposite within the lake versus stream samples. With divergent selection, each population will contain locally common alleles that confer protection against locally common parasites, whereas immigrants will tend to be susceptible to unfamiliar parasites. For example, alleles that are particularly common in the lake (δm>0) should confer protection (βmp<0) against lake-specific parasites (θp>0), but susceptibility (βmp>0) against parasites in the stream (θp<0). In contrast, balancing selection favors rare alleles, so immigrant alleles should benefit. Stream-biased MHC alleles (δm<0) that migrate into the neighboring lake should be rare and confer resistance (βmp<0) to lake-specific parasites (θp>0). Therefore, both divergent and balancing selection should produce a θp*δm interaction effect on βmp, but the direction of this interaction depends on the form of selection.
Divergent selection will tend to increase the abundance of an allele in the habitat where it confers a protective benefit (βmp<0), or decrease the allele in a habitat where it confers susceptibility (βmp>0). Consequently, alleles that are strongly enriched in a particular habitat (δm) should tend to be protective (βmp<0) against parasites enriched in that same habitat (θp>0). In the context of our study system (lake and stream populations of threespine stickleback, details below), this means that the more lake-biased alleles should protect against lake-biased parasites (and be susceptible to stream-biased parasites). Conversely, stream-biased alleles should protect against stream-biased parasites and be susceptible to typical lake parasites (Fig. 1B).
Balancing selection will tend to favor alleles that are locally rare, which in parapatric settings includes immigrants. Namely, when there is balancing selection we expect that alleles enriched in a particular habitat (relative to the neighboring habitat) will be particularly susceptible to parasites from that habitat (βmp>>0; Fig. 1C). In contrast, alleles that are scarce in a focal habitat will tend to protect against the local parasites (Muirhead 2001; Schierup et al. 2000). These locally rare alleles could be new mutations or (more frequently in a parapatric setting) immigrants (Lamaze et al. 2014). In this regard, balancing selection resembles local maladaptation, the diametric opposite of expectations for divergent selection.
Thus, divergent and balancing selection make opposite predictions regarding the sign of the correlation between δm (the extent to which an allele is habitat-specific) and βmp (the allele’s effect on parasites), for habitat-biased parasites (θp; Fig. 1). Do endemic macroparasites disproportionately infect hosts with locally-enriched alleles (implying balancing selection) or locally-depleted alleles (implying divergent selection)? To test these predictions we rely on natural migrants between populations, which add rare genetic variants that are either beneficial (balancing selection), or deleterious (divergent selection). Of course, both selective forces might act concurrently, for instance if certain parasites select against immigrants while other parasites select against locally common alleles. Therefore, any test of these alternative predictions should take into account the full set of MHC alleles, and all common parasites within each population.
Some previous studies have used a related approach, testing whether MHC alleles confer protection or susceptibility to different parasites in different habitats (e.g., estimating βmp) (Tobler et al. 2014). However, a key element of our approach is that the sign and strength of MHC-parasite associations (βmp) will depend on the extent of between-population differences in parasite and allele frequencies (θp and δm). To our knowledge, previous studies of MHC adaptation have not tested for an interactive effect of θp and δm on βmp (parasite-habitat and allele-habitat biases jointly affecting the parasite-allele association). Here, we use this novel approach to test for signatures of balancing or divergent selection in connected (parapatric) lake and stream populations of threespine stickleback (Gasterosteus aculeatus).
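The opposing predictions can be made concrete with a short sketch. The function name, the `model` labels, and the coding of the output as -1 or +1 are illustrative assumptions; only the sign conventions for δm, θp, and βmp come from the text.

```python
def predicted_sign(delta_m, theta_p, model):
    """Expected sign of beta_mp for one allele/parasite combination.

    delta_m > 0 means a lake-biased allele, theta_p > 0 a lake-biased parasite.
    Returns -1 (allele expected to be protective), +1 (allele expected to
    confer susceptibility), or 0 when either bias is zero (no prediction).
    The function name and model labels are hypothetical.
    """
    if delta_m == 0 or theta_p == 0:
        return 0
    # True when allele and parasite are enriched in the same habitat
    same_habitat = (delta_m > 0) == (theta_p > 0)
    if model == "divergent":
        # locally enriched alleles protect against locally enriched parasites
        return -1 if same_habitat else 1
    if model == "balancing":
        # locally enriched alleles are exploited; rare immigrant alleles protect
        return 1 if same_habitat else -1
    raise ValueError("model must be 'divergent' or 'balancing'")


# A lake-biased allele facing a lake-biased parasite
print(predicted_sign(1.2, 0.8, "divergent"))   # -1
print(predicted_sign(1.2, 0.8, "balancing"))   # 1
```

For every combination with non-zero biases the two models return opposite signs, which is why the sign of the δm*θp interaction can discriminate between them.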
Study system: threespine stickleback
Genetic sequencing suggests that threespine stickleback have between 4 and 6 functional MHC class IIβ loci in their genome (Reusch et al. 2004; Reusch & Langefors 2005; Sato et al. 1998), though this may vary between individuals (Reusch & Langefors 2005). Expression analysis indicates that all putative MHC class IIβ loci are typically expressed (Reusch et al. 2004).
Prior studies of stickleback have provided evidence for balancing or divergent selection on MHC IIβ. Balancing selection is supported by several observations. Individuals with an intermediate number of alleles are more resistant to infection (Kurtz et al. 2004; Wegner et al. 2004), harbor fewer parasites (Wegner et al. 2003), build better-quality nests (Jager et al. 2007), survive better (McCairns et al. 2011; Wegner et al. 2008), and attain higher lifetime reproductive success (Kalbe et al. 2009). Divergent selection is supported because MHC IIβ alleles differ between (i) co-occurring benthic and limnetic stickleback species pairs (Matthews et al. 2010), (ii) closely parapatric estuarine stickleback in Quebec (McCairns et al. 2011), and (iii) lake and river stickleback from northern Germany (Rauch et al. 2006; Reusch et al. 2001). The German lake/river system has been used for experimental tests of divergent selection on MHC IIβ. Lab-bred F2 lake-stream hybrids placed into field mesocosms gained more weight if they had local MHC alleles, but non-native MHC genotypes were not systematically more infected (Eizaguirre et al. 2012a). An earlier F2 hybrid transplant experiment found that genomic background but not MHC genotype explained habitat-specific infection rates (Rauch et al. 2006).
Here, we present a simultaneous test of both balancing and divergent natural selection on stickleback MHC IIβ in three replicate lake-stream pairs of stickleback. We first document between-habitat differences in parasite composition (θp) and MHC genotypes (δm). Then, for each of the three pairs, we test for associations between each MHC IIβ allele and each parasite taxon (βmp) within a multispecies parasite community. Lastly, we test whether allele-parasite associations covary positively or negatively with habitat differences in allele and parasite frequencies (Fig. 1). Specifically, we test whether locally common parasites disproportionately infect fish carrying locally-enriched alleles, or locally-rare immigrant alleles.
Methods
Collections
In July 2007, we sampled threespine stickleback from three lakes on northern Vancouver Island, British Columbia (Roberts Lake, Farewell Lake, and Comida Lake) and their corresponding outlet streams (three ‘lake-stream pairs’, Fig. 2). Most lake-stream pairs on Vancouver Island evolved independently in situ, after marine stickleback colonized freshwater after Pleistocene deglaciation (Clague & James 2002; Hendry et al. 2013; Stuart et al. In review).
We collected adult stickleback using unbaited minnow traps (0.5-cm gauge). We placed traps haphazardly along the shoreline of each lake (< 3m depth) within 350 meters of the outlet stream, and 5 traps at each of multiple locations along each lake’s outlet stream (Table 1, Fig. 2). Stream samples spanned the genetic clinal transition from lake- to stream-genotypes (Berner et al. 2009; Weber et al. 2017). Upon capture, fish were immediately euthanized in MS-222. Caudal fin clips were taken and preserved in 90% ethanol for later DNA extraction. Fish were preserved in 10% neutral buffered formalin. Collection and animal handling were approved by the University of Texas Institutional Animal Use and Care Committee (Protocol # 07-032201), and a Scientific Fish Collection Permit from the Ministry of the Environment of British Columbia (NA07-32612).
Approximate locations of each lake-stream pair on the island are indicated by their respective letters (A: Farewell, B: Roberts, C: Comida). Arrows indicate separate sampling locations within each pair. Separate scale bars are provided for the entire island and each pair individually.
Parasite load
Each fish was exhaustively screened to enumerate macro-parasites (helminths, crustaceans, molluscs, and microsporida) visible under a standard dissection microscope. This included scans of the outer body (i.e. skin and bony armour structures), mouth and gills, interior body cavity including all organs (liver, swim bladder, gonads), the interior of the intestinal tract (stomach and intestine), and the eyes (interior and exterior). Only the gills on the right (but not left) side of the fish were scanned for parasites, as the common gill parasites (Thersitina sp. and Unionidae glochidia) were present at very high abundances on both left and right gills. All parasites were identified to the lowest possible taxonomic unit (genus in most cases).
Analysis of habitat effect on infection
To determine whether parasite abundance differed between lake and stream habitats, we first fit hierarchical generalized linear models separately for each lake stream pair. The GLMs used (additively) overdispersed-Poisson distributions to model each parasite taxon’s abundance in individual fish. The basic form of each model was:
where yi is the abundance of a focal parasite taxon in individual i. The term αj denotes a habitat-specific intercept where j=lake, or stream. The vector β includes the regression parameters β1 through β5 which indicate, respectively, the means for lake (β1) and stream (β2) habitat, the covariate effect of fish standard length of lake fish (β3) and stream fish (β4), and a coefficient for sex (β5). Random effects (αi) associated with each sampled stream site (i.e. 100m, 200m, etc.), are modeled as a normal random variable with mean equal to zero and standard deviation σj. The error terms εi that account for overdispersion in the abundance data were also modeled as normal random variables with mean equal to zero and standard deviation σi. Sex was centered at zero, and length was centered at zero prior to fitting the models (Gelman & Hill 2006). Thus, the models explicitly account for the effects of sex, size, and heterogeneity among sampling locations (e.g., within-stream clines) and sample sizes when estimating mean abundances within each habitat (β1 and β2).
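One way to write a model consistent with these definitions is sketched below. It is a reconstruction under stated assumptions rather than the exact fitted specification: the indicator variables L_i and S_i (lake and stream membership), the covariate names, the log link, and the indexing of site effects as α_{j[i]} are our notation.

```latex
\begin{aligned}
y_i &\sim \mathrm{Poisson}(\lambda_i),\\
\log \lambda_i &= \beta_1 L_i + \beta_2 S_i
  + \beta_3 L_i\,\mathrm{length}_i + \beta_4 S_i\,\mathrm{length}_i
  + \beta_5\,\mathrm{sex}_i + \alpha_{j[i]} + \varepsilon_i,\\
\alpha_j &\sim \mathcal{N}(0, \sigma_j^2), \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2).
\end{aligned}
```

Here the individual-level term ε_i on the log scale supplies the additive overdispersion, and α_{j[i]} is the effect of the sampling site at which fish i was caught.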
All parameters were estimated by drawing 1000 samples from their joint posterior distributions using the Markov Chain Monte Carlo (MCMC) algorithm implemented in the MCMCglmm package (Hadfield 2010) in R version 3.2.1. Weakly informative normal priors with a scale of 3 and 10 were applied to all fixed slope and intercept coefficients respectively, providing some shrinkage of β estimates away from extremely large values (Gelman et al. 2008). Half-Cauchy priors with scale equal to 10 were applied to the σj’s, while a uniform prior was applied to the residual standard deviation σi. In cases where hyperparameter variances were close to zero, stronger half-Cauchy or inverse-Wishart priors were used to improve model convergence. MCMC chain parameters were determined heuristically by increasing the thinning interval until all estimated parameters achieved an autocorrelation less than 0.1.
As our metric of parasite habitat bias we calculated the posterior distributions for a derived parameter (θp), which was the log of the ratio of parasite p’s mean abundance estimates (on the data scale) between the lake and the stream. When θp>0, the focal parasite is more abundant in the lake, and when θp<0 the parasite is more abundant in the stream. Parasites with greater than 95% posterior probability of being at least two times more abundant in one habitat were considered strongly ‘habitat-specific’ in subsequent analyses. We use ‘habitat-biased’ to refer to weaker habitat effects.
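The thinning heuristic can be illustrated with a minimal sketch. It thins an already stored chain after the fact, whereas in practice the sampler would be rerun with a larger thinning interval; the function names, the `max_thin` and `min_draws` limits, and the simulated AR(1) chain are all assumptions made for illustration.

```python
import random


def lag1_autocorrelation(chain):
    """Lag-1 sample autocorrelation of a sequence of draws."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain)
    if var == 0:
        return 0.0
    cov = sum((chain[t] - mean) * (chain[t + 1] - mean) for t in range(n - 1))
    return cov / var


def choose_thinning(chain, threshold=0.1, max_thin=200, min_draws=100):
    """Smallest thinning interval whose thinned chain has |lag-1 autocorrelation|
    below `threshold`; returns None if no interval up to `max_thin` qualifies
    or too few draws would remain."""
    for thin in range(1, max_thin + 1):
        thinned = chain[::thin]
        if len(thinned) < min_draws:
            return None
        if abs(lag1_autocorrelation(thinned)) < threshold:
            return thin
    return None


# Simulated, strongly autocorrelated chain (AR(1) with coefficient 0.9)
random.seed(1)
chain = [0.0]
for _ in range(19999):
    chain.append(0.9 * chain[-1] + random.gauss(0.0, 1.0))

thin = choose_thinning(chain)
print(thin, lag1_autocorrelation(chain[::thin]))
```

With several parameters, the same check would be applied to each chain and the largest required interval used for all of them.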
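As a sketch of how θp and the habitat-specific label could be derived from posterior draws, assume paired draws of mean abundance (data scale) are available for the lake and the stream. The function name, argument names, and label strings are hypothetical.

```python
import math


def habitat_bias(lake_draws, stream_draws, fold=2.0, prob=0.95):
    """Posterior mean of theta_p and a habitat-specificity label.

    lake_draws and stream_draws are paired posterior draws of a parasite's
    mean abundance on the data scale (both strictly positive).
    """
    theta = [math.log(l / s) for l, s in zip(lake_draws, stream_draws)]
    n = len(theta)
    theta_mean = sum(theta) / n
    cutoff = math.log(fold)  # a two-fold difference on the log scale
    p_lake = sum(t >= cutoff for t in theta) / n
    p_stream = sum(t <= -cutoff for t in theta) / n
    if p_lake > prob:
        label = "lake-specific"
    elif p_stream > prob:
        label = "stream-specific"
    else:
        label = "not habitat-specific"
    return theta_mean, label


# Toy draws: ten-fold more abundant in the lake in every draw
print(habitat_bias([10.0] * 100, [1.0] * 100))
```

The same calculation applies to the allele bias δm if the draws are replaced by posterior estimates of the proportion of fish carrying the allele in each habitat.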
MHC sequencing and genotyping
We genotyped MHC IIβ from a random subset of the fish that were screened for parasites (sample sizes listed in Table 2), by 454 pyrosequencing of PCR amplicons. The procedures for DNA extraction, quantitation, PCR amplification, library preparation, and computational analysis are described fully in Stutz & Bolnick (2014). We used PCR primers that produce a 210 base pair amplicon (excluding primer sequences) covering about 79% of the length of exon 2 (210 bp out of 265 bp; Stutz & Bolnick 2014). This covers 70 out of 88 amino acid residues, including the highly variable peptide binding region (PBR) of the exon (Lenz et al. 2009a). Of the 846 fish genotyped in the present study, 295 were previously described in Stutz and Bolnick (2014). We genotyped the additional samples in four new pyrosequencing runs (1/4 plate per run).
Our analytical pipeline uses a quasi-Dirichlet process to iteratively cluster similar sequence reads into groups at increasing levels of sequence similarity, and estimates whether clusters represent single true allelic variants present in the original sample (Stutz & Bolnick 2014). A separate research group independently tested this bioinformatics pipeline, using multiple datasets, and confirmed its accuracy (Sebastian et al. 2016). Allelic sequences for each individual were aligned to the cloned sequences in Sato et al. (1998) to ascertain phase, then translated into amino acid sequences for further analysis. Hereafter we refer to a unique amino acid sequence as an ‘allele’. We focus on allele presence or absence, because MHC is expected to have co-dominant effects on parasites (Doherty & Zinkernagel 1975).
Analysis of habitat effect on MHC genotype
We applied a similar hierarchical modeling approach to estimate allele frequency bias between habitats within each lake-stream pair. Because an MHC allele may be distributed across multiple paralogs, this is not a traditional allele frequency, but rather the proportion of fish carrying an allele. For each allele we fit the following model:
where yi=1 indicates that fish i carries the allele. The vector β contains separate intercept coefficients for the lake and stream (β1 and β2) as well as coefficients for sex (β3) and size (β4, β5) while the αj term indicates additional (random) effects associated with each sampled stream site j. The variance of εi was fixed at one due to non-identifiability of individual-level overdispersion in binomial GLMs (Gelman & Hill 2006). As with parasites, non- or weakly informative priors were used for all parameters, which were estimated by drawing 1000 samples from their joint posterior distributions using MCMCglmm.
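A form of the presence/absence model consistent with this description is sketched below. The logit link, the indicator variables L_i and S_i, and the assignment of β4 and β5 to length effects in lake and stream fish are assumptions; the text specifies only the coefficients, the site effect αj, and the fixed variance of εi.

```latex
\begin{aligned}
y_i &\sim \mathrm{Bernoulli}(\pi_i),\\
\operatorname{logit}(\pi_i) &= \beta_1 L_i + \beta_2 S_i + \beta_3\,\mathrm{sex}_i
  + \beta_4 L_i\,\mathrm{length}_i + \beta_5 S_i\,\mathrm{length}_i
  + \alpha_{j[i]} + \varepsilon_i,\\
\alpha_j &\sim \mathcal{N}(0, \sigma_j^2), \qquad
\varepsilon_i \sim \mathcal{N}(0, 1).
\end{aligned}
```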
As a metric of habitat bias (whether allele frequency was greater in one habitat or the other), we estimated the derived parameter δm for each allele m, which is equal to the log of the ratio of allele frequency estimates for the lake and the stream. Alleles with δ>0 are more common in the lake, and alleles with δ<0 are more common in the stream. Alleles with a 95% posterior probability of occurring at least twice as frequently in one habitat were considered ‘habitat-specific’ in subsequent analyses. We use ‘habitat-biased’ to refer to a less stringent form of divergence (e.g., 95% posterior for δ excludes 0).
Estimating MHC allele effects on infection
We next estimated whether the presence or absence of each MHC allele m in individual fish is associated with each parasite taxon p’s abundance. If an allele helps the host recognize and resist a particular parasite, then individuals with that allele should be less intensely infected by that parasite. We call this a 'negative' allele-parasite association because the allele has a negative effect on the parasite. Positive associations (an allele’s presence coincides with heavier infection loads) can arise for several reasons including susceptibility, if the parasite directly exploits that allele to establish an infection (Westerdahl et al. 2012).
For each lake-stream pair, we used hierarchical models to estimate the effect of each MHC allele on each parasite. The basic form of each model was:
where yi is the abundance of a given parasite taxon p in individual i. The vector β includes the same 5 regression coefficients as the parasite specificity models, plus an additional coefficient associated with the presence/absence of the focal allele m (β6). As before, the αj term indicates sampling location effects on abundance (i.e. 'random' effects). The error term εi gives the fitted residual error for individual i, accounting for any observed overdispersion in the abundance data. For the few instances where one MHC allele strongly covaried with another allele (Yule’s |Q| > 0.8, see Supplementary Material), we included the correlated (non-focal) allele as an additional factor in our model. When two alleles were perfectly correlated, however, we dropped the less common allele.
Models were fit in a Bayesian probability framework using MCMC sampling implemented in the MCMCglmm package (Hadfield 2010). Alleles were transformed from 0/1 variables to a continuous variable with mean zero to avoid issues with separation. Fish length was scaled to a mean of zero and standard deviation of 1 (within habitats) prior to model fitting (Gelman et al. 2008). As before, MCMC chain parameters were determined heuristically by increasing the thinning interval until all estimated parameters achieved an autocorrelation less than 0.1. Posteriors for each allele/parasite combination were estimated by drawing 1000 samples from their joint posterior distributions. Posterior means, standard errors, and 95% highest posterior density intervals (HPDIs) for all estimated allele effect sizes were calculated from these posterior distributions.
We use the parameter β6 as our measure of the effect of allele m on parasite p, which we hereafter denote βmp. Because we assume that most true allele effects are zero (a given allele has no discernible effect on a given parasite), but that a few alleles will have moderate to strong effects on parasite abundance, we designated as "non-zero" those effects whose 95% HPDIs for βmp did not include zero. Note that when βmp>0, the focal allele is associated with higher abundance of the given parasite in a given habitat, and when βmp<0 the allele is associated with lower abundance. For shorthand we refer to these alternative outcomes as susceptibility and resistance, respectively.
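Yule’s Q for a pair of alleles can be computed from the 2x2 table of joint presence and absence across fish, using the standard formula Q = (ad - bc)/(ad + bc). The sketch below assumes 0/1 presence vectors; the function name and the choice to return None when Q is undefined are illustrative.

```python
def yules_q(allele_x, allele_y):
    """Yule's Q between two alleles from per-fish 0/1 presence vectors."""
    a = sum(1 for x, y in zip(allele_x, allele_y) if x and y)          # both present
    b = sum(1 for x, y in zip(allele_x, allele_y) if x and not y)      # only x present
    c = sum(1 for x, y in zip(allele_x, allele_y) if not x and y)      # only y present
    d = sum(1 for x, y in zip(allele_x, allele_y) if not x and not y)  # both absent
    denom = a * d + b * c
    if denom == 0:
        return None  # Q is undefined for this table
    return (a * d - b * c) / denom


x = [1, 1, 1, 0, 0, 0]
y = [1, 1, 0, 0, 0, 1]
print(yules_q(x, y))                         # 0.6
print(yules_q([1, 1, 0, 0], [1, 1, 0, 0]))   # 1.0 (perfectly associated)
```

Under the rule described above, the first pair (|Q| = 0.6) would be modeled separately, whereas pairs with |Q| > 0.8 would trigger the inclusion of the non-focal allele as a covariate.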
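The classification of effects as non-zero can be sketched as follows. The interval is computed as the narrowest window containing 95% of the sorted posterior draws, a common empirical definition that may differ in detail from the routine used for the analyses; function names are hypothetical.

```python
import math


def hpdi(samples, mass=0.95):
    """Narrowest interval containing a fraction `mass` of the posterior draws."""
    s = sorted(samples)
    n = len(s)
    k = max(1, math.ceil(mass * n))  # number of draws inside the interval
    # slide a window of k consecutive sorted draws and keep the narrowest one
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]


def is_nonzero(samples, mass=0.95):
    """True when the HPDI lies entirely above or entirely below zero."""
    low, high = hpdi(samples, mass)
    return low > 0 or high < 0


draws = [float(v) for v in range(1, 101)]   # toy posterior, all positive
print(hpdi(draws), is_nonzero(draws))       # (1.0, 95.0) True
```

A positive interval would be read as susceptibility and a negative interval as resistance, following the shorthand above.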
Testing for balancing or divergent selection
We used the estimates of MHC allele frequency differences between habitats (δm), parasite abundance differences between habitats (θp) and MHC-parasite association strengths (βmp coefficient in model (3) above), to test for signatures of balancing or divergent selection using the logic explained in the introduction (Fig. 1). Specifically, we tested whether lake-biased alleles (δm>0) disproportionately protect the host (βmp<0) from lake-biased parasites (θp> 0), and conversely whether stream-biased alleles (δm<0) protect the host (βmp<0) from stream-specific parasites (θp<0). When defining stream-specific parasites (or alleles), we retain those whose 95% HPDI of θp (or δm) excludes zero. Our prediction can be tested with a linear model examining whether alleles’ protective effects (βmp) depend on an interaction between the allele-frequency bias (δm) and which habitat a parasite is specific to (θp). Balancing selection should also generate a δm*θp interaction, but with the opposite slopes compared to divergent selection (Fig. 1C).
The above analysis focuses only on strongly lake- and stream-specific parasites, because these are most likely to drive divergent selection. As a consequence, that analysis omits parasites that are common or rare in both habitats. We repeated the analysis by regressing each MHC-parasite effect (βmp) on the relevant allele’s habitat bias (δm), parasite’s habitat bias (θp), and a δm*θp interaction, with habitat as a factor as well. We expected to observe a significant δm*θp interaction whose direction would distinguish between selection models.
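The interaction test can be illustrated with an ordinary least-squares sketch that regresses βmp on δm, θp, and their product. This is a simplified stand-in that omits the habitat factor and any weighting by uncertainty in βmp; the helper names and the toy data are assumptions.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_interaction(delta, theta, beta):
    """OLS coefficients [intercept, delta, theta, delta*theta] for beta_mp."""
    X = [[1.0, d, t, d * t] for d, t in zip(delta, theta)]
    p = 4
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, beta)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


# Toy data built to follow the divergent-selection pattern: alleles are
# protective (beta < 0) against parasites enriched in the same habitat.
delta = [-2.0, -1.0, 1.0, 2.0, -2.0, -1.0, 1.0, 2.0]
theta = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
beta = [0.1 - 0.5 * d * t for d, t in zip(delta, theta)]
coefs = fit_interaction(delta, theta, beta)
print(coefs)  # interaction coefficient (last entry) is close to -0.5
```

Under the sign conventions used here, a negative interaction coefficient corresponds to the divergent-selection prediction and a positive one to the balancing-selection prediction.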
The preceding tests focus on allele frequency differences between habitats (δm), rather than absolute allele frequencies within habitats. This is most appropriate when considering gene flow and divergent selection, but local absolute allele frequency may be more relevant to frequency-dependent selection by parasites. We therefore repeated the analyses described above, but using within-habitat MHC allele frequency instead of the between-habitat frequency difference (δm). Specifically, we regressed the MHC-parasite effect βmp against the allele’s frequency in whichever habitat the focal parasite is most abundant (e.g., stream frequency when θp<0, lake frequency when θp>0). We did this focusing on only the convincingly non-zero MHC-parasite associations (whose 95% HPDI excludes zero), and then again using all pairwise associations.
Results
Parasite abundance differences between habitats
A total of 34 parasite taxa were identified across the three lake-stream pairs, although not every parasite was present in every population or pair (Fig. 3). Within each pair, the parasite community differed substantially between habitats. Per-fish parasite richness was significantly higher in lake than stream habitats for all three pairs (Fig. S1). More parasite taxa were strongly habitat-specific to lakes (θp >> 0, for 8, 5, and 5 taxa in Comida, Farewell, and Roberts Lakes respectively) than to their adjoining streams (2, 0, and 0 taxa respectively). Crepidostomum was the only parasite that was considered lake-specific in all three lake-stream pairs (Blackspot, Thersitina, and Unionidae were lake-specific in two of three pairs). Anisakis and Bunodera met or approached our strict habitat-specific threshold in the three streams. However, most parasites exhibit variable, weak, or no habitat affiliation (Fig. 3).
Positive values of θp indicate lake-biased parasites, while negative values indicate stream bias. Note θp is calculated on a logarithmic scale. Posterior means are indicated by circles while the bars indicate 95% credible intervals. Credible intervals must fall completely above or below the gray band in each panel to meet our criterion of regarding parasites as habitat-specific (a high probability of being at least twice as abundant in one habitat than in the other). Filled circles indicate habitat-specific parasites.
Allele prevalence differences between habitats
We identified 374 unique MHC alleles across our three lake-stream pairs, 95% of which were restricted to a single lake-stream pair. Within each lake-stream pair, up to 13% of the MHC alleles were strongly habitat-specific specific (at least a 2-fold frequency difference, |δm|>>0; Fig. 4). There were more lake-specific alleles (9,7, and 9 in Comida, Farewell, and Roberts respectively) than stream-specific (2, 6, 2 respectively). No allele was habitat-specific in more than one lake-stream pair. For the 19 alleles shared among replicate pairs we found no parallel evolution of habitat differences (e.g., δm was not correlated across independent pairs).
Positive values of δm indicate lake-biased alleles, while negative values indicate stream-biased alleles. Note that δm is calculated on a logarithmic (non-linear) scale. Posterior means are indicated by circles while the bars indicate 90% credible intervals. Credible intervals must fall complete above or below the gray band in each panel to indicate alleles with high probability of occurring twice as frequently in one habitat compared to the other. Habitat-specific alleles are indicated by filled circles and are labeled. Alleles are ordered along the x axis by increasing values of δm. Note that few alleles are shared between the three lake-stream pairs.
The diversity of MHC is comparable between habitats (Fig. S2). All sites exhibit on average about six unique MHC amino acid sequences per fish, albeit with substantial among-individual variation. In contrast, neutral genomic SNPs exhibited consistently lower nucleotide diversity in stream than lake fish (Fig. S3). Consequently, relative to neutral expectations, MHC diversity is higher in stream stickleback than in lake stickleback.
Associations between MHC alleles and parasite infection
We estimated association strengths (βmp) between a total of 6006 combinations of MHC allele versus parasite taxon. Models for an additional 678 possible allele-parasite combinations failed to converge adequately, usually due to extreme rarity of the parasite or allele. For the few cases where two alleles are statistically strongly linked (see Supp. Mat. on Yule’s Q), we dropped one of the redundant alleles. These tests found a substantial number of robust associations. Sixty-two MHC allele – parasite taxon associations had 95% HPDIs that did not include zero (Figs. 4&5, Table 4). No single allele is strongly associated with more than one parasite.
Overall, there were approximately 50% more negative than positive effects estimated within each pair (Table 3). Negative associations imply that fish carrying the focal allele are less heavily infected by the focal parasite. The bias towards negative effects may be an artifact of comparing rare alleles to rare parasites, so henceforth we restrict our attention to strongly supported associations. Of those strong effects, 23 were negative. For example, stream-biased allele P293 coincided with an 18-fold reduction (β=-2.87 [-5.17,-0.78]) in the abundance of the lake-specific parasite Crepidostomum in Comida Lake (Fig. S4). Another 39 strong effects were positive, for instance lake-specific allele P342, associated with a 16-fold increase (β=2.75, [1.25,4.11]) in (lake-specific) Blackspot infection loads.
Several of the strong MHC-parasite associations exhibit a pattern consistent with local adaptation via divergent selection (Table S4): rare alleles conferring susceptibility to local parasites. Fish carrying MHC allele P273 carry ~3-fold more Unionidae (Fig. S5), which may explain why this allele is less common in the lake (δ=-2.73 [-4.68,-1.15]) where Unionidae are 314-fold more abundant (θ=5.75, [4.38,7.27]). Conversely, two MHC alleles are rarer in Comida stream and confer susceptibility to stream-specific Apatemon.
Other strong MHC-parasite associations support balancing selection: common alleles conferring susceptibility to local parasites. MHC allele P231 is 12-fold more common in Roberts Lake than stream (δ=2.50 [0.84, 4.36]). In the lake, it confers a 4-fold greater probability of infection by a lake-specific cestode (Fig. S6; θ=6.93, [5.12, 9.31]). Similarly, allele P403 is 72-fold more common in Comida Lake (δ=4.28 [2.13, 6.94]) where it confers 3-fold higher risk of infection by lake-specific Crepidostomum (θ=2.99, [2.14, 3.91]; Fig. S7). Assuming these parasites reduce host fitness, these associations favor locally rare alleles over the currently-abundant P231 or P403 alleles.
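The fold-changes quoted alongside β, δ, and θ in the preceding paragraphs are consistent with exponentiating estimates made on the natural-log scale. A minimal sketch of that conversion follows, assuming natural logarithms throughout; the helper name `fold_change` is hypothetical and is not taken from the analysis code.

```python
import math


def fold_change(log_estimate):
    """Convert a natural-log-scale estimate to a fold-change.

    Negative estimates are reported as fold-reductions, so the
    magnitude is exp(|estimate|) and the direction is returned
    separately.
    """
    magnitude = math.exp(abs(log_estimate))
    direction = "increase" if log_estimate >= 0 else "reduction"
    return magnitude, direction


# Point estimates quoted in the text
unionidae_theta = fold_change(5.75)   # Unionidae abundance, lake vs stream
p231_delta = fold_change(2.50)        # allele P231, Roberts Lake vs stream
p403_delta = fold_change(4.28)        # allele P403, Comida Lake vs stream
p293_beta = fold_change(-2.87)        # allele P293 effect on Crepidostomum
```

Applying the same function to the interval bounds gives the corresponding fold-change interval, which is asymmetric around the point estimate because the transformation is non-linear.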
Tests for divergent or balancing selection
Aggregating across many such allele-parasite associations, we found no overall trends towards divergent or balancing selection. Regressions revealed no significant effect of habitat-biased MHC allele frequency (δm) on the posterior mean estimate of the strong MHC-parasite associations (βmp; Fig. 6A). There was no significant relationship for either lake-specific parasites (θp>>0, t40=1.22, P=0.246), or stream-specific parasites (θp<<0, t18=0.465, P=0.647). Lake-stream pair had no effect in either regression (P>0.2). Putting these regressions into a single ANCOVA, we confirmed that the slopes of βmp ~ δm were indistinguishable from zero for both lake- and stream-specific parasites (overall δm effect P=0.1876). There was no significant interaction between δm and parasite-habitat (P=0.733), thus refuting both hypotheses’ expectation of opposing slopes for lake- versus stream-specific parasites. The effect of δm and the δm*habitat interaction were also non-significant if we expanded the analysis to include all MHC-parasite associations (βmp, regardless of their strength), using only habitat-specific parasites. When we expanded this still further to use all 6006 βmp estimates (regardless of strength of δm, θp, or βmp), we still found no significant δm*θp interaction (t6680=1.49, P=0.1365). The only significant effect was that all MHC alleles (regardless of δm) were more susceptible to lake-biased parasites than stream-biased parasites (Fig. S8; positive effect of θp on βmp; t6680=2.47, P=0.0137).
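The logic of the slope comparison can be illustrated with a small sketch. It is not the analysis reported here: it omits the lake-stream pair term and any significance test, uses ordinary least squares on made-up values, and all function and variable names are hypothetical. It only shows how opposing slopes of βmp on δm for lake- versus stream-specific parasites would appear as a non-zero difference in slopes (the interaction).

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def slopes_by_parasite_habitat(delta_m, beta_mp, parasite_habitat):
    """Fit beta_mp ~ delta_m separately for lake- and stream-specific
    parasites and report the difference in slopes (the interaction)."""
    slopes = {}
    for habitat in ("lake", "stream"):
        xs = [d for d, h in zip(delta_m, parasite_habitat) if h == habitat]
        ys = [b for b, h in zip(beta_mp, parasite_habitat) if h == habitat]
        slopes[habitat] = ols_slope(xs, ys)
    slopes["interaction"] = slopes["lake"] - slopes["stream"]
    return slopes


# Made-up illustrative values (NOT data from this study): effects are
# constructed to have opposite slopes in the two parasite classes.
delta = [-2.0, -1.0, 0.0, 1.0, 2.0] * 2
habitat = ["lake"] * 5 + ["stream"] * 5
beta = [-0.5 * d for d in delta[:5]] + [0.5 * d for d in delta[5:]]

result = slopes_by_parasite_habitat(delta, beta, habitat)
```

With real estimates, slopes near zero in both parasite classes and a difference near zero would correspond to the absence of a δm effect and of a δm by parasite-habitat interaction.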
Each point represents a single estimated effect of βmp, calculated for a given lake-stream pair. The y-axis shows the proportion of the posterior distribution greater than zero (for positive effects) or less than zero (for negative effects). Effect sizes are given on the latent (i.e. natural log) data scale. Black circles indicate effects with 95% HPD intervals that do not include zero.
We only plot associations whose 95% HPD intervals do not include zero (black circles from Fig. 5). Blue squares represent positive effects (alleles associated with higher parasite load), red squares represent negative effects (alleles conferring lower parasite load). We plot each lake-stream pair separately (Comida at the top, then Farewell, then Roberts). Within each pair, alleles to the right of the vertical dashed line are more common in the lake (δm>0), alleles to the left are more common in the stream (δm<0). Parasites above the horizontal dashed line are more common in the lake (θp>0), parasites below the line are more common in the stream (θp<0). Examples of allele-parasite associations are plotted in the Supplementary Figures.
Lastly, we tested for effects of local allele frequency, rather than allele habitat-bias. MHC-parasite effect sizes (βmp) were negatively correlated with the focal allele’s frequency in the habitat where the parasite is relatively common (Fig. 7). This is true whether we focus only on large-effect estimates of βmp (t60=-3.87, P=0.00028), or on all estimates of βmp (t6682=-2.10, P=0.0446). For either variant on the analysis, the effects are weak (r2=0.186 and 0.0005, respectively). The negative trend arises because locally rare alleles (which are not necessarily immigrants) are most likely to confer susceptibility to local parasites (βmp>0). In contrast, locally common alleles are equally likely to exhibit positive or negative effects on local parasites (Fig. 7).
An empirical test of local adaptation versus balancing selection, as illustrated in our conceptual diagram (Fig. 1). We plot MHC allele effect size on parasites (βmp; positive values imply susceptibility, negative values imply resistance) as a function of the alleles’ relative abundance in the lake or stream (δm; positive values imply higher frequency in the lake, and negative values imply higher frequency in the stream). Each point represents a non-zero association between an MHC allele and a parasite taxon (95% HPD intervals of βmp do not include zero; black circles from Fig. 4). We plot separate regression lines for parasites that tend to be more common in the lake (blue, θp>0) versus stream (green; θp<0), because we predicted their slopes would have opposite signs. Habitat-specific parasites (strong frequency bias) are indicated by filled points. Habitat-specific alleles are indicated by larger points. We combine all three lake-stream pairs in this plot, because different alleles were involved in parasite susceptibility or resistance in each pair.
Allele frequency is calculated as the log2 of the fraction of individuals carrying the allele. Each point is an allele in a particular habitat in a lake-stream pair. For each allele, we calculated its average effect size (βm.) across all present parasites. Only those alleles with at least some significant parasite associations are included.
Discussion
Stickleback in lake and stream habitats harbor distinct but overlapping MHC class II genotypes, and substantial MHC diversity within populations (Chain et al. 2014; Eizaguirre et al. 2012a; Stutz & Bolnick 2014). We tested whether this between- and within-population MHC variation systematically supports a role of divergent versus balancing selection. We estimated pairwise statistical associations between 374 MHC alleles and 34 parasite taxa in three replicate lake-stream pairs, revealing a moderate number of clear associations between the presence of an MHC allele and less or greater infection by a particular parasite. This represents an exceptional survey of MHC-parasite associations in nature, revealing many more MHC-parasite associations than we expected, though most alleles were strongly associated with just a single parasite, if any. These associations included both positive and negative effects that may be loosely interpreted as evidence of susceptibility and resistance respectively.
These associations do not, by themselves, provide a clear test for divergent or balancing selection (Piertney & Oliver 2006). Instead, the MHC-parasite associations should be interpreted in the context of the alleles’ and parasites’ absolute or relative frequencies in each habitat (Fig. 1). To our knowledge, this has not previously been reported. We found a few MHC-parasite associations that were clearly consistent with either balancing or divergent selection. But, there was no overall tendency for immigrant MHC alleles to be more susceptible to local parasites (implying divergent selection) or less susceptible (implying balancing selection). Instead, we found a weak but unexpected trend for locally rare (but not foreign) alleles to confer greater infection susceptibility.
We observed significant between-habitat differences in parasite community composition. A few parasites were systematically habitat-specific or habitat-biased (Crepidostomum, Unionidae, Thersitina, and Blackspot in lakes; Anisakis, Apatemon and Bunodera in streams). Consistent with prior findings from other lake-stream pairs (Eizaguirre et al. 2010; Feulner et al. 2015), stream parasite communities were a less diverse subset of the neighboring lake parasites.
We expected these differences in parasite community composition to generate divergent natural selection on stickleback immune genes. Consistent with this expectation, we observed significant differences in MHC genotypes between parapatric lake and stream sites. Approximately 10% of alleles exhibit substantial (> 2-fold) frequency differences between parapatric habitats. But, genomic SNPs (most of which will be neutral) also exhibit significant divergence along the sampled clines (Weber et al. 2017). Despite divergence in MHC genotypes, we saw little evidence for local adaptation. A few alleles are rare in habitats where they confer susceptibility to a parasite (e.g., P273 is rare in Comida Lake where Unionidae is common). Such patterns may arise if selection largely eliminates alleles from habitats where they are detrimental. However, this depletion of locally susceptible alleles involves only a few MHC alleles from Comida. Taking a larger view across many allele-parasite associations, we found no general trend for alleles common in a given habitat to confer (i) protection against that habitat’s parasites, or (ii) susceptibility to parasites in the neighboring habitat. The prediction illustrated in Fig. 1B is therefore not supported.
In a few cases, we found rare MHC alleles associated with reduced infection rates. This apparent rare-allele advantage fits expectations from balancing selection. However, as with local adaptation, this particular outcome is confined to a few examples. There is no overall tendency for locally rare alleles to be more resistant (or, locally common alleles to be more susceptible). We therefore found no overall support for balancing selection in any of the three lake-stream pairs, despite trying multiple variants on our analytical approach.
An alternative approach to testing balancing selection is to ask whether MHC-parasite associations depend on allele frequency within a given habitat, rather than frequency differences between habitats. We find that locally rare alleles tend to confer susceptibility to local parasites, whereas locally common alleles are equally likely to be susceptible or resistant. This result is consistent with the notion that selection removes alleles that are susceptible to local parasites. But, by focusing specifically on local allele frequency (rather than between-habitat frequency difference), this result does not prove that different alleles are favored in the two habitats. Indeed, the susceptible rare alleles are typically not immigrants (e.g., not more common in the other habitat).
To summarize, we proposed contrasting predictions to distinguish between balancing versus divergent selection. Both predictions were supported by a few allele-parasite combinations, but neither was supported overall. There may be several explanations for these equivocal results. First, divergent and balancing selection may in fact be weak or absent. This conclusion would be odd, given the high MHC diversity within populations and divergence between adjoining populations that readily exchange migrants. Second, MHC has been widely linked to mate choice decisions in stickleback and other vertebrates (Lenz et al. 2009; Milinski 2006), so divergent sexual selection might play a primary role in MHC population structure. Lastly, divergent and balancing selection might act concurrently. As we show here, some parasites might drive divergence in some alleles’ frequencies, while other parasites target locally common alleles. But, the net effect may be that these two selective processes obscure each other’s signals in an overall meta-analysis, as we find.
Some additional caveats are worth noting. Our results are based on a brief survey of three lake-stream pairs in a single season and year. It may be that the strongest selection occurs at another season, ontogenetic stage, or year. Also, we focused exclusively on readily visible macroparasites, but MHC evolution could plausibly also depend on readily overlooked symbionts including but not limited to gut microbiota (Bolnick et al. 2014). Lastly, although some stickleback parasites are well known to reduce host fitness, we do not presently know how host survival or fecundity depend on infection loads of all parasites examined here.
We observed a substantial number of MHC allele – parasite associations, consistent with typical expectations that MHC IIβ is involved in immunity to macroparasites. However, it is surprising that positive and negative associations (‘susceptibility’ and ‘resistance’, respectively) were about equally common. Why would so many MHC alleles, when present in a fish, coincide with greater infection by a certain parasite? A first possibility is that positive effects are spurious consequences of having alternative alleles. The presence of one allele may imply the absence of an alternative allele with protective value. This explanation is unlikely in our present study, because we statistically accounted for moderately correlated alleles. A second explanation could be that an allele facilitating recognition of one parasite might result in immunological trade-offs that inhibit resistance to another parasite. For instance, an MHC allele that recognized a microbe might drive an inflammatory response that inhibits resistance to a subsequent helminth infection (Moser & Murphy 2000; Oladiran & Belosevic 2012; Salgame et al. 2013). A third possibility entails direct interactions among parasites. If one parasite inhibits invasion of another parasite, then an allele that resists the former may facilitate infection by the latter (Hafer & Milinski 2015). Lastly, because we are sampling wild-caught adult fish, a positive correlation between genotype and infection could reflect a tolerance effect of the allele. If individuals with a given allele are more likely to survive a chronic infection, the allele will be enriched among infected survivors, compared to uninfected individuals (Westerdahl et al. 2012). This last point brings up an important caveat about our analysis: we assume that higher infection load implies lower fitness, but variation in infection tolerance, and survival prior to our sampling effort, complicates this interpretation.
Prior studies of stickleback have suggested that MHC heterozygosity is itself under stabilizing selection (Wegner et al. 2004; Wegner et al. 2003). The suggestion is that individuals with few MHC alleles are unable to recognize enough parasites, whereas individuals with too many alleles have reduced T-cell receptor diversity, resulting in an intermediate optimal MHC heterozygosity. Prior studies suggested that lower parasite diversity in streams than in lakes causes a lower optimal allelic diversity (Feulner et al. 2015). In contrast, we find comparable MHC diversity in the lake and stream populations, even though the stream parasite community is less diverse. Moreover, stream stickleback had lower effective population sizes based on genomic SNP data (Weber et al. 2017). So, MHC diversity is actually relatively high in the stream (compared to neutral markers) despite the lower parasite diversity there. Our samples therefore do not support the proposal that stream stickleback have a lower optimal MHC diversity.
Conclusions
A great many studies have tested for divergent or balancing selection on MHC, in numerous vertebrate species (reviewed by Bernatchez & Landry 2003; Edwards & Hedrick 1998; Eizaguirre & Lenz 2010; Hedrick 2002; Piertney & Oliver 2006; Yasukochi & Satta 2013). Few of these studies have simultaneously tested for both forms of selection (Tobler et al. 2014). Most of these studies yield some support for one hypothesis or the other, but frequently the supporting evidence has important caveats and some inconsistencies. Consequently, the evolutionary maintenance of MHC diversity within and between populations remains something of a puzzle despite extensive research. Our own data exacerbate this puzzle. We found some support for both divergent and balancing selection, depending on which allele and parasite we considered. But, at the scale of all alleles and parasites, we found no predominant signal favoring one form of selection over the other (Fig. 1).
We propose that there in fact may not be a predominant form of selection at this multi-locus gene family. Rather, balancing and divergent selection act simultaneously on different MHC II alleles, in association with different parasites. Some alleles may experience a native advantage, while others may experience a rare-allele advantage. Current analytical approaches are not effective at separating such simultaneous forms of selection. Future work on MHC evolution must therefore account for many parasite species concurrently, and the distinct but simultaneous selective pressures that each may exert.
Data Accessibility
Data required to reproduce the analyses presented in this paper will be made publicly available at the time of publication. 454 amplicon sequencing reads have been uploaded to____. Tables of allele presence/absence from these 454 amplicon sequences will be uploaded to Dryad doi:____. Tables of parasite infections and fish ecomorphology and sex will be uploaded to Dryad at doi:____, along with meta-data linking individual fish to their MHC genotype and sampling location.
Author contributions
WES and DIB collaboratively planned the data collection and analysis. WES conducted the field work, lab work, sequencing, and bioinformatics. The statistical analyses were conducted primarily by WES with contributions from DIB. WES and DIB wrote the manuscript together.
Acknowledgments
We thank Claire Patenia and Jason Clu for lab assistance. This research was supported by fellowships to DIB from the David and Lucile Packard Foundation and the Howard Hughes Medical Institute, and a grant to WES from the Graduate Program in Ecology Evolution and Behavior at UT Austin.