The variable ELF3 polyglutamine tract mediates complex epistatic interactions in Arabidopsis thaliana

Short tandem repeats are hypervariable genetic elements that occur frequently in coding regions. Their high mutation rate readily generates genetic variation contributing to adaptive evolution and human diseases. We recently proposed that short tandem repeats are likely to engage in epistasis because they are well-positioned to compensate for genetic variation arising at other loci due to their high mutation rate. We previously reported that natural ELF3 polyglutamine variants cause reciprocal genetic incompatibilities in two divergent Arabidopsis thaliana backgrounds. Here, we dissected the genetic architecture of this incompatibility and used a yeast two-hybrid strategy to identify proteins whose physical interactions with ELF3 were modulated by polyglutamine tract length. Using these two orthogonal approaches, we identify specific genetic interactions and physical mechanisms by which the ELF3 polyglutamine tract may mediate the observed genetic incompatibilities. Our work elucidates how short tandem repeat variation, which is generally underascertained in population-scale sequencing, can contribute to phenotypic variation. Furthermore, our results support our proposal that highly variable STR loci can contribute disproportionately to the epistatic component of heritability.

b), which can reveal or conceal the phenotypic consequences of many other genetic variants. A mechanistic explanation of this robustness phenomenon is epistasis, in which a robustness gene interacts with many other loci ([...]; Lachowiec et al. 2015), as for the promiscuous chaperone HSP90 (Taipale et al. 2010). Our previous findings and the many studies describing ELF3's crucial functions in plant development led us to hypothesize that ELF3 lies at the center of an epistatic network and that the ELF3 polyglutamine tract modifies these interactions.

It is well-established that ELF3 functions promiscuously as an adaptor protein in multiple protein complexes that are involved in a variety of developmental pathways (Liu et [...] Polyglutamine tracts such as the one encoded by the ELF3-STR often mediate protein interactions (Perutz et al. 1994; Stott et al. 1995; Schaefer et al. 2012). Therefore, it is plausible to assume that variation in the ELF3 polyglutamine tract affects ELF3's interactions with its partner proteins. The ELF3 C-terminus, which contains the STR-encoded polyglutamine tract, is necessary for nuclear localization (Herrero et al. 2012) and ELF3 homodimerization (Liu et al. 2001), but thus far only one other protein (Phytochrome Interacting Factor 4, PIF4) has been shown to interact with this ELF3 domain (Nieto et al. 2014). Thus, the phenotypic and epistatic effects of ELF3-polyQ variation may arise from altered protein interactions, altered ELF3 nuclear localization, altered regulation of the PIF4 developmental integrator, or a combination thereof.

Here, we dissect the epistatic landscape modifying the function of the ELF3-STR through both physical and genetic interactions, and present evidence that this STR forms the hub of a complex network of epistasis, likely due to its role as a compensatory modifier of several other loci.

Genotyping: For genotyping the ELF3 STR and other loci across many F2 segregants, 1-2 true leaves from each seedling were subjected to DNA extraction. Seedlings were stored on their growth plates at 4 °C before genotyping but after phenotypic analysis. For genotyping the ELF3 STR, PCR was performed in 10 µL volume containing 0.5 µM primers (Table S1), 0.2 µM each dNTP, 1 µL 10X ExTaq buffer, and 0.1 U ExTaq (Takara, Tokyo, Japan), with an initial denaturation step of 95 °C for 5 min, followed by 40 cycles of 95 °C for 30 s, 49 °C for 20 s, and 72 °C for 10 s, with a final extension step at 72 °C for 5 min. For other loci, PCR was performed in 20 µL volume containing 0.5 µM primers (Table S1) [...] Mini kit (Qiagen, Valencia, CA) according to the kit protocol. This DNA was quantified using high-sensitivity Qubit fluorescence analysis (ThermoFisher Scientific, Waltham, MA) and re-genotyped with ELF3-STR primers (Table S1). We used 10 ng DNA from each F2 segregant in NextEra transposase library preparations (Illumina, San Diego, CA), or a standard 50 ng preparation for the Ws library. Library quality was assessed on a BioAnalyzer (Agilent, Santa Clara, CA) or agarose gels. The Ws individual was sequenced in one 300-cycle MiSeq v2 run (300 bp single-end reads) to ~12X coverage. The F2 segregant libraries were pooled and sequenced in one 200-cycle HiSeq v3 run to ~2X average coverage (100 bp paired-end reads, [...] interactions were PCR cloned into the EcoRI/XhoI sites of pGADT7 from cDNAs of indicated strains (Table S1 for primers). Clones were confirmed by restriction digest and sequencing. The Y2H screen was performed against the Arabidopsis Mate and Plate cDNA library (Clontech, Madison, WI), essentially according to the manufacturer's instructions, except selections were performed on C-leu -trp -his plates incubated at 23 °C.
Clones which also showed activation of the ADE2 reporter gene and did not autoactivate were subsequently tested against the various ELF3-polyQ constructs (see Supplementary Text for details, full details on clones given in File S1).

LacZ activity was assayed through X-gal cleavage essentially as previously described ( [...] Col and Ws backgrounds did not substantially differ in this trait (p = 0.16, Kolmogorov-Smirnov test, Figure 1A). Although most F2 seedlings showed phenotypes within the range of the two parental lines, the F2 phenotypic distribution showed a long upper tail [...] alleles. We replicated this observation in a much larger population (1106 seedlings), which was used for further genetic analysis.

To investigate the genetic basis of the phenotypic transgression in hypocotyl length, we harvested the 720 most phenotypically extreme seedlings (longest and shortest hypocotyls) for genotyping (Figure S1). Each individual seedling was genotyped at the ELF3 locus, using primers directly ascertaining the 27 bp ELF3-STR-length polymorphism between Col and Ws. Across these individuals, we observed a strong main effect of the ELF3 locus on phenotype (Figure 1B), in which seedlings carrying the Col allele of ELF3 frequently showed transgressive phenotypes, though some individuals homozygous for the Ws allele also showed transgressive phenotypes. Specifically, a naïve regression analysis of the data in Figure 1B indicated that each ELF3-Col allele increased hypocotyl length by 0.87±0.077 mm, and that the ELF3 locus thereby explained 15% of phenotypic variation. This analysis is misleading, because it implies that Col seedlings should show longer hypocotyls than Ws seedlings due to ELF3 genotype; this is not the case (Figure 1A).
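
For illustration, the naïve additive model described above can be sketched as an ordinary least-squares regression of hypocotyl length on the number of ELF3-Col alleles (0, 1, or 2). This is a minimal sketch, not the analysis code used in this study; the function name `additive_fit` and the toy values are hypothetical and are not data from the cross.

```python
from statistics import mean


def additive_fit(col_allele_counts, lengths_mm):
    """Least-squares fit of phenotype on ELF3-Col allele dosage (0, 1, 2).

    Returns (slope, intercept, r_squared). The slope is the estimated
    change in hypocotyl length per additional Col allele, and r_squared
    is the fraction of phenotypic variance the locus explains under
    this purely additive model.
    """
    x_bar = mean(col_allele_counts)
    y_bar = mean(lengths_mm)
    sxx = sum((x - x_bar) ** 2 for x in col_allele_counts)
    syy = sum((y - y_bar) ** 2 for y in lengths_mm)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(col_allele_counts, lengths_mm))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical toy values for illustration only (not measured data).
    dosage = [0, 0, 1, 1, 2, 2]
    length = [2.1, 2.4, 2.9, 3.6, 3.2, 5.0]
    slope, intercept, r2 = additive_fit(dosage, length)
    print(f"per-allele effect: {slope:.2f} mm, variance explained: {r2:.0%}")
```

Because such a model assigns a single average effect to each allele, it cannot represent an allele whose effect depends on the genotype at other loci, which is why its estimate is misleading here.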

Among these seedlings, the individuals with extreme phenotypes and individuals homozygous at the ELF3 locus are expected to be most informative about ELF3-STR effects on phenotype. Furthermore, ELF3 genetic interactions are expected to be most apparent in ELF3 homozygotes. Consequently, we used a novel genetic approach to detect epistasis between ELF3 and other loci as follows. For each ELF3 STR allele, we selected 24 homozygotes (Ws/Ws and Col/Col) at each phenotypic extreme (the shortest and longest hypocotyls). The sampling of extremes is an effective and statistically justified method for genetic mapping (Lander and Botstein 1989). These 96 individuals were analyzed in a genotyping-by-sequencing approach ( [...] With these data, we performed a one-dimensional QTL scan to identify chromosomal regions contributing to hypocotyl length (Figure 2A). This analysis indicated a QTL on Chr2 corresponding to ELF3 as expected, but also significant QTL on Chr1, Chr4, Chr5, and potentially one or more additional QTL on Chr2 affecting the phenotype. A two-dimensional QTL scan suggested that at least some of these QTL interact epistatically with the ELF3 locus (Figure S4).

We binned F2s homozygous at ELF3 according to their ELF3 genotype, and performed one-dimensional QTL scans on each homozygote group separately (masking the genotypes of all other individuals). We observed that the same LOD peaks were replicated well in ELF3-Col homozygotes, but poorly in ELF3-Ws homozygotes (Figure 2B [...] LOD scores between the two QTL scans at all loci, we found that the peaks on Chr1, Chr2, and Chr4 (and to a lesser extent Chr5) were all stronger in Col (Figure 2C). Consequently, these loci constitute background-specific ELF3 interactors.
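
The stratified scans described above can be sketched with a single-marker LOD score, computed here by marker regression as LOD = (n/2) log10(RSS_null / RSS_marker). This is a minimal sketch under stated assumptions, not the software or model used in this study: the function names `marker_lod` and `lod_difference`, the genotype labels, and any input values are hypothetical, and a real analysis would also handle missing genotypes and genome-wide significance thresholds.

```python
import math
from collections import defaultdict


def marker_lod(genotypes, phenotypes):
    """LOD score for one marker by marker regression.

    Compares a model with one overall mean (no QTL) against a model
    with a separate mean for each marker genotype class.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    if rss_null == 0:
        return 0.0  # no phenotypic variation to explain

    by_genotype = defaultdict(list)
    for g, y in zip(genotypes, phenotypes):
        by_genotype[g].append(y)
    rss_marker = 0.0
    for ys in by_genotype.values():
        class_mean = sum(ys) / len(ys)
        rss_marker += sum((y - class_mean) ** 2 for y in ys)

    if rss_marker == 0:
        return math.inf  # genotype classes fit the phenotype perfectly
    return (n / 2) * math.log10(rss_null / rss_marker)


def lod_difference(elf3, marker, phenotypes,
                   group_a="Col/Col", group_b="Ws/Ws"):
    """LOD at one marker among ELF3 group_a homozygotes minus the LOD
    among group_b homozygotes. Assumes both groups are non-empty;
    individuals with any other ELF3 genotype are ignored (masked).
    """
    def lod_within(group):
        rows = [(m, y) for e, m, y in zip(elf3, marker, phenotypes)
                if e == group]
        return marker_lod([m for m, _ in rows], [y for _, y in rows])

    return lod_within(group_a) - lod_within(group_b)
```

A positive difference at a marker indicates that the marker is associated with the phenotype more strongly among ELF3-Col homozygotes than among ELF3-Ws homozygotes, which is the pattern expected for a locus whose effect depends on the ELF3 genotype.
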
We considered the genetic contribution of these loci to the phenotype using a multiple QTL mapping approach, using both the independently estimated QTL locations and a refined model re-estimating QTL positions based on information from all QTLs (Table S4). In each case, loci of strong effect on Chr1, Chr2, and Chr4 were supported, along with interactions between Chr2 (ELF3) and the other two loci. In the refined model, the Chr5 locus and the second (other than ELF3) Chr2 locus were also strongly supported. We conclude that although ELF3 interacts epistatically with a variety of other loci in determining hypocotyl length, the principal contributors to ELF3-mediated effects on the trait are on Chr1 and Chr4. Moreover, direct inspection of phenotypic effects of ELF3 in interaction with each putative locus among F2 segregants supported the hypothesis of epistasis with ELF3 most clearly for the Chr1 and Chr4 loci (Figure S5). [...] regions (Khattak 2014).

We phenotyped mutants of several candidate genes in the Col background under the conditions of our intercross experiment (15 d SD hypocotyl length, Figure S6). We observed small phenotypic effects of the T-DNA insertion mutants lsh9 and nup98. However, these small effects on their own cannot explain the transgressive phenotypic variation in F2s (Figure 1A). [...] Ler-1, Cvi-0), with the small [TA]n STR boxed. Ws-2 is a separately maintained stock of the Ws (Wassilewskija) strain. The Ws-2-specific polymorphism is highlighted in red.

We generated double mutants between these mutants and the elf3 null mutant to determine whether these genes interacted epistatically with ELF3. We found little evidence for an interaction between nup98 and elf3 mutations (Figure S6C). However, we detected a significant interaction between ELF3 and LSH9, in the form of reciprocal sign epistasis between the two null mutants affecting hypocotyl length (Figure 3A).
Although lsh9 single mutants had significantly shorter hypocotyls than WT, lsh9 elf3 double mutant hypocotyls were substantially longer than in elf3 single mutants. LSH9 (LIGHT-DEPENDENT SHORT HYPOCOTYLS 9) is an uncharacterized gene belonging to a gene family named for LSH1, which is known to act in hypocotyl elongation (Zhao et al. 2004). Like other genes in this family, LSH9 encodes a putative nuclear localization sequence but no other distinguishing features.

To test our hypothesis that ELF3-STR-mediated epistasis may be due to altered protein interactions, we investigated whether LSH9 and ELF3 interacted physically using Y2H. However, we were unable to detect a physical interaction between the Col or Ws variants of LSH9 and ELF3 (Figure S7), suggesting a different mechanistic basis for the observed genetic interaction. For example, LSH9 expression may depend on ELF3 function as a transcriptional regulator. Alternatively, ELF3 expression may depend on LSH9 function. We tested both hypotheses by measuring expression levels of LSH9 and ELF3 in elf3 and lsh9 mutant backgrounds, respectively (elf3 mutants were available in both Col and Ws backgrounds, lsh9 only in Col). ELF3 expression levels were unchanged in lsh9 mutants (Figure S8). Moreover, levels of LSH9 transcript did not significantly differ between WT and elf3 mutants in either strain background. However, LSH9 expression was reduced in the Ws background relative to Col independently of ELF3 genotype (Figure 3B). This result is consistent with the observed phenotypic interaction in F2s, which showed elongated hypocotyls when Col alleles at the ELF3 locus co-segregated with Ws alleles at the LSH9 locus (Figure S5), thereby pairing poorly-functioning ELF3 alleles with potentially lower LSH9 expression levels.

Taken together, these results suggest that ELF3-LSH9 epistasis between Col and Ws may be due to regulatory changes between these two backgrounds altering LSH9 transcript levels. Coincidentally, we observed that the LSH9 promoter contains an STR polymorphism in the Ws background that may alter LSH9 expression; alternatively, altered LSH9 expression in Ws may be due to trans-effects (Figure 3C). [...] were repeated with independent PJ69-4α + pGADT7-X transformants with similar results. (B): LacZ assays support polyQ effects on ELF3-ELF4 interaction. The strains shown in (A) also express LacZ from the Y2H promoter, whose activity was assayed in cell lysates (see Methods). In each assay, all observations are expressed relative to the activity of the empty vector, whose mean is set to 0. Error bars indicate standard deviation across three technical replicates. This experiment was repeated with similar results. [...] activation in yeast when paired with an empty vector (Figure S9). The ELF3-interacting domain of PHYB has two coding variants between Col and Ws, and we thus tested both Ws and Col variants of this domain. We found that both forms showed apparently equal affinity with all polyQ variants of ELF3. ELF4, which has no coding variants between Col and Ws, also interacted with all polyglutamine variants of ELF3, though rather weakly compared to PHYB. Under these conditions, a subtle preference of ELF4 for longer polyQ variants (e.g. ELF3-16Q and ELF3-23Q) was apparent. We confirmed this preference in a quantitative, growth-independent assay in which LacZ expression is driven by the Y2H interaction (Figure 4B).

We were not able to replicate the previously reported ELF3-PIF4 interaction (Nieto et al. 2014) for any ELF3-polyQ variant in our Y2H system (Figure S10), and were thus unable to evaluate effects of polyQ variation on ELF3-PIF4 interactions.
Together, our data suggest that ELF3-polyQ tract variation can affect ELF3 protein interactions, in particular if these interactions are weaker (as for ELF4) and presumably more sensitive to structural variation in ELF3. [...] assays support polyQ effects on ELF3-AtGLDP1 interaction. The strains shown in (A) also express LacZ from the Y2H promoter, whose activity was assayed in cell lysates (see Methods). In each assay, all observations are expressed relative to the activity of the empty vector, whose mean is set to 0. Error bars indicate standard deviation across three technical replicates. This experiment was repeated with similar results.

Y2H screen identifies three novel ELF3 interactors, one of which is polyQ-modulated: None of the known ELF3 interactors were encoded by genes located in the major Chr1 and Chr4 QTLs identified by our genetic screen. If the ELF3-polyQ tract mediates protein interactions, these regions should contain additional, previously-undescribed polyQ-modulated ELF3 interactors. We screened the ELF3-7Q protein for interactions with proteins from a commercially available library derived from Col, to detect ELF3-protein interactions within the Col background.

We subjected Y2H positives to several rounds of confirmation (Supporting Text), yielding a total of three novel proteins that robustly interacted with ELF3: PLAC8-domain-containing protein AT4G23470, LUL4, and AtGLDP1 (Figure 5). AT4G23470 was recovered in two independent clones, and LUL4 was recovered in three independent clones. The PLAC8-domain protein AT4G23470 is encoded by a gene within the QTL interval on chromosome 4, but this protein showed no variation in affinity among the various ELF3-polyQs.
LUL4, a putative ubiquitin ligase, is not encoded in any of the mapped QTL and also shows no variation in affinity among the various ELF3-polyQs. Thus, differential interaction with these proteins is unlikely to underlie the observed epistasis.

In contrast, the AtGLDP1 protein, which is encoded on chromosome 4 but not within the QTL interval, appeared to show a subtle preference for the synthetic ELF3-0Q construct over longer polyQ tracts. We confirmed this preference in a quantitative LacZ assay (Figure 5B). Although our screen is unlikely to exhaust hitherto-unknown ELF3 interactors, our data suggest that the ELF3-polyQ tract can affect ELF3's interactions with other proteins. Moreover, polyQ variation appears to affect weaker ELF3-protein interactions; strong protein interactions (for example ELF3-LUL4, ELF3-PHYB) are robust to polyQ variation.
The contribution of STR variation to complex traits is thought to be considerable (Kashi et al. 1997; Press et al. 2014). Specifically, it has been proposed that STR variation contributes disproportionately to the epistatic term of genetic variance, due to its potential to contribute compensatory mutations. However, the molecular mechanisms by which different STRs contribute to genetic variance should derive from their particular features. For instance, polyQ variation may be expected to affect protein interactions (Perutz et al. 1994; Schaefer et al. 2012) or the transactivation activity of affected proteins (Escher et al. 2000). In this study, we considered the case of the previously-described ELF3 STR (Undurraga et al. 2012).

We found that the genetic architecture of ELF3-dependent phenotypes is highly epistatic between the divergent Col and Ws strains, leading to substantial phenotypic transgression in the well-studied hypocotyl length trait. We identified at least 3 QTLs showing genetic interactions with the ELF3 STR in a Col x Ws cross. These QTLs generally did not coincide with obvious candidate genes known to affect ELF3 function.

Our confirmation of one genetic interaction (LSH9) in the Col background suggests that these QTLs encompass variants affecting hypocotyl length in tandem with ELF3 STR variation. We cannot formally exclude the hypothesis that variants linked to ELF3 (other than the Col and Ws ELF3-STR variants) may contribute to the observed phenotypic variation. For example, the ELF3-A362V substitution in the A. thaliana strain Sha affects ELF3 function in the circadian clock (this site is invariant between Col and Ws) (Anwer et al. 2014). However, our previous work demonstrated that ELF3-STR variation suffices to produce strong phenotypic incompatibility between the Col and Ws backgrounds (Undurraga et al. 2012). Therefore, we reason that ELF3-STR variation is the most parsimonious explanation for the phenotypic variation, and in particular the observed transgression in the Col x Ws cross.

We further used Y2H screening to explore whether ELF3 polyQ tract variation affects protein interactions. ELF3's promiscuous physical associations with other proteins are essential to its many functions in plant development (Liu et al. 2001; Kolmos et al. 2011; Nusinow et al. 2011; Herrero et al. 2012). Disruption of these interactions suggested an attractive mechanism by which ELF3 polyQ tract variation might affect ELF3 function. Assaying several known and novel ELF3-interacting proteins yielded evidence for a modest effect of polyQ variation on weaker protein interactions. However, there was no generic requirement for specific ELF3 polyQ tract lengths across all interactors. Indeed, the modest effects that we found were interactor-specific and thus not likely to generalize.

We did find that the ELF3-ELF4 interaction, which is crucial for circadian function and thus hypocotyl length (Nusinow et al. 2011; Herrero et al. 2012), demonstrates a subtle preference for the Ws 16Q ELF3 variant. However, there is no sequence variation in ELF4 between Col and Ws and we did not detect the ELF4 locus by QTL analysis, suggesting that this binding preference does not explain the transgressive phenotypes revealed by ELF3-STR variation (Figure 1A). Nonetheless, the subtle polyQ-dependence of the ELF3-ELF4 interaction may play a role through indirect interactions.

Alternatively, rather than modulating ELF3 function as an encoded polyQ tract, the ELF3-STR may affect ELF3 transcription or processing. A previous study of an intronic [...] acts as a 'robustness gene' (Lempe et al. 2013). The best-described example of such is the protein chaperone HSP90 (Rutherford and Lindquist 1998), whose multiple transient interactions with many proteins (about 10% of the yeast proteome; Zhao et al. 2005) lead to pleiotropic effects upon HSP90 inhibition or dysregulation (Sangster et al. 2007).

ELF3 has been previously proposed as a robustness gene (Jimenez-Gomez et al. 2011), consistent with its promiscuity in protein complexes and the pleiotropic nature of elf3 mutant phenotypes. Our finding that functional modulation of ELF3 by polyQ variation reveals several genetic interactors is consistent with this interpretation.

A similar hypothesis is that ELF3 "gates" robustness effects from robustness genes with which it interacts. For instance, we have recently shown that ELF3 function is epistatic to some of HSP90's pleiotropic phenotypic effects (unpublished data, M. Zisong, P. Rival, M. Press, C. Queitsch, S. Davis), and ELF4 has also been proposed as a robustness gene governing circadian rhythms and flowering (Lempe et al. 2013). Here, we show that polyQ variation affects ELF3-ELF4 binding, which would provide a mechanistic link between ELF3 polyQ effects and a known robustness gene.

These hypotheses remain speculative in the absence of more explicit tests. Nonetheless, we suggest that the pleiotropic effects of polyQ variation in ELF3 (or similar cases) may be better understood by considering ELF3 as a robustness gene, in which phenotypic effects are determined by a variety of important but individually small interactions of this highly connected epistatic hub.

ACKNOWLEDGMENTS

We thank Karla Schultz and Katie Uckele for technical assistance. We thank Choli Lee and Jay Shendure for assistance with high-throughput sequencing of the Col x Ws F2 population and members of the Shendure laboratory for advice regarding library preparation. We thank Amy Lanctot for generating the pGBK-ELF3-0Q and pGBK-ELF3-23Q constructs. We thank Daniel Melamed and Stanley Fields for guidance in carrying out Y2H experiments and the generous gift of yeast strains. We thank Giang Ong and Maitreya Dunham for access to the MiSeq instrument for resequencing the Ws genome. We thank Stanley Fields and Evan Eichler for access to LightCycler instruments. We thank Elhanan Borenstein and members of the Queitsch and Borenstein laboratories for helpful conversations. MOP was supported in part by National Human Genome Research Institute Interdisciplinary Training in Genome Sciences Grant 2T32HG35-16.
CQ is supported by National Institutes of Health New Innovator Award DP2OD008371.