Requirements for efficient cotranscriptional regulatory switching in designed variants of the Bacillus subtilis pbuE adenine-responsive riboswitch

Riboswitches, generally located in the 5'-leader of bacterial mRNAs, direct expression via a small molecule-dependent structural switch informing the transcriptional or translational machinery. While the structure and function of riboswitch effector-binding (aptamer) domains have been intensely studied, only recently have the requirements for efficient linkage between small molecule binding and the structural switch in the cellular and cotranscriptional context begun to be actively explored. To address this, we have performed a structure-guided mutagenic analysis of the B. subtilis pbuE adenine-responsive riboswitch, one of the simplest riboswitches containing a secondary structural switch. Using a cell-based fluorescent protein reporter assay to assess ligand-dependent regulatory activity in E. coli, these studies revealed previously unrecognized features of the riboswitch. Most importantly, it was found that local and long-range conformational dynamics in two regions of the aptamer domain have a significant effect upon efficient regulatory switching. Further, sequence features of the expression platform including the pre-aptamer leader sequence, a nucleation helix and a putative programmed pause have clear effects upon ligand-dependent regulation. Together, these data point to sequence and structural features distributed throughout the riboswitch required to strike a balance between rates of ligand binding, transcription and secondary structural switching via a strand exchange mechanism.


Introduction
Riboswitches are RNA elements that can adopt two distinct structures to direct expression of a message, depending upon the occupancy status of a small-molecule receptor domain (known as the aptamer domain) [1][2][3]. These structures generally inform transcription through differential formation of a rho-independent transcriptional terminator or translation by either occluding or exposing the ribosome binding site, although a diverse set of other mechanisms of expression regulation have been found [4][5][6]. This straightforward mechanism of ligand-dependent control of gene expression without the assistance of accessory proteins has made these devices attractive for use in various synthetic biology applications [7,8]. In part, the means to raise RNA sequences that bind with high affinity and specificity to a broad spectrum of small molecules using in vitro selection approaches and the modularity of RNA secondary structure make the design and implementation of such input-output devices conceptually simple [9,10].
However, engineering modular riboswitches and other RNA devices has been problematic, such that only a few robust regulatory elements have achieved real-world application [11]. This suggests that the basic understanding of how these RNAs mechanistically translate ligand occupancy of the aptamer domain into an observable response remains incomplete. One major hurdle in the robust design of riboswitches is that they generally only function in the context of transcription [1,12,13,14], requiring careful consideration of kinetic processes such as rates of ligand binding, higher-order RNA folding and secondary structural switching processes such as strand exchange [15]. Furthermore, the influence of RNA sequence on processes such as transcriptional pausing and participation of factors such as NusA or Rho on riboswitches in the cellular context further confounds their design and implementation [16][17][18]. It is likely that many of these processes are influenced by cryptic sequence elements difficult to engineer into an RNA of interest.
Natural riboswitches represent important model systems for understanding the sequence features that enable RNA devices to be engineered as efficient ligand-dependent genetic regulators that function within a broad spectrum of industrially useful bacteria [19]. One of the simplest riboswitches containing a discrete ligand-binding domain and secondary structural switching domain (otherwise known as the expression platform) is the Bacillus subtilis adenine-responsive pbuE regulatory element (Fig 1) [20], making it ideal for detailed analysis of coupling of ligand binding to regulatory activity [21]. The aptamer domain shares the same general three-dimensional architecture as all purine riboswitches with the ligand binding site embedded within the three-way junction, adjacent to the 3'-side of the P1 helix [22,23].
Overall organization of the aptamer is achieved by a loop-loop interaction (L2 and L3, Fig 1) distal to the junction and P1 helix whose formation is ligand-independent in some riboswitches such as the B. subtilis xpt-pbuX guanine-sensing riboswitch [24][25][26] and dependent on ligand binding in others such as the pbuE aptamer [27][28][29]. When ligand is bound, the aptamer including the P1 helix is stabilized against formation of an alternative structure enabling full transcription of the mRNA and expression of the encoded purine efflux pump under high adenine conditions [20].
Fig 1 caption (fragment): In this scheme, cyan represents uridine, green represents adenine, red represents guanine, and purple represents cytosine; residue numbering reflects the experimentally determined start site of transcription [30]. (B) Secondary structure of the pbuE riboswitch in the unbound state (OFF) with the intrinsic terminator formed at the expense of the P1 and P3 helices.
Single molecule analysis of the adenine-dependent folding of the pbuE riboswitch in the context of transcription has suggested a mechanism for how ligand binding to the aptamer domain prevents formation of the competing intrinsic terminator element [31].
In this model, during transcription the aptamer hierarchically and rapidly folds into a productive conformation by the time the polymerase reaches a hexa-uridine tract found at the 3'-side of the nucleator element. The nucleator element caps the terminator element and its structure is independent of ligand binding (Fig 1). Shortly thereafter, sequence on the 3'-side of the helix begins to strand exchange through the P1 helix. In the absence of ligand, this migration rapidly continues through J3/1 and P3 to form the functional terminator helix that prompts the RNA polymerase to disengage from RNA synthesis, and thereby gene expression is repressed. However, in the presence of ligand, strand exchange is impeded by ligand-dependent organization of the three-way junction and formation of base triples involving J2/3 and the two junction-proximal base pairs of P1. This set of triples is proposed to act as a kinetic roadblock to further strand exchange into the junction and P3, providing the RNA polymerase sufficient time to synthesize the message past the intrinsic terminator's poly-uridine tract and escape the riboswitch. Together with other biochemical and biophysical studies, there is compelling evidence that the pbuE riboswitch is a "kinetically controlled" switch such that it does not reach thermodynamic equilibrium with respect to adenine binding during the timeframe of transcription of the leader sequence [12,31]. Ultimately, the terminator form of the riboswitch represents the most thermodynamically stable state of RNA and the riboswitch adopts this structure regardless of ligand concentration [32].
Despite serving as a model system for understanding the physical basis for the coupling of binding and regulation, the pbuE riboswitch has not been extensively investigated in the cellular context. Because many riboswitches only function in the context of transcription, understanding the sequence and structural requirements of the expression platform conferring ligand-dependent activity requires a cell-based reporter assay. Towards this end, a robust fluorescent protein reporter assay has been established for the pbuE riboswitch using the adenine analog 2-aminopurine (2AP) as the ligand [33]. This study revealed several facets of the expression platform that influence ligand-dependent regulatory activity. The most important finding was that the riboswitch is highly tolerant to significant variation in the length of the P1 helix. This contrasts with the results of biophysical studies, which proposed that P1 helix length and associated thermodynamic stability are critically important for proper switching. Instead, this result supports a low-energy strand displacement mechanism that governs formation of the riboswitch's alternative secondary structures [33]. Second, these data suggested that a ligand-independent secondary structure in the expression platform that we call the "nucleator element" plays an important role in regulatory activity. This indicates that sequences and secondary structures within the expression platform influence the regulatory response.
In the current study, we seek to further define regions of the pbuE riboswitch that influence the magnitude of the ligand-dependent regulatory activity using a structure-guided mutagenic approach. To assess whether the aptamer domain has features beyond the conserved ligand binding site that are required for efficient secondary structural switching, a set of B. subtilis purine-responsive aptamers was spliced into the pbuE expression platform to create chimeras homologous to the native pbuE riboswitch.
Despite very high sequence and structural similarities, these designed chimeras fail to function in E. coli. Regulatory activity is restored by introducing point mutations in two regions that influence the conformational ensemble of the unliganded state, suggesting new roles for sequences in the aptamer domain. To further clarify structural effects on strand invasion dynamics, we examined the sequence of the P1 helix, the nucleator element, and a putative programmed pause. These mutants revealed altered regulatory properties that indicate the nucleator element and an adjacent six uridine tract are crucial for efficient ligand-dependent upregulation, while the sequence of the P1 helix is likely tuned to influence the rate of strand invasion immediately prior to the key regulatory roadblock. Together, these data reveal new aspects of how the riboswitch's activity is modulated by key sequence and structural elements and suggest new design strategies for the engineering of novel riboswitches.

Construction of reporter plasmids
For each riboswitch a set of overlapping DNA oligonucleotides was synthesized to construct the gene using recursive PCR [34]. Using standard molecular cloning techniques [35], each riboswitch variant (S1 Table) was cloned upstream of the gfpuv gene to regulate its expression in a ligand-dependent manner. The parental vector containing gfpuv is derived from a low-copy pBR327 plasmid [36] and contains the strong rrnB terminator upstream of a synthetic insulated promoter [37] of moderate strength to limit transcription of upstream genes. S1 Table provides

Cell-based fluorescence assays
E. coli K12 strain BW25113 (Keio knockout collection parental strain) [38] was transformed with sequence-verified reporter plasmid using standard protocols [35].
Transformants were plated on 2xYT growth medium agar plates containing 100 µg/mL carbenicillin for resistance marker selection. A single colony was picked and used to inoculate an overnight 2 mL culture of 2xYT containing 100 µg/mL of ampicillin and grown at 37 °C. A 20 µL volume of overnight culture was used to inoculate three individual 2 mL cultures of rich, chemically defined growth (CSB) medium [36] containing 100 µg/mL of ampicillin. Three additional 2 mL CSB plus ampicillin cultures containing 1 mM 2-aminopurine were also inoculated with saturated overnight cultures in a 1:100 dilution. These secondary cultures were grown to an OD 600 between 0.  Table).
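The assay compares reporter signal from triplicate cultures grown with and without 2AP. The sketch below shows one way a fold induction could be computed from such replicates. It assumes, since the surviving text does not state it, that fold induction is the ratio of the mean signal with ligand to the mean signal without ligand; the function name and the numbers are hypothetical placeholders, not measurements from this study.

```python
from statistics import mean


def fold_induction(with_ligand, without_ligand):
    """Ratio of mean reporter signal with ligand to mean signal without ligand.

    Assumption: both lists hold background-corrected, cell-density-normalized
    fluorescence values from replicate cultures.
    """
    if not with_ligand or not without_ligand:
        raise ValueError("each condition needs at least one replicate")
    baseline = mean(without_ligand)
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    return mean(with_ligand) / baseline


# Made-up placeholder values for three replicate cultures per condition.
plus_2ap = [400.0, 420.0, 380.0]
minus_2ap = [100.0, 95.0, 105.0]
print(fold_induction(plus_2ap, minus_2ap))
```

A ratio of means is only one convention; propagating the replicate error into the ratio would be a natural extension.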

Co-transcriptional RNA folding simulations
The co-transcriptional folding landscapes of select mutant riboswitches were modeled using Kinefold (http://kinefold.curie.fr; [39]). Riboswitch RNA sequences starting at the transcription initiation site and ending after the poly-uridine tract of the rho-independent transcriptional terminator were used in the simulation, with the transcriptional speed set to one new nucleotide (nt) added every 20 milliseconds (50 nt sec⁻¹) and pseudoknots disallowed. Helical tracing graphs were obtained from these simulations as well as folding movies showing the lowest free energy structure with the addition of each base. The resultant folding trajectories allow for visual determination of persisting secondary structures that are not reported in the helical folding graphs.
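To make the simulated time scale concrete, the following sketch converts the elongation step used in the simulations (one nucleotide every 20 ms) into a rate and into the time needed to synthesize a given stretch of RNA. It assumes a constant rate with no pausing, and the function names are illustrative; it is not part of Kinefold.

```python
MS_PER_NT = 20  # elongation step from the text: one nucleotide every 20 ms


def elongation_rate_nt_per_s(ms_per_nt=MS_PER_NT):
    """Convert a per-nucleotide step time (ms) into a rate in nt per second."""
    if ms_per_nt <= 0:
        raise ValueError("step time must be positive")
    return 1000 / ms_per_nt


def time_to_transcribe_s(n_nucleotides, ms_per_nt=MS_PER_NT):
    """Seconds needed to add n nucleotides at a constant rate (no pausing)."""
    if n_nucleotides < 0:
        raise ValueError("nucleotide count cannot be negative")
    return n_nucleotides * ms_per_nt / 1000


print(elongation_rate_nt_per_s())   # rate in nt per second
print(time_to_transcribe_s(100))    # seconds to synthesize 100 nt
```

At this rate a 100 nt stretch is synthesized in 2 seconds, which gives a sense of how short the window for ligand binding is in the absence of pausing.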

Other purine riboswitch aptamer domains can drive the pbuE expression platform
It has been previously hypothesized that expression platforms are modular and capable of hosting different aptamers to create "chimeric" riboswitches, which has been shown experimentally for both "OFF" and "ON" switches using a single-turnover in vitro transcription assay [11,36,40]. To determine whether the pbuE structural switch can be regulated by other purine riboswitch aptamers, a conservative chimeric riboswitch was designed that fused the B. subtilis xpt-pbuX guanine/hypoxanthine-responsive riboswitch aptamer domain [22] with the B. subtilis pbuE adenine-responsive riboswitch expression platform (Fig 2a). For these studies, the C74U variant of the xpt aptamer was used that switches its specificity to adenine/2-aminopurine, which has been extensively used to study this aptamer and does not alter the properties of this RNA [41][42][43]. Note that while only P2 and P3 are boxed in Fig 2a, the junction-proximal two base pairs of P1 as well as all nucleotides in J1/2 and J3/1 are identical between the xpt(C74U) and pbuE aptamers as well as seven out of eight nucleotides in J2/3. The only nucleotide difference in J2/3 is G56 of pbuE, the equivalent position being uridine in xpt. Thus, the primary differences in these aptamer domains lie in P2/L2 and P3/L3. Since the P3 sequence is different between these two aptamers, compensatory changes were made in the expression platform of the xpt(C74U)/pbuE chimera to ensure formation of a stable transcriptional terminator hairpin that is the same length as that of the wild type pbuE riboswitch (Fig 2a). As the secondary and tertiary structure of all purine riboswitches is highly conserved, we hypothesized that this conservative aptamer substitution in the chimera (xpt/pbuE A) would robustly turn on reporter protein expression in the presence of 2-aminopurine (2AP). 
However, compared to the wild type pbuE riboswitch, the chimera exhibited no activation of gene expression, with low reporter expression in both the presence and absence of 2AP indicating strong constitutive termination (Fig 2B). To understand why this chimeric RNA fails to regulate gene expression, an extensive mutational analysis of the aptamer domain was performed in which elements of the xpt(C74U) P2 and P3 were systematically substituted back for the pbuE sequences. While most of these variants failed to rescue regulatory activity (data not shown), it was found that two specific point substitutions made in the xpt(C74U) aptamer resulted in robust 2AP-dependent gene regulation (Fig 2B). The first point substitution was in a region of the aptamer referred to as the P2 "tune box", which comprises the junction-proximal two base pairs in the P2 helix along with the last nucleotide of J1/2 (Fig 1A) [43]. Natural variation in this region of the purine riboswitch aptamer has a significant effect on its kinetic and thermodynamic ligand binding properties. It was hypothesized based upon chemical probing data that this region affects the organization of the three-way junction in the absence of effector ligand [43]. However, introducing a tune box sequence identical to that of pbuE into the xpt(C74U)/pbuE chimera (xpt/pbuE C) alone was insufficient to restore switching ability (Fig 2B).
The second region that affects regulatory activity of the xpt(C74U)/pbuE chimera is the base pair in P3 proximal to L3. To achieve the "OFF" state, the pbuE terminator must invade into L3 and disrupt the stable L2-L3 interaction in the process. Single molecule and fluorescence lifetime studies demonstrated that while the xpt L2-L3 interaction is stable in the absence of ligand [25,26,29], the equivalent interaction in pbuE is only transiently stable in the absence of ligand binding [27,28]. Further, introduction of an A-A mismatch as seen in pbuE at the base of the xpt L3 destabilizes the L2-L3 interaction [24]. Introducing a point substitution into the xpt(C74U)/pbuE hybrid to yield an A-A mismatch (xpt/pbuE B) was insufficient to restore regulation (Fig 2B). In contrast, simultaneously introducing the pbuE P2 tune box and L2-L3 destabilization element into the hybrid (xpt/pbuE D) successfully restored switching activity to that of wild type with decreased leaky gene expression in the absence of ligand as compared to wild type (Fig 2B). Furthermore, simply repairing the mismatched base pair in the xpt aptamer domain P2 to a Watson-Crick pair (xpt/pbuE E) also restores switching performance and improves leakiness when introduced in concert with the distal A-A mismatch. These data strongly suggest the conformational dynamics of the aptamer and associated crosstalk between the L2-L3 interaction and the three-way junction in the unliganded state are critical for establishing a strong ligand-dependent regulatory response.
To determine whether these two features of the xpt aptamer also influence the ability of other guanine-binding aptamers to regulate the pbuE structural switch, we investigated several other chimeras. Using the same strategy as with xpt to create a 2AP-responsive chimera, the B. subtilis yxjA (S5 Fig) and purE (S4 Fig) guanine riboswitch aptamer domains [44] were placed into the wild type pbuE context. The wild type purE riboswitch aptamer domain lacks both the P3 A-A mismatch and a fully base paired P2 tune box, leading to the design of a "repaired" purE with the same mutations as the optimal xpt/pbuE E construct (xpt A-A P3 plus paired tune box). Consistent with xpt(C74U)/pbuE, the purE(C74U)/pbuE hybrid did not show any switching activity (Fig 2C) and gene expression in the presence or absence of 2AP was low. The repaired purE(C74U)/pbuE hybrid showed improved termination in the absence of 2AP and increased aptamer stability in the presence of 2AP with a 4-fold induction of reporter expression (Fig 2C). Switching performance was not as effectively repaired in the case of purE(C74U)/pbuE compared to xpt(C74U)/pbuE, but a significant restoration of regulatory activity was still observed.
Unlike the purE guanine-sensing aptamer, the wild type yxjA aptamer contains a P3 A-A mismatch at the base of L3 as well as a P2 tune box with two Watson-Crick pairs like the pbuE aptamer. Thus, this aptamer was anticipated to work with the pbuE expression platform and yield effective switching without altering its sequence.
Surprisingly, the yxjA(C74U)/pbuE chimera displayed little 2AP-dependent induction of expression (Fig 2C). A variant of yxjA containing the tune box sequence identical to the pbuE aptamer was also examined; this repaired yxjA(C74U)/pbuE chimera showed only modest switching activity, though expression increased in both the presence and absence of 2AP relative to the unrepaired chimera. Together, these data suggest that two elements within the P2 and P3 helices of the purine aptamer domain that often harbor base mismatches, while not directly involved in ligand binding, are nonetheless critical for establishing a regulatory response. However, it is clear there remain other still unidentified features in the aptamer domain that are important for regulatory activity.

The P1 helix has a modest influence on regulatory activity
The above data reveal that sequence features in P2 and P3 have a direct impact upon regulatory activity. It is also likely that the sequence of the P1 helix, a critical player in the regulatory switch, has a significant role in promoting the ligand-dependent structural switch. In a prior study, it was hypothesized that the sole G-C base pair in pbuE P1 represents an important road block to rapid invasion through P1, giving the aptamer sufficient time to bind effector [45]. To directly test this hypothesis, mutations were made to P1 that either replace the G-C pair with an A-U pair or add G-C pairs to the junction-distal region of the P1 helix (Fig 3A). Mutations designed to weaken the P1 helix were anticipated to decrease expression in the absence of ligand due to the increased ease of strand invasion by the expression platform through P1 while expression in the presence of ligand would decrease due to destabilized aptamer folding. When the P1 helix is weakened with an A-U Watson-Crick or G-U wobble pair (P1-AU, P1-GU), a small increase in fluorescence relative to wild type was observed in the absence of ligand along with more significant decreases in fluorescence relative to wild type in the presence of ligand, in support of this hypothesis (Fig 3B).
Conversely, adding G-C pairs to P1 was expected to decrease expression in the absence of ligand as strand invasion through P1 is impaired. Given this expectation, it was unanticipated that increasing the G-C content of the P1 helix (P1-GC2a, -GC2b and -GC3) results in only slightly less expression in the absence of ligand than wild type while expression in the presence of ligand gradually decreased. The most marked decrease in regulatory activity occurs in the P1-GC3 variant, whose levels of expression are low regardless of 2AP and whose overall fold induction is below 2.

Altering the pre-aptamer leader sequence unmasks a strong influence of P1 sequence on regulatory activity
While the above results reveal a modest influence of the sequence composition of the P1 helix on activity, the overall trends were not clear. In particular, it was surprising that P1-GC2b and -GC3 did not show strong expression of the reporter in the absence of ligand, suggesting that the aptamer domain was somehow debilitated despite stabilizing the P1 helix. One possibility is that mutations in P1 resulted in alternative structure(s) that disrupt formation of a functional aptamer or terminator. To explore this, a subset of sequences was analyzed by Kinefold, a program that simulates co-transcriptional folding pathways [39]. Analysis of the wild type pbuE riboswitch used in this study reveals a folding pathway in which P2 and P3 fold, followed by a brief formation of P1 that is then disrupted by folding of the terminator, consistent with experimentally derived models of the pbuE folding pathway [31,45]. However, in both the P1-GC2b and P1-GC3 variants, the helical formation traces revealed the appearance of alternate pairing schemes involving the 5'-side of P1 and the pre-aptamer leader sequence, indicating alternative folds disruptive to the P1 helix and aptamer formation.
With this hypothesis in mind, these two variants were redesigned to minimize alternative pairing. Using Kinefold simulations, the pre-aptamer sequence was altered to prevent alternative pairing with P1 helical elements (P1-GC2b,rep and P1-GC3,rep).
These "repair" variants (Fig 4A) show different regulatory properties. Repairing the mismatched folding in P1-GC2b was sufficient to restore switching to the level of wild type pbuE. Notably, only a few point substitutions of the pre-aptamer sequence that destabilize an alternative helix are sufficient to restore activity (Fig 4B). Conversely, repair of P1-GC3 either by point substitutions or a larger deletion did not greatly improve switching.
The above data indicate that the pre-aptamer leader sequence can confound mutagenic analysis of riboswitch function via the formation of unanticipated alternative structure. To minimize this issue, a new set of P1 mutants was designed in which the pre-aptamer sequence was almost completely removed through a deletion of nucleotides 1-27 (Fig 1). This deletion in the context of the wild type riboswitch did not substantially alter its regulatory properties, although a small increase in the 2AP-dependent reporter gene expression was noted (S3 Fig). In the absence of the pre-aptamer sequence (Fig 4C), a clear trend in the function of the P1 helix variations was revealed (Fig 4D). Variations that weakened the P1 helix, (∆27)P1-AU and (∆27)P1-GU, showed no or very little ability to activate gene expression in the presence of 2AP. Notably, (∆27)P1-AU displayed a dramatic decrease in termination ability, suggesting that the terminator does not efficiently invade through the aptamer even in the absence of ligand binding. In contrast, (∆27)P1-GU is able to terminate, but not as well as other variants, suggesting that the weakened P1 helix (and weakened terminator helix in the complementary region) does not promote full strand exchange on the timescale that enables the regulatory switch. Variants that stabilized the P1 helix by adding another G-C base pair ((∆27)P1-GC2a and (∆27)P1-GC2b) display expression levels and fold induction comparable to wild type, indicating that the riboswitch accommodates a more G-C rich P1 helix.
However, further addition of a third G-C pair to the P1 helix ((∆27)P1-GC3) yields an RNA that is constitutively terminated, indicating that the G-C rich P1 helix likely supports highly efficient strand exchange to form the terminator helix.
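The P1 variants above differ in how many G-C, A-U and G-U pairs the helix contains. As an illustration only, the sketch below classifies the pairs of a helix from its two strands. The strands in the example are hypothetical and are not the pbuE P1 sequence, and the function names are not taken from any tool used in this study.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(base_5p, base_3p):
    """Label a pair of RNA bases as watson-crick, wobble, or mismatch."""
    pair = (base_5p.upper(), base_3p.upper())
    if pair in WATSON_CRICK:
        return "watson-crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def helix_composition(strand_5p, strand_3p):
    """Count pair types in a helix.

    Both strands are written 5'->3', so position i of the first strand
    pairs with position -1-i of the second strand.
    """
    if len(strand_5p) != len(strand_3p):
        raise ValueError("strands must be the same length")
    counts = {"GC": 0, "AU": 0, "GU": 0, "mismatch": 0}
    for b5, b3 in zip(strand_5p.upper(), reversed(strand_3p.upper())):
        kind = classify_pair(b5, b3)
        if kind == "watson-crick":
            key = "GC" if {b5, b3} == {"G", "C"} else "AU"
        elif kind == "wobble":
            key = "GU"
        else:
            key = "mismatch"
        counts[key] += 1
    return counts


# Hypothetical four-base-pair helix, not the pbuE P1 sequence.
print(helix_composition("GACU", "AGUC"))
```

Counting pairs in this way captures composition only; it says nothing about stacking context or the kinetics of strand exchange.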

The nucleator element is critical for secondary structural switching
In a strand exchange model of ligand-dependent secondary structural switching, the terminator helix must rapidly nucleate to begin the process of strand exchange with the P1 helix of the aptamer domain. It has been observed that riboswitch expression platforms often contain a hairpin element that can be formed in either conformational state that is proposed to serve as a nucleation element (Fig 1A) for the secondary structure that competes with the P1 helix [33,46]. Like related toehold-mediated strand displacement processes [47,48], the kinetics of invasion is likely related to the length and/or base pair composition of the nucleator helix. To examine the importance of nucleator stem-loop length on regulatory activity, the wild type nucleator element was replaced with a set of designed nucleator helices of varying lengths capped with a stable GAAA tetraloop (Fig 5A). This series spans from 0 to 8 Watson-Crick base pairs in length (note that the wild type nucleator helix is 8 base pairs including two U-U mismatches), with their predicted free energies ranging from 0 to -17.2 kcal/mol. This series of RNAs displays a trend towards increased fold induction in the presence of 2AP with one clear exception (Fig 5B). As the length of the nucleator helix increases from 0 to 6 base pairs, the fold induction increases from 2- to over 4-fold, indicating that increasing helix length has a positive effect on the regulatory activity. NH-6 has the most similar expression levels of GFP reporter in the absence and presence of 2AP to the wild type riboswitch, although the overall fold induction is lower. Further decreases in the length of the nucleator element yield systematically lower levels of fold induction (~2-3-fold). Surprisingly, regulatory activity is not completely abolished in the NH-0 variant, indicating that the nucleator element is not essential for activating gene expression, albeit at low levels.
Notably, the most conservative variant, NH-8, that preserves the overall length of the nucleator helix shows fold induction close to that of the 0 base pair variant, NH-0. This suggests that while the nucleator element's length contributes to promoting ligand-dependent regulatory activity, its effect is potentially obscured by perturbing other key elements in this region of the expression platform.
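The nucleator length series consists of fully paired stems of 0 to 8 base pairs capped with a GAAA tetraloop. A minimal sketch of how such a hairpin sequence can be assembled from its 5' stem strand is given below. The stem sequence in the example is hypothetical; the actual NH-series sequences are listed in S1 Table and are not reproduced here.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def build_hairpin(stem_5p, loop="GAAA"):
    """Assemble 5' stem strand + loop + complementary 3' stem strand."""
    stem = stem_5p.upper()
    return stem + loop + reverse_complement(stem)


# Hypothetical 4 bp stem; an empty stem leaves only the tetraloop.
print(build_hairpin("GCAU"))
print(build_hairpin(""))
```

A stem of n base pairs yields a hairpin of 2n + 4 nucleotides with the default tetraloop, so a design series of this kind changes both helix stability and the spacing of downstream elements.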
A second feature of the nucleator element that may be important for efficient regulation is a potential "programmed pause" at the 3'-side of the helix. It is well established that many riboswitches contain stretches of uridines in their expression platforms that represent RNA polymerase pause sites that give more time for events such as RNA folding and ligand binding to occur [13,17,49,50]. In the pbuE riboswitch, a hexa-uridine tract is located at the 3'-side of the nucleator element that could serve as a pause site [13]. This uridine tract has been observed to be a pause site in vitro using E. coli and B. subtilis RNAP [30], but was not identified as a pause site in a genome-wide survey of transcriptional pausing in B. subtilis [51].
Other than wild type, none of the variants in the nucleation length series likely support transcriptional pausing at the uridine tract on the 3'-side of the nucleator element because the hexa-uridine tract at nucleotides 103-108 at the 3'-side of the nucleator helix has been altered. To determine the importance of this pause, a variant of NH-8 was tested in which two G-C pairs were converted to A-U pairs, thereby restoring the third and fourth uridines of the tract (NH-8 U) (Fig 6A). This change resulted in complete recovery of 2AP-dependent regulatory activity with a fold induction that exceeded wild type (Fig 6B). Without the pause site, it is likely that the terminator rapidly invades through P1 regardless of the presence of effector because the aptamer is not given sufficient time to bind ligand. If the stability of the nucleator helix were the only determinant of its influence on regulatory activity, then NH-8 should perform equally well as this variant. To confirm this hypothesis, a number of variants were designed that preserve the six uridines on the 3'-side, including a variant with a five base pair helix (NH-5 U) and variants of NH-4 and NH-0 from the original nucleator length series (NH-4 U, NH-0 U) (Fig 6C). NH-5 U was found to have better termination efficiency and switching ability than NH-6, NH-4, NH-2, NH-0 (Fig 6D) and NH-5 without the six uridine pause (Fig 6E), while NH-4 U and NH-0 U both outperform their hexauridine pause-deficient counterparts, further suggesting that the uridine tract is critical for modulating the temporal window of transcription to give key events more time to occur.
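Whether a designed variant retains an uninterrupted hexa-uridine tract can be checked directly from its sequence. The sketch below reports every run of at least six consecutive uridines using 1-based, inclusive coordinates. The example sequences are hypothetical, and the presence of such a tract is only a sequence feature; it does not by itself establish that RNA polymerase pauses there.

```python
import re


def find_uridine_tracts(rna, min_length=6):
    """Return (start, end) of each run of >= min_length uridines.

    Coordinates are 1-based and inclusive. DNA-style input is accepted
    by treating T as U.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    sequence = rna.upper().replace("T", "U")
    pattern = re.compile(r"U{%d,}" % min_length)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


# Hypothetical sequences: one with an intact tract, one with it interrupted.
print(find_uridine_tracts("GGCAUUUUUUGCA"))
print(find_uridine_tracts("GGCAUUGCUUGCA"))
```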
To explore whether the position of the hexa-uridine tract is crucial for efficient regulation, variants of NH-8, NH-8 U, and NH-5 containing a six uridine loop in place of the GNRA tetraloop were designed. In the case of NH-5 6U Loop, the performance of the riboswitch was equivalent to that of wild type and NH-5 (Fig 5G). NH-8 U 6U Loop (containing a hexauridine loop as well as a hexauridine tract) performed similarly to wild type but did not achieve the same performance as NH-8 U, suggesting some impairment by the loop substitution. However, NH-8 6U Loop with only a hexauridine tract in the terminal loop (equivalent to NH-5 6U) displayed minimal switching ability (Fig 5D).
Together, these data further indicate that the pause is important for aptamer binding rather than nucleation of the competing terminator element; however, the exact mechanism of kinetic control via the hexauridine pause remains unclear. In an attempt to assimilate the various advantageous regulatory features observed in this study, a riboswitch was designed with no pre-aptamer sequence and a five base pair nucleator helix with a six-uridine pause on the 3' side. This construct outperformed wild type and all other variants tested in this study with a fold induction near 25-fold (Fig 6E). The ability to pare down the expression platform to only the required elements with great success further reinforces the validity of approaches that exploit natural expression platforms as hosts for alternative aptamers to create novel riboswitches.

Discussion
While the structure, conformational dynamics and ligand-binding properties of isolated riboswitch aptamer domains have been extensively studied, how these properties direct the expression platform to affect gene regulation remains poorly understood. In this current study, we have used a structure-guided mutagenesis strategy in concert with a cell-based fluorescent protein reporter to examine how sequence elements in both the aptamer and the expression platform influence the regulatory activity of the B. subtilis pbuE adenine riboswitch. These data reveal that many sequence elements outside of the core ligand-binding site and secondary structural switch play a significant role in establishing ligand-dependent reporter gene expression. Unexpectedly, establishing regulatory function in the xpt(C74U)/pbuE chimera required two point substitutions that are located ~21 Å from each other in the structure of the xpt aptamer. Individually, each point substitution does not rescue activity, suggesting that the point substitutions are required to either suppress two distinct defects in the aptamer domain or re-establish cross-talk between the loop-loop interaction and the three-way junction. If the latter is the case, this reveals that the conformational dynamics of the three-way junction and the L2-L3 interaction are interlinked. The linkage between L2-L3 dynamics and adenine binding has been previously observed in single-molecule and fluorescence lifetime measurements of the Vibrio vulnificus add adenine riboswitch [27][28][29]. As the importance of the L3 closing pair and the P2 tune box on regulatory activity was further validated using chimeras with B. subtilis yxjA(C74U) and purE(C74U) aptamer domains, we propose that this linkage is a universal and essential feature of guanine and adenine riboswitches. Furthermore, since the wild type xpt(C74U) aptamer binds 2AP with high affinity, as established in previous work [43], it is very likely that these point substitutions are required to restore some aspect of the linkage between ligand binding and the regulatory switch. This opens the intriguing possibility that aptamer dynamics in the apo state are not only crucial for ligand binding but also for efficient structural switching.

Base pairs proximal to interhelical regions play a significant role in establishing regulatory activity
It is important to note that these restorative point substitutions reside not in noncanonical features but within adjacent helical elements. This reveals that the sequence elements at the termini of the helical elements play a significant role in RNA function. In an analysis of the base pairing patterns in the 50S ribosomal RNA, it was observed that the ends of helices are enriched in non-canonical pairs relative to the interior of helices (6.3% versus 1.1%) [52]. Base pairs in helices flanking junctions were found to play an important role in both local and global conformational dynamics of RNA [53]. In a recent example, the two base pairs flanking an internal bulge motif were found to influence ligand selectivity by a cobalamin riboswitch [54]. Notably, one of these pairs is a noncanonical A-C pair, as observed in the ligand-bound crystal structure [55]. Together, these data suggest that helical pairs flanking interhelical regions, whether Watson-Crick or non-canonical, should be considered as part of the junction or loop module. For example, the GNRA and UNYG tetraloop motifs are actually two-nucleotide loops flanked by a non-canonical base pair (G-A or U-G, respectively) [56]. Further, the Watson-Crick pair proximal to the tetraloop is often a C-G pair; it is considered to be part of the extended tetraloop element and serves to impart significant thermostability to the loop [57]. This has implications for the design of novel RNAs using a modular design approach with recurrent tertiary structural motifs [58]. Rather than considering just the module, such as a terminal loop (for example, a tetraloop or T-loop) or internal loop (such as a kink-turn), flanking helical regions should be considered as part of the module.
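To illustrate what treating the flanking pair as part of the module might look like in practice, the following minimal Python sketch scans a sequence for GNRA or UNYG tetraloops together with the Watson-Crick pair that closes them and returns the six-nucleotide extended element. The function name, the motif table and the example hairpin are assumptions made for illustration only; they are not sequences or tools used in this study, and a match at the sequence level does not establish that the hairpin actually forms.

```python
import re

# IUPAC codes in the motif names: N = any base, R = purine (A/G), Y = pyrimidine (C/U).
TETRALOOPS = {
    "GNRA": "G[ACGU][AG]A",
    "UNYG": "U[ACGU][CU]G",
}

# Watson-Crick partners, used to check the pair that closes the loop.
WC = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def find_extended_tetraloops(seq):
    """Return (position, motif, 6-nt module) for each tetraloop closed by a Watson-Crick pair."""
    seq = seq.upper().replace("T", "U")
    hits = []
    for name, pattern in TETRALOOPS.items():
        # A lookahead is used so that overlapping candidates are all examined.
        for m in re.finditer(f"(?=([ACGU])({pattern})([ACGU]))", seq):
            five, loop, three = m.group(1), m.group(2), m.group(3)
            if (five, three) in WC:
                hits.append((m.start(), name, five + loop + three))
    return sorted(hits)

# Hypothetical hairpin for illustration; not a sequence from this study.
example = "GGGACGAAAGUCCC"
print(find_extended_tetraloops(example))
```

Returning the closing pair together with the loop keeps the unit of design at the level of the extended element rather than the four loop nucleotides alone.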

Efficient strand exchange is strongly dependent on the sequence of the P1 helix
Another helical structural element whose sequence composition has been proposed to be important for efficient ligand-dependent regulatory switching is the P1 helix. While the two base pairs proximal to the three-way junction are critical for ligand binding due to long-range interactions with bases in J2/3, the other base pairs in P1 are not directly involved in ligand recognition. Yet, phylogenetic alignments of purine riboswitches reveal a marked preference for either specific base pairs or R-Y versus Y-R orientation [19,59]. Since these base pairs do not participate directly in ligand binding or ligand-induced RNA structure, the most likely explanation for sequence preferences in this region is to facilitate strand exchange between the P1 helix and an alternative helix in the expression platform.
The initial set of P1 sequence variants designed to test this hypothesis revealed no clear correlation between sequence and regulatory activity (Fig. 3B). A cotranscriptional analysis of several variants suggested that misfolding involving the pre-aptamer leader sequence and the aptamer can occur. This was validated for one variant (P1-GC2b), in which mutations introduced into the pre-aptamer sequence to disrupt this alternative fold rescued regulatory activity. Alternative structures involving the pre-aptamer sequence have been observed in the V. vulnificus add adenine riboswitch, in which alternative secondary structure was proposed to be a critical feature of regulation at different temperatures [60]. To resolve this issue, the pre-aptamer leader sequence, which has been previously shown to have only slight effects on performance of the switch [33], was completely removed. These observations reinforce that the pre-aptamer sequence, while generally ignored, can nonetheless influence the regulatory activity of a riboswitch.
The sequence composition of the three base pairs at the 5'-side of the P1 helix has a very strong influence over secondary structural switching. Surprisingly, conversion of the G-C pair to an A-U pair strongly impaired the ability of the terminator to form, even in the absence of 2AP, while three G-C pairs drove strong termination both in the presence and absence of 2AP. This suggests that the progress of strand exchange through the P1 helix by the competing terminator helix is less efficient in the context of weaker A-U pairs than with stronger G-C pairs. In contrast, robust regulatory activity is supported by either one or two G-C pairs in P1. Together, these results support a hypothesis that sequence conservation patterns in the P1 helix of many riboswitches are directly related to facilitating a rapid strand exchange process, a key component of the kinetically controlled regulatory mechanism.

Nucleation of the intrinsic terminator
Many riboswitches have secondary structural elements in the expression platform whose formation is independent of ligand binding to the aptamer. These elements presumably facilitate the switching function of the expression platform, such as by promoting rapid nucleation of alternative secondary structure. These features have previously been shown to be required for efficient regulatory activity in the lysine riboswitch [61] and a purine riboswitch [33]. In the latter study on the pbuE riboswitch, a limited set of variants that strengthened the terminator helix was tested, demonstrating that stabilizing the wild type nucleator element facilitated termination.
In this study, we designed a more systematic set of nucleation elements that explored both length and sequence composition of the nucleator. Unexpectedly, the length of the stem of the nucleator hairpin had only a moderate effect on the degree of activation of reporter expression. The stem length of the nucleator helix is clearly not the only structural feature in the expression platform that optimizes performance. An element as short as four base pairs supports near wild type levels of regulatory activity when a hexa-uridine tract is included in or directly following the antiterminator helix. However, even the absence of a helix supported minimal upregulation, indicating that the presence of the nucleator is an important, but not essential, aspect of ligand-dependent secondary structural switching.
These data strongly support the role of a poly-uridine RNA polymerase pause site as a significant component of the regulatory switch. This finding is best exemplified by comparison of the NH-8 and NH-8 U variants. While an eight base pair nucleator helix shows strong switching performance (NH-8 U), conversion of two A-U pairs to G-C pairs, which disrupts the 3'-uridine tract, resulted in an almost complete loss of activity despite further thermodynamically stabilizing the intrinsic terminator. The same effect is observed with a five base pair nucleator helix: little to no switching activity occurs in the absence of a six-uridine pause, yet activity is fully recovered upon inclusion of the pause. Furthermore, all nucleator helix variants that preserve the stretch of uridine residues on the 3' side of the helix showed strong ligand-dependent activation of reporter expression, as seen in variants NH-8 U, NH-4 U, and NH-0 U.
The role of the pause sequence appears to be to increase the time window for ligand binding rather than to promote nucleation of the terminator helix. In the wild type pbuE expression platform, the hexa-uridine sequence occurs at a site where it could affect either the timeframe of ligand binding or nucleation of the terminator helix. Since hairpin nucleation is very rapid, on the order of microseconds [62,63], it is likely that the role of this pause is to increase the binding time for the aptamer, as has been proposed for the B. subtilis ribD FMN riboswitch [13]. This is supported by the observation that the uridine tract could be moved to the loop of the nucleator helix, where it can only serve to increase the time window for aptamer folding and ligand binding, and still support regulatory activity. This also indicates that the precise positioning of poly-uridine tracts in the expression platform is not essential, as long as the tract is in a position that lengthens the timeframe of ligand binding prior to the onset of the strand invasion process.
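The time-window argument can be made concrete with a toy calculation. The sketch below assumes irreversible, pseudo-first-order ligand binding with ligand in excess, so that the fraction of aptamers bound after a window t is 1 - exp(-k_on [L] t), and it assumes that the window is the transcription time between completion of the aptamer and the commitment point plus any pause. This is a simplified model, not an analysis performed in this study, and every numerical value is a hypothetical placeholder rather than a measured parameter.

```python
import math

def binding_window(nt_to_commit, nt_per_s, pause_s=0.0):
    """Seconds available for binding: transcription time to the commitment point plus any pause."""
    return nt_to_commit / nt_per_s + pause_s

def fraction_bound(k_on, ligand_conc, window_s):
    """Fraction of aptamers bound after window_s, assuming irreversible pseudo-first-order binding."""
    return 1.0 - math.exp(-k_on * ligand_conc * window_s)

# All values below are hypothetical placeholders chosen only to illustrate the trend.
K_ON = 1e5        # M^-1 s^-1, assumed association rate constant
LIGAND = 1e-5     # M, assumed intracellular ligand concentration
NT_TO_COMMIT = 25 # nt, assumed distance from aptamer completion to commitment
RATE = 50.0       # nt/s, assumed transcription elongation rate
PAUSE = 2.0       # s, assumed lifetime of the poly-uridine pause

no_pause = fraction_bound(K_ON, LIGAND, binding_window(NT_TO_COMMIT, RATE))
with_pause = fraction_bound(K_ON, LIGAND, binding_window(NT_TO_COMMIT, RATE, PAUSE))
print(f"fraction bound without pause: {no_pause:.2f}")
print(f"fraction bound with pause:    {with_pause:.2f}")
```

Under these assumptions the pause matters most when the unpaused window is short relative to 1/(k_on [L]), which is the regime expected for a kinetically controlled switch.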

Prospects for the design of artificial riboswitches
The ability to engineer synthetic riboswitches that function in cells opens doors for many applications. Of particular interest is the use of riboswitches as biosensors [64] that can sense the presence of certain molecules and provide a visual readout to indicate their presence. By knowing the requirements for optimal riboswitch performance, specificity for a particular ligand can be easily toggled while expression can be tuned precisely. Another application of interest rests in engineering genetic pathways in organisms that perform beneficial reactions. For example, chimeric riboswitches using parts derived from natural riboswitches have been used to establish an inducible genetic pathway in nitrogen-fixing cyanobacteria, suggesting promise for controlling the varying abilities of cyanobacteria as well as other simple organisms [65].
While seemingly simple to design, artificial riboswitches are difficult to implement, particularly in the co-transcriptional and cellular context. In part, this is due to features of natural riboswitches beyond the aptamer and alternative secondary structure that contribute to efficient ligand-dependent regulation. In this work, we have revealed several features of the natural pbuE riboswitch that would be difficult to rationally consider in design using current tools. For example, P1 helix sequences that promote strand exchange on timescales compatible with the rates of ligand binding to the aptamer domain cannot be computationally predicted. Further, rational design can be easily frustrated by alternative structures involving sequences outside of the core riboswitch, making de novo design more difficult. Finally, conformational dynamics of the unbound state of the aptamer that influence rates of ligand binding and/or strand exchange cannot be accurately predicted or modeled. Adding to the above difficulties is the unanticipated interdependence of these elements, which can hamper rational design.
Instead of a pure design approach, this work suggests that a hybrid design and screening approach might be more successful in implementing novel riboswitches. In this approach, an "approximate" riboswitch is created using computational design strategies and "mix-and-match" aptamer domains and expression platforms. In a second step, an interdomain sequence or communication module is randomized and a library of variants expressed in cells is screened for individual sequences with optimal properties, as in the "dual-selection" approach [66]. However, this study reveals that multiple regions of the riboswitch may have to be screened to fully optimize performance in cells; merely optimizing a communication element that couples the aptamer domain to a structural switch is likely not sufficient to generate a robust regulatory device.
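One practical consequence of screening several regions at once is the growth in library size. As a rough sketch, fully randomizing n nucleotide positions yields 4^n sequence variants, and randomizing several regions simultaneously multiplies their individual library sizes. The region names and lengths below are hypothetical and are not taken from this study or from the dual-selection work.

```python
def library_size(randomized_positions):
    """Number of distinct sequences when each position can be A, C, G or U."""
    return 4 ** randomized_positions

def combined_library_size(region_lengths):
    """Library size when several regions are randomized at the same time."""
    return library_size(sum(region_lengths))

# Hypothetical region lengths (nt), for illustration only.
regions = {
    "communication module": 6,
    "P1 helix positions": 6,
    "nucleator helix positions": 8,
}

for name, length in regions.items():
    print(f"{name}: {library_size(length):,} variants")
print(f"all regions together: {combined_library_size(regions.values()):,} variants")
```

Because the combined size grows exponentially with the total number of randomized positions, a screen of multiple regions may need to proceed region by region or with partially randomized positions to remain within the number of cells that can be screened.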

Supporting information
S1 Table. Riboswitch mutant sequences. a Full promoter and leader of wild type pbuE sequence through the polyuridine tract. The promoter is underlined and italicized, the leader sequence is in bold. b Full leader sequence in bold with mutations underlined of each of the remaining riboswitch mutants. Nucleotide coloring scheme is the same as for Fig. 1. Pre-aptamer leader sequence removed for simplicity.