Abstract
Reverse transcriptase (RT) inhibitors are core components of antiretroviral therapies; however, HIV’s high mutation rate prompts the emergence of drug resistance and necessitates new inhibitors with high genetic barriers to resistance. RNA aptamers that have been selected to bind RT exhibit potent RT inhibition and suppress viral replication when targeting the strain-specific RT that they were originally selected to bind, but some of these same, otherwise potent aptamers fail against both single-point mutant and phylogenetically-diverse RTs. We hypothesized that a subset of the total aptamer population in libraries pre-enriched against a single RT may exhibit broad-spectrum RT binding and inhibition, and we devised a multiplexed Poly-Target selection approach to elicit those phenotypes against a panel of diverse lentiviral RTs. High-throughput sequencing of starting, negative, and final libraries, followed by coenrichment analysis of parallel and duplicate selection trajectories, narrowed the list of candidate aptamers by orders of magnitude. Biochemical characterization of candidates identified a novel aptamer motif and several rare and unobserved variants of previously-known motifs that inhibited recombinant RTs from HIV-1, HIV-2 and SIV to varying degrees. These broad-spectrum aptamers also suppressed replication of viruses carrying phylogenetically-diverse RTs. The Poly-Target selection and coenrichment approach described herein is a generalizable strategy for identifying broad-spectrum behavior and cross-reactivity among related targets from combinatorial libraries.
INTRODUCTION
Human immunodeficiency virus type 1 (HIV-1), the predominant etiologic agent of acquired immunodeficiency syndrome (AIDS), remains a major global health concern. While control of viral replication through appropriate surveillance and highly active antiretroviral therapy (HAART) has dramatically improved patient outcomes, several challenges remain in the absence of a universal functional cure or vaccine. The high rate of viral replication, coupled with the low fidelity of reverse transcriptase (RT) (1) and recombination between genetically diverse strains (2), result in a high mutation rate for the virus (3). In addition, selection pressures exerted by antiviral compounds in patients have led to the emergence of drug resistant strains of HIV (4), requiring continued development of therapeutics that provide a high genetic barrier to resistance. Because reverse transcription of the viral RNA genome into DNA is an early and essential step in the viral life cycle with a non-analogous function in hosts, inhibition of RT remains a core component in HIV treatment. RT’s major functions are to copy the viral RNA genome into DNA (polymerase activity) and to degrade the viral RNA after cDNA synthesis (RNase activity) (5). Current guidelines for HAART recommend a three-drug regimen consisting of two nucleoside reverse transcriptase inhibitors in combination with an agent from the non-nucleoside RT inhibitor, integrase inhibitor, or protease inhibitor classes (6). Concerns with toxicity, patient adherence, and the continual emergence of drug resistance against RT and other HIV inhibitors motivate efforts to identify alternative therapeutic strategies for the treatment of infection and eradication of the viral reservoir (7).
Aptamers are single-stranded nucleic acids that bind with high affinity to defined molecular targets, and they can be generated through an iterative in vitro selection process known as SELEX (8-10). RNA aptamers selected against HIV-1 RT are able to bind RT and inhibit its enzymatic activities, and they overwhelmingly form pseudoknot structures (11-13). These pseudoknot RNA aptamers show robust in vitro inhibition of the RT they were selected to bind (14) and they suppress viral replication and infectivity in cell culture (15-19). Structural and biophysical investigations suggest that these pseudoknot RNA aptamers bind the closed conformation of RT in a positively-charged groove, 50-60 Å in length, that spans the polymerase and RNase H domains (20, 21). This same groove can accommodate 16-19 base pairs of nucleic acid and naturally binds a duplex of viral RNA and host tRNALys primer. An observation from our lab is that a common polymorphism in this groove, near the polymerase active site (K277 in the p66 subunit), confers resistance to the Family 1 Pseudoknot subset of RNA aptamers (22), thereby preventing broad-spectrum inhibition. In contrast, single stranded DNA aptamers selected against RT (23) demonstrate broad-spectrum inhibition in vitro (24-27), but their use in genetic therapy is complicated by the requirement to produce or deliver single stranded DNA in the relevant cells.
High-throughput sequencing of anti-RT RNA aptamer libraries recently revealed several additional motifs capable of RT binding and inhibition, including the (6/5) asymmetric loop (28) and the UCAA bulge (29). These non-pseudoknot motifs were difficult to detect with low-throughput sequencing, as their low sampling frequencies were eclipsed by the abundance of aptamers that carried pseudoknot motifs (>93% in one library (28)). Although these novel motifs were neither selected nor initially characterized for their broad-spectrum inhibition, recent work, done in conjunction with the work described here, demonstrated that these non-pseudoknot motifs were capable of broad-spectrum suppression of viral replication when actively encapsidated into nascent virions (19). The additional functional diversity present in these libraries suggested that new genotypes and phenotypes were potentially accessible under different selection conditions.
Herein we describe a Poly-Target selection method developed to enrich for RNA aptamers with broad-spectrum binding and inhibition of diverse primate lentiviral RTs. Rather than performing a de novo selection from a random-sequence library, we utilized two well-characterized libraries that had been pre-enriched for binding to RT from a specific HIV-1 strain. Negative selections against nitrocellulose were performed to remove RT-independent binders, followed by three rounds of positive selections against a panel of seven diverse lentiviral RTs, in parallel. Each selection trajectory was performed in duplicate, and high-throughput sequencing of the starting, negative, and final libraries enabled an informatics approach to identify candidate aptamers through coenrichment analysis of the various selection trajectories. Motif analysis and biochemical screening of candidates identified several rare variants of known motifs and one previously unknown motif capable of varying degrees of broad-spectrum inhibition of RT polymerase activity. Furthermore, these broad-spectrum aptamers inhibit RTs that were not included in the selection panel and suppress viral replication of constructs containing diverse RTs.
RESULTS
Poly-Target Selection Panel
Two distinct, pre-enriched aptamer libraries with randomized regions of either 70 or 80 nucleotides (70N and 80N, respectively) were used to initiate Poly-Target selection. These libraries were originally selected in vitro against recombinant RT from a group M, subtype B strain of HIV-1 (BH10) (13). In these original selections, partitioning for protein-bound RNA was accomplished through eleven rounds of nitrocellulose filtration and three rounds of native gel shifts, for a total of fourteen rounds. The vast majority of RNA aptamers identified through low-throughput Sanger sequencing of these populations were those that form pseudoknot structures, which bind with low nanomolar affinity and inhibit the RT by competing with primer-template for access to the enzyme. High-throughput sequencing recently enabled deeper analyses of these libraries and revealed previously undetected motifs, such as the 6/5 asymmetric loop ((6/5)AL) (28) and UCAA bulge (29), suggesting that significant diversity exists within these libraries. A central goal of the present work is to exploit untapped diversity within these preenriched libraries by identifying low-abundance structural motifs that may offer broad-spectrum binding and inhibition of RTs.
To elicit broad-spectrum phenotypes from the pre-enriched libraries, we assembled a panel of phylogenetically diverse primate lentiviral RTs (22) (Supplementary Figure 1). The panel of targets included RTs from multiple strains of the major group (group M) of HIV-1. Group M is responsible for the global pandemic and is further divided into subtypes and circulating recombinant forms (CRFs) that contain genomic segments from two or more subtypes and represent a growing percentage of worldwide infections. Group M, subtype B (M:B) HIV-1 is the most prevalent form across the Americas, Western Europe and Australia, and accounts for 11% of global HIV infections (30). HXB2 is an M:B strain closely related to the original selection target (98.9% by amino acid identity), and it is often used as a reference strain for sequence comparisons. We therefore included RT from strain HXB2 in the panel as a “continuity control.” Also included was an R277K single point mutant of HXB2, a natural polymorphism which confers resistance to inhibition by Family 1 Pseudoknot aptamers in vitro (22) and in cell culture (19), to enrich for aptamers that bind in spite of this escape mutation (henceforth referred to as R277K). However, most HIV-1 infections worldwide are from non-B subtypes of Group M. Therefore, RT from strain 94CY was included as a representative of subtype A (M:A) HIV-1, which is responsible for approximately 12% of global infections and is common throughout Eastern Europe, Eastern Africa and Central Asia (30). We also included RT from 93TH, which groups phylogenetically within CRF01_AE, a circulating recombinant form that is the predominant form of HIV in Southeast Asia (30) and carries genomic segments corresponding to subtypes A and E.
The selection panel also included RTs from several non-group M strains of HIV-1 and simian immunodeficiency virus (SIVcpzPts). Although these additional strains are of less epidemiological relevance, they help to capture the broader genetic diversity of lentiviruses (Supplementary Figure 1). The outlier group of HIV-1 (group O) is estimated to infect 1-5% of HIV-1 patients, mainly in West and Central Africa (31). Group O is phylogenetically distinct from group M and originated independently. We therefore included an RT from the group O strain MVP. Because HIV-1 arose from zoonotic transmission of SIVcpz from chimpanzees (32), we included an RT from the TAN1B strain of SIVcpz. Finally, we included RT from the EHO strain of HIV-2, which is only distantly related to HIV-1, to further broaden the scope of the Poly-Target selection. Selecting independently against each of these RTs presents a gradient of selection pressures on the pre-enriched library to search the sequence space for ligands that are capable of making molecular contacts with conserved features of primate lentiviral RTs. A table depicting the unabbreviated name of the strains, their reference sequences, and changes in their amino acid sequence relative to the database reference sequences are provided in Supplementary Table S1.
Poly-Target Selection, High-throughput Sequencing and Bioinformatics
A schematic for the Poly-Target selection strategy is shown in Figure 1. Pre-enriched round 14 libraries were transcribed and partially depleted for nitrocellulose-binding species. Libraries were then separated into replication populations (A and B) and further split into independent populations for each of the seven RT targets. For each round of Poly-Target selections, transcribed RNA for a given trajectory was independently incubated with a single RT from the selection panel. Bound RNA:RT complexes were captured on nitrocellulose filters, and the recovered RNA was reverse transcribed into cDNA and then PCR amplified for additional rounds of selection or for high-throughput sequencing (HTS). Each selection trajectory was performed for a total of three rounds, after which the starting round 14 libraries, the nitrocellulose binding populations, and the round 17 libraries were submitted for HTS.
HTS of the libraries generated over 184 million raw reads. Demultiplexing, trimming and quality filtering (Supplementary Figure S2) of sequences left over 52 million high quality reads for the 70N library and 54 million for the 80N library. Each selection trajectory contained an average of three million total reads and 35,000 to >100,000 unique sequences (Supplementary Table S2). The average number of reads per unique sequence increased for all populations, indicating enrichment and convergence on sequence space within each library, with the exception of 70N TAN1B (A), which was the only population with fewer than one million total reads (Supplementary Figure S3). Aptamer populations were further processed by the FASTAptamer toolkit to determine sequence frequencies, compare populations and calculate enrichment of each unique sequence relative to the starting round 14 library (33). Sequences were clustered with FASTAptamer using Levenshtein edit distance – a string similarity metric that calculates the minimum number of insertions, deletions and substitutions necessary to transform one sequence into another. Sequences within a small edit distance from each other likely diverged from a parent sequence in an earlier population and grouping them into a single cluster simplifies analysis. Based on previous observations that sequences in these populations typically diverge by less than 10% from their parent sequences (28), we defined clusters as having a maximum edit distance of 7 for the 70N library and 8 for the 80N library.
To determine the degree to which replicate selections against the same target experienced similar evolutionary paths, normalized sequence frequencies from the A and B trajectories for each target were compared using FASTAptamer-Compare (33). For both the 70N and 80N round 14 libraries, we observe a tight y = x relationship between replicates (Supplementary Figures S4A and S5A), suggesting highly similar input diversity for each pair of replicate trajectories for a given target. In contrast, comparing sequence frequencies between replicate trajectories after three rounds show a multimodal distribution for many populations, wherein different subsets of sequences either enrich, remain neutral, or deplete (Supplementary Figures S4B-H and S5B-H). These minor variations between replicate trajectories likely reflect stochastic variations in sample handling (34) and highlight the potential value in providing multiple, independent evolutionary opportunities for a given sequence to enrich or deplete. As expected, distributions were widened more dramatically when comparing different selection rounds or selections against different targets (Supplementary Figures S6 and S7).
Coenrichment Analysis Identifies Candidate Broad-Spectrum Aptamers
Sequences from the Poly-Target selection that enriched in more than one selection trajectory were expected to be the most promising candidates as broad-spectrum aptamers. As a first step towards coenrichment analysis, we identified sequences within each post-selection population that met a given enrichment threshold, which we defined as a two-fold or greater increase in reads per million (RPM) relative to the starting round 14 population (Supplementary Figure S8A). To mitigate the potential impact of low abundance sampling artifacts, we confined this analysis to sequences with an aggregate RPM ≥ 10 when summed across the populations being compared (e.g., round 14 RPM + round 17 RPM). We also included sequences that were below the detection limit in round 14 libraries but later increased in relative abundance and were sampled with ≥ 10 RPM in round 17, although their absence in round 14 precluded calculating an enrichment ratio. On average, each trajectory had approximately 1,800 sequences that met our enrichment criteria (Supplementary Table S3). Depleting sequences were similarly flagged by identifying sequences with aggregate RPMs of ≥ 10 that either decreased by at least two-fold (0.5-fold enrichment) in each trajectory, or that were present in round 14 at ≥ 10 RPM and below the detection limit in round 17. Each trajectory had approximately 2,600 sequences that met our depletion criteria (Supplementary Table S3). In nearly all trajectories, the number of sequences that depleted throughout the course of the selections exceeded the number that enriched.
We next identified the sets of sequences that experienced coenrichment or codepletion in two or more trajectories (Supplementary Figure S8B). These were pooled according to starting library and replicate, irrespective of which individual trajectories provided the basis for identifying them as coenriching or codepleting. Each starting library (70N and 80N) and each replicate within them (A and B) were analyzed independently, thereby generating four unique sets of coenriched sequences (average of 3,200 per set) and four sets of codepleted sequences (average of 3,500 per set) (Supplementary Table S4). Sequences demonstrating coenrichment or codepletion in the negative nitrocellulose selections were eliminated from analysis, reducing each library replicate to an average of 2,500 coenriched and 1,800 codepleted sequences. Dataset complexity was further reduced by reclustering each set of coenriched or codepleted sequences into sequence families. To facilitate comparison across trajectories, the identified sequences were mapped to their cluster identity from round 14 (Supplementary Figure S9). Clustering and mapping in this manner allowed us to readily discern highly-enriched clusters that arose in either or both replicate trajectories.
As individual clusters can contain numerous sequence variants, individual sequences within those clusters can sample advantageous or deleterious mutations. To focus on clusters where the majority of sequences exhibited coenrichment with minimal codepletion, we coupled the output from the coenrichment and codepletion analysis (Supplementary Figure S10). Fourteen clusters from the 70N libraries and nineteen clusters from the 80N libraries were chosen for further characterization based on this analysis (33 total). Two clusters that heavily depleted in the 70N trajectory (70N 2 and 70N 3) were also included to test whether codepletion is predictive of low fitness. For each cluster, the most abundant corresponding sequence in round 14 was chosen to represent the entire cluster in biochemical characterization. Candidate aptamers are named by their library of origin and their cluster identity in the pre-enriched round 14 libraries (e.g. aptamer 70N 89 is the 89th most abundant sequence in the 70N round 14 library and served as the seed sequence for the 89th cluster). Sequences that were not sampled in the round 14 libraries were named according to the cluster identity generated in the coenrichment analysis. A heat map of genotypic frequencies for each selected candidate aptamer in each trajectory is shown in Figure 2A, with more detailed information on the frequency of individual sequences in each trajectory detailed in Supplementary Table S5.
Genotype Analysis Identifies Known, Rare and Previously Unobserved Sequence Motifs
The 33 sequences identified above (Figure 2, Supplementary Figure S9, Supplementary Table S5) and two additional sequences that both heavily codepleted were analyzed for the presence of sequence motifs that have been previously shown to robustly inhibit subtype B RTs in vitro and suppress viral replication in cell culture (15-19, 28, 29). The FASTAptamer-Search function of the FASTAptamer toolkit (33) was employed to search for degenerate sequence motifs, and Mfold (35) was used to generate secondary structure predictions. Although Mfold is unable to predict pseudoknot motifs, we reasoned that it would help to provide structural evidence for UCAA and (6/5)AL motifs and could help identify aptamers with novel motifs. Applying this approach with relaxed search constraints identified the three major secondary structural motifs known to bind HIV-1 RT – pseudoknot, (6/5)AL and UCAA – within most, but not all of these 35 sequences.
FASTAptamer-Search identified ten sequences as Family 1 Pseudoknot (Supplementary Table S6), including the two 70N sequences noted above as having codepleted significantly (70N 2 and 70N 3). The Family 1 Pseudoknot motif (F1Pk) is characterized by the string UCCG-N7/8-CGGGANAAN≥3, where N refers to any nucleotide, and N≥3 refers to a segment that base pairs with at least three connector nucleotides to form Stem II (Figure 2B) (11-13, 28). Some of these F1Pk-like aptamers diverged slightly from the expected sequence motif definition, and their enrichment patterns appear to reflect their specialized characteristics. For example, 70N 27 and 70N 73 contained only two potential base pairs in Stem II. They also showed the strongest enrichment in the EHO (HIV-2) trajectory, potentially allowing for this variation of the F1Pk motif to maximize contacts with that particular RT. Similarly, while nine of the ten F1Pk sequences contain a seven or eight nucleotide connector, 70N 144 contains a ten-nucleotide connector. The strong enrichment of this sequence in all of the trajectories, including the nitrocellulose binding population, indicates that it may be a selection parasite, rather than a high-affinity RT binder.
The UCAA motif is minimally defined by a four-nucleotide UCAA bulge flanked by AC/GU closing base pairs on one end and by a stem that is interrupted by a single unpaired U on the other end (Figure 2D) (29). Searching for the UCAA motif identified four sequences, two of which (80N 16 and 80N 148) conformed fully to the motif definition. 80N 16 is identical to aptamer 80.103, which had been identified previously and shown to inhibit HXB2 RT in vitro and to suppress viral replication in cell culture (29). The core element within 80N 148 was validated through aptamer truncation and RT inhibition analysis (Supplementary Figure S11). The UCAA elements in the other two identified aptamers were within non-canonical structural contexts and may only superficially resemble the UCAA motif. Aptamer 80N 142 contained the sequence UCAACU but presents the internal CAAC as the four-nucleotide bulge, rather than the UCAA portion, and both flanking U’s are predicted to pair within stems. Nevertheless, this sequence also contained the downstream unpaired U and upstream AC/GU base pairs characteristic of the motif. Although the fourth sequence (80N 433) contained a UCAA motif, it was predicted to present this sequence either as a tetraloop capping an interrupted helix, or as a part of a seven-nucleotide bulge (GUUCAAA), neither of which conforms to the canonical definition of the UCAA.
The (6/5)AL motif is characterized by the presence of six unpaired nucleotides (ARCGUY) opposite five unpaired nucleotides (RARAC) in an asymmetric loop embedded within a stem (Figure 2C); where R is any purine and Y is any pyrimidine (28). Of the 16 loop combinations that fit this definition, 6a/5a (AACGUU/GAAAC) and 6b/5b (AGCGUC/AAGAC) variants account for >98% of all (6/5)AL motifs in the 70N round 14 library, along with a small fraction of additional minor variants (28). Searching the candidate sequences for the (6/5)AL motif identified eight sequences that fit the consensus. The 6a/5a variation was present in three sequences (70N 8, 70N 177, and 80N 218). Although most of the previously-identified (6/5)AL aptamers utilize portions of the 80N constant region to form the motif (28), 80N 218 contained the motif entirely within the aptamer’s randomized region. Four aptamers containing the 6b/5b variation of the motif were also identified, with two in the common circularly permuted form (70N 89 and 80N 653) and two in the less-common, non-permuted form (70N 105 and 70N 869). A circularly permuted 6c/5b variant that contained the motif within the randomized region was also found (80N 42A). The 6c/5b combination is exceedingly rare, having been previously observed at less than 0.1% of the 70N round 14 population (28). To validate these motif identifications, aptamer:RT 3’ boundary determination and inhibition assays with truncated variants were carried out for two non-circularly permuted (70N 105 and 70N 869) and one circularly-permuted (70N 89) 6b/5b aptamer. All three fully inhibited RT from HIV-1 strain HXB2 when the 3’ end was truncated to only three base pairs beyond the asymmetric loop, irrespective of whether the closing stem was Stem I or Stem II (Supplementary Figures S12–S14).
Sequences with no obvious motifs were analyzed for their predicted secondary structures. Four of these contained asymmetric internal loops suggestive of (6/5)AL variants that did not conform to the ARCGUY/RARAC consensus sequence and hence escaped detection in the initial sequence-based screening. Two of these contain known, but extremely rare, variations of the motif, including 6a/5d (AACGUU/UAAAC, non-conforming deviations underlined) in 80N 46A29B and 6d/5a (AACGUG/GAAAC) in 70N 24A28B. These sequence combinations were previously detected in a 70N library with frequencies of less than 0.6% (6a/5d) and 0.08% (6d/5a) of all (6/5)AL reads (28). The other two contained previously unobserved variants of the (6/5)AL motif, including the newly-termed 6f/5b (AGCGUA/AAGAC) in 80N 433 – which is flanked by AG/CC closing pairs in Stem I – and 6g/5a (AGCCUC/GAAAC) in 80N 51. The observation of a variant (6/5)AL motif in 80N 433 was especially surprising, as this sequence also contains a non-canonical UCAA bulge, as described above. Minimization of the 80N 433 aptamer, followed by deletion of either the (6/5)AL motif or UCAA bulge, suggested that only the (6/5)AL motif is required for inhibition of HXB2 RT and that the portion containing UCAA is not sufficient for inhibition (Supplementary Figure S15).
In total, motif analysis of coenriched and codepleted cluster seed sequences, combined with biochemical validation of a subset of these, identified ten putative F1Pk aptamers, three potential UCAA motif aptamers and twelve (6/5)AL aptamers that include several circular permutations and rare or novel variants. The F1Pk motif comprised the vast majority of the round 14 libraries and continued to dominate populations despite their overall depletion. One aptamer (80N 433) contained both a cryptic (6/5)AL motif and a dispensable, non-canonical UCAA bulge. Ten additional aptamers neither conformed to previously-defined motifs nor showed sequence or structure similarity with each other. We suspect, based on the abundance of the motif-unknown sequences in the nitrocellulose binding populations, that some of these aptamers may be low-fitness species with protein independent enrichment. Overall, Poly-Target selection and coenrichment analysis helped to reduce the lead candidate space by several orders of magnitude to identify highly enriched, rare, and unique genotypes that were present within the pre-enriched round 14 libraries. It is unlikely that low-throughput sequencing would have identified these unknown and previously unobserved sequences, as their relative abundance in each library remained low even after three additional rounds of selection.
Broad-spectrum Inhibition Varies with Aptamer Motif
Coenrichment against multiple, divergently-related protein targets could signal broad-spectrum recognition and inhibition across the panel of RTs. To test this possibility, we assayed the 35 cluster-representing aptamers for inhibition of the DNA-dependent DNA polymerase (DDDP) activity of RT using primer extension assays. Three previously characterized 70N (6/5)AL aptamers (88.1, A9, ×1.1) were included as inhibition controls. Aptamers A9 and 88.1 contain a circularly permuted 6b/5b motif and enriched along many selection trajectories, despite being notably absent from sequences obtained here for the round 14 library. Aptamer ×1.1 contains a 6a/5a motif and was the most abundant sequence in the 70N round 14 library (28) yet was among the most depleted sequences in the round 17 populations. The vast majority of aptamers containing known motifs robustly inhibited RT from strain HXB2 (Figure 3A, top), which is very closely related (98.9%) to the BH10 strain used to pre-enrich these libraries originally (13). In contrast, only three of the aptamers without recognizable structural motifs inhibited RT from HXB2 (70N 11,70N 55 and 80N 44), suggesting that the other candidates with unknown structures may have been enriched for binding to nitrocellulose, that they bind RT surfaces that do not interfere with DNA polymerization, or that they are selection parasites.
The R277K point mutant of RT from HXB2 served as initial surrogate for evaluating broad-spectrum inhibition. The K277 polymorphism is prevalent among sequenced strains and present in five of the seven RTs used in these selections, so we reasoned that true broad-spectrum aptamers should be able to inhibit the R277K RT. When all 38 aptamers were tested for inhibition of R277K, 18 of these inhibited both HXB2 and R277K (Figure 3A, bottom). The (6/5)AL and UCAA motif aptamers demonstrated strong inhibition of R277K, with the exception of 80N 51, which weakly inhibited RT from HXB2 and failed to inhibit the point mutant. This aptamer is a previously unobserved 6g/5a variation of the (6/5)AL motif and experienced a mix of both strong depletion and enrichment, often against the same target in replicate selection trajectories. Its coenrichment may therefore reflect the stochastic nature of the selection rather than a true positive coenrichment signal. Of the three aptamers of unknown structures that inhibit HXB2, only aptamer 80N 44 inhibits R277K. As expected, the F1Pk pseudoknot aptamers uniformly failed to inhibit.
When the 18 aptamers that inhibited both HXB2 and R277K were evaluated for inhibition of the panel of RTs used in the selection (Figure 3B), the (6/5)AL and UCAA motif aptamers were consistently broad-spectrum. Among the (6/5)AL aptamers, 70N 89 exhibited robust broad-spectrum inhibition of DDDP activity across the lentiviral RT panel; to a slightly lesser degree aptamers 70N 653 and 88.1 also inhibited broadly across the panel. These three aptamers all carry 6b/5b motifs. Aptamer A9, which also carries a 6b/5b motif, showed less inhibition and greater variability across the panel, potentially due to non-productive contributions of sequences surrounding the motif. Aptamer 70N 105, which carries a non-circularly permuted form of the 6b/5b motif, also exhibited a moderate to strong inhibition profile, as did the 6c/5b variant (80N 42A), which is a single nucleotide change from the 6b/5b motif. Other variations of the (6/5)AL motifs showed no discernible pattern of inhibition across the panel. In particular, 80N 433, which contains both a (6/5)AL and a UCAA motif, behaved most like ×1.1 in that it inhibited RTs from 94CY, MVP and EHO. RT from 94CY and EHO were inhibited more or less uniformly by all (6/5)AL aptamers tested, but inhibition was more variable for RTs from 93TH, MVP and TAN1B. Aptamers that carry UCAA motifs inhibited RTs from 94CY and EHO and exhibited mixed behavior against RTs from 93TH and MVP. Aptamer 80N 148 moderately inhibited RT from 93TH, but it failed against RT from MVP, while aptamer 80N 16 demonstrated the opposite pattern. All three UCAA aptamers inhibited RT from HIV-2 (EHO) to an extent that is comparable to that observed for the (6/5)AL aptamers, but they all failed to inhibit RT from TAN1B. The unknown aptamer, 80N 44, was only moderately broad-spectrum, inhibiting RTs from 94CY and EHO RTs, but none of the other RTs from non-subtype B lentiviruses. 3’ boundary determination and truncation experiments with 80N 44 against HXB2 RT identify a minimized 54 nucleotide sequence predicted to fold as a stem, interrupted with bulges and a (4/2) asymmetric loop, and capped with a six nucleotide loop (Supplementary Figure S16). The control F1Pk aptamers weakly inhibited RT from EHO, which is the only RT outside the HIV-1 subtype B to contain R277, but they failed to inhibit RTs from the other non-subtype B lentiviruses (Figure 3B).
Sorting the aptamers by their average RT inhibition reveals a clear correlation of inhibition profiles with secondary structural motifs (Figure 4). Taken together, these data indicate that the (6/5)AL and UCAA motifs are capable of inhibiting a broad collection of non-subtype B RTs, and that several 6b/5b variants are capable of robust inhibition against every RT in our selection panel. While the UCAA and unknown motifs show limited inhibition outside of group M, they are potent inhibitors of the M:A and M:B HIV-1 RTs tested here.
Conversion of (6/5)AL variants to the canonical 6b/5b improves RT inhibition
Aptamers 88.1 (6b/5b), 80N 42A (6c/5b), and 80N 433 (6f/5b) all carry 6b/5b motifs or single-nucleotide variations, and they represent a gradient of inhibitory potency against RTs from non-B clades. To assess the relative contributions of nucleotide composition within and outside the asymmetric loop, we converted aptamer 88.1 from 6b/5b to 6f/5b and 6c/5b (Figure 5A) and each of the other two into canonical 6b/5b forms. These modifications are single nucleotide changes to interconvert AGCGUC (6b) to either AGCGUA (6f) or AACGUC (6c), or vice versa. Establishing a canonical 6b/5b structural context for 80N 433 also required an A to G substitution immediately upstream of the 6b element to stabilize the flanking stem. The three original aptamers and the four variants were assayed for inhibition of RTs from HXB2, R277K, and five non-B clades (Figure 5B). Conversion of 88.1 away from the 6b/5b variant had little effect on the overall inhibition profile, with the exceptions of partial loss of inhibition for RTs from 93TH and TAN1B. In contrast, conversion of 80N 433 to 6b/5b (80N 433bb) dramatically increased its inhibitory potency, especially for RTs from 93TH, MVP and TAN1B, making 80N 433bb the strongest broad-spectrum inhibitor among all aptamers examined in this study. Conversion of 80N 42A to 6b/5b also improved its performance against the panel, with lowered mean primer extension values, albeit to a lesser degree than for 80N 433. Taken together, these data suggest that the exact composition of the 6 nucleotide bulge of the (6/5)AL loop is important for broad spectrum binding, that flanking sequences contribute directly or indirectly to net inhibition, and that the terminal C of the 6 nucleotide bulge, present in both 6b and 6c variants, may be especially important for establishing productive molecular contacts, particularly for RTs from 93TH and TAN1B.
Broad-Spectrum Aptamers Inhibit Outside of the Selection Panel
We reasoned that aptamers capable of inhibiting phylogenetically diverse RT likely interact with conserved features that would also be present in RTs that were not included in the Poly-Target selection panel. Anti-RT aptamers with broad-spectrum ability are therefore expected to bind and inhibit RTs that were not part of this training set. The aptamers chosen for this analysis were the most potent representatives of the three structural motifs that inhibited outside of subtype B: the 6b/5b converted form of aptamer 80N 433 (80N 433bb), a UCAA aptamer (80N 148) and the putative (4/2)AL motif (80N 44). The RTs for this assay sample two additional subtypes from HIV-1 group M and one from SIVCPZ. The group M RTs included a subtype C (98CN) and a circulating A/D recombinant form (92UG) for which the pol gene encoding RT groups within subtype D. Subtype C is the dominant subtype in South and Central Africa and the Asian subcontinent and represents nearly half of all current HIV infections globally (30). The SIVCPZ RT is from the US strain, derived from the chimpanzee species Pan troglodytes troglodytes (SIVcpzPtt). It is more evolutionarily related to HIV-1 group M subtypes than the TAN1B SIV strain used in the selection panel (SIVcpzPts), which is from the species Pan troglodytes schweinfurthii (Supplementary Figure S1) (32). Against the RT from subtype D (92UG), the UCAA aptamer showed moderate inhibition and the 6b/5b aptamer showed complete inhibition (Figure 6). When assayed against the RT from subtype C (98CN), all three aptamers showed moderate inhibition, with the 6b/5b aptamer exhibiting virtually complete inhibition of the DDDP activity. Finally, only the 6b/5b aptamer inhibited the RT from SIVCPZ (US). These results provide further evidence of the ability of the UCAA and unknown motif to inhibit non-subtype B RTs and of the 6b/5b motif to strongly inhibit RT from phylogenetically diverse lentiviral strains.
Broad-Spectrum Anti-RT Aptamers Bind with Low Nanomolar Affinity
We hypothesized that the observed patterns of sequence enrichment and enzymatic inhibition above should reflect relative affinities for each aptamer-RT combination. For example, because modifying aptamer 80N 433 into a canonical 6b/5b form improved its inhibition of RTs from 93TH and TAN1B RT, the modified aptamer is expected to exhibit improved affinity for these RTs. Apparent dissociation constants for a number of aptamer-RT pairs were determined by a nitrocellulose filter binding assay (Table 1 and Supplementary Figure S17). As expected, aptamer 70N 2 (F1Pk) bound well to RT from HXB2 (Kd ≈ 110 nM) but showed no discernable affinity for the R277K point mutant, consistent with its inhibition patterns (Figure 3A). Furthermore, its affinity for RT from HXB2 was notably weaker than affinities observed for the broad-spectrum aptamers tested here, consistent with its observed moderate depletion in the HXB2 trajectory and powerful depletion in each of the others (Figure 2 and Supplementary Table S5). Dissociation constants for all of the other aptamers were in the low nanomolar range (< 100 nM) for binding to HXB2. Because these assays were all performed at the same time using the same protein preparation, we conclude that the affinity of aptamer 80N 433bb for RT from HXB2 is 5-fold improved relative to that of the pseudoknot aptamer 70N 2. All of the broad-spectrum aptamers showed reduced affinity for the R277K mutant, relative to their affinities for RT from HXB2. For the RT’s from non-subtype B lentiviruses, relative binding affinities among the broad-spectrum aptamers correlated with their inhibition profiles. In particular, Kd values for the 6b/5b variant of 80N 433 were in the low nanomolar range across the RT panel, and this aptamer uniformly bound more strongly to each RT than did any of the other aptamers. Affinity improvements associated with converting the original 80N 433 from 6f/5b to 6b/5b ranged from approximately 3-fold for HXB2 and 93TH to 13-fold for TAN1B, illustrating the variable contributions of these nucleotide changes upon the RT-aptamer interactions.
Broad-Spectrum Anti-RT Aptamers Suppress Viral Replication in Cell Culture Assays
A target application of broad-spectrum RNA aptamers is to suppress productive replication of diverse viruses or emergent and potentially resistant forms. Therefore, to determine whether aptamers that demonstrate broad-spectrum activity in vitro also inhibit diverse HIV in a biological context, we adapted an assay that we had previously developed for monitoring aptamer-mediated inhibition of HIV replication in cell culture (18, 19, 36). Briefly, transfection of plasmids encoding a replication-deficient HIV reporter virus and the glycoprotein from vesicular stomatitis virus (VSV-G) leads to production of virus that are competent for a single cycle of replication and for which transfection and infection efficiencies can both be measured using the virally-encoded EGFP. Cotransfection of aptamer-expressing plasmid simultaneously with viral plasmids can lead to packaging of the aptamer into the virus during assembly, inhibition of reverse transcription in the target cells, and loss of infectivity that can be measured by a loss of fluorescence signal in the target cells. To incorporate phylogenetically diverse RT subtypes into this assay, we first built and evaluated proviral plasmids to express several RTs from our Poly-Target selection panel (HXB2, R277K, 93TH, 94CY, and MVP) such that only the RT segments differed in each construct (19). This assay was used to evaluate viral inhibition by aptamers that demonstrated broad-spectrum RT inhibition in vitro by co-transfecting various combinations of proviral and helper VSV-G plasmids with aptamer or control plasmids, collecting the pseudotyped virus, and determining the percent of EGFP-positive cells after infection of fresh 293FT cells (Figure 7). Infectivity was normalized to p24 levels determined by ELISA. As observed previously (19), aptamer 70.05 (an F1Pk pseudoknot) strongly inhibited infectivity of virus carrying the HXB2 RT, but not the R277K point mutant or any of the non-subtype B RTs. In contrast, aptamers 80N 433 and 80N 433bb inhibited infectivity of all recombinant viruses in the panel. Aptamers 80N 148 and 80N 44 demonstrated a variable pattern against the non-subtype B RTs. Both inhibited replication by viruses carrying RT from 94CY but only 80N 148 inhibited virus carrying RT from 93TH, even though the RTs from 94CY and 93TH both group within subtype A and are 91.4% identical (98.2% similar). Neither 80.148 nor 80.44 inhibited viruses carrying RT from the MVP5180 strain of HIV-1 Group O. These results are consistent with our biochemical inhibition results and closely mirror our previous observations for inhibition by the major anti-RT motifs in cell culture (19).
DISCUSSION
Recent advances in aptamer technology have enabled increased selection throughput and high-resolution analysis of selection outcomes (37, 38). Microfluidic (39) and multiplexed selection platforms (40, 41) facilitate efficient and simultaneous selections against numerous targets, and emulsion methods that encapsulate single genotypes into “monoclonal aptamer particles” readily partition or sort according to phenotype (42, 43). Post-selection, high-throughput sequencing (HTS) has empowered optimization of candidate aptamers based on round-to-round enrichment of sequences and by providing large data sets that allow for powerful bioinformatic analyses of sequence:structure:function relationships (28, 44-46). In addition to advances in selection methodology and candidate identification, high-throughput aptamer characterization platforms that utilize microarrays can screen thousands of aptamers simultaneously (47) and directly link genotype to phenotype when performed on a sequencing platform (48, 49). Here we sought to combine HTS with multiplexed selections against a diverse panel of recombinant RTs. We applied these technologies to elicit specialized binding and inhibition phenotypes (cross-reactivity) from well-characterized and pre-enriched aptamer libraries. Furthermore, we sequenced negative selection populations and performed our positive selections in duplicate. The combination enabled coenrichment analysis of multiple selection trajectories, nuanced with data from duplicate and negative selections. This strategy efficiently narrowed the number of potential candidates to an examinable amount, allowing us to identify highly-potent RNA broad-spectrum inhibitors of lentiviral RTs among these candidates.
Other selection approaches have previously been used to identify cross-reactive aptamers or to exploit branched selections. In “toggle SELEX” (50), the target of a single-selection trajectory is alternated, or “toggled” between rounds; this approach was used to identify inhibitors of human and porcine thrombin and later to identify broadly-reactive aptamers targeting whole bacterial cells (51). In another approach, single-pot selections against numerous targets were used to identify cross-reactive aptamers against several M-types of Streptococcus pyogenes (52) and against HA proteins from multiple strains of Influenza A (53). Although both of these methods are powerful (alternating or simultaneous targets), they rely on a single evolutionary trajectory and lack the target-specific information that can inform the aptamer selection and design process. Recently, Dupont et al. reported on a clever branched selection approach against alanine point mutants of serpin plasminogen activator inhibitor-1 (PAI-1) (54). Utilizing a library previously selected against wild type PAI-1, those authors performed one additional round of selection against a panel of 10 targets, in parallel, followed by high-throughput sequencing, to identify aptamer binding site preferences. Unlike the Poly-Target selection described here, the pre-enriched library utilized by Dupont et al. had not been saturated. In the original selection against PAI-1, enrichment of the library was detected after 5 rounds and proceeded for a total of 8 rounds (55). Rather than using the heavily-enriched library from round 8, the branched selection began with round 5, allowing for a greater diversity of aptamers to emerge.
The Poly-Target approach utilized pre-enriched libraries to simultaneously explore a wider sequence space than available in a traditional aptamer directed evolution strategy. Interestingly, neither abundance, enrichment nor coenrichment alone was a strong predictor of fitness in the Poly-Target selection performed here. Highly abundant sequences in the starting libraries depleted across most selection trajectories yet remained as dominant species within the round 17 populations. In contrast, most of the 6b/5b variants that enriched through the selection were so rare in the starting populations that, despite their large fold-enrichment, they remained as a relatively small percentage of the populations and could only be reliably identified through HTS. Coenrichment was slightly more predictive of broad-spectrum phenotypes after removing aptamers that enriched against nitrocellulose alone. Nevertheless, several F1Pk aptamers demonstrated enrichment across several trajectories despite their lack of broad-spectrum inhibition. The combined use of coenrichment and codepletion analysis of sequence clusters identified sequences that experienced mixed enrichment and depletion, either across trajectories or between replicates. The most productive analytic strategy for the present datasets combined each of these approaches with careful curation based on sequence and structural motifs established from previous investigations. The Poly-Target approach may be broadly applicable for any combinatorial selection or directed evolution strategy that seeks to identify broad-spectrum behavior given an appropriate phenotypic screen.
Our selection strategy was designed to intentionally select for promiscuity against a family of related targets, rather than specificity for a single target or its point-mutants, and it relied on the sequencing depth provided by HTS to identify functionally rare aptamers that enriched from our round 14 starting libraries. We developed and utilized a coenrichment analysis strategy to efficiently reduce the dataset from over a hundred million sequences to focus on dozens of candidate sequence clusters for further analysis. We were able to identify several rare variants of (6/5)AL motifs and also a previously-unknown aptamer inhibitor of HIV-1 M subtype RTs. Inhibition assays identified the 6b/5b variant of the (6/5)AL motif as an especially potent inhibitor of the DDDP activity of RTs utilized in the selection panel and several RTs that were not included in the selection. Conversion of aptamer 80N 433 towards the 6b/5b motif improved its inhibition profile across the panel of RTs and most notably against 93TH, MVP and TAN1B. Furthermore, the UCAA aptamer (80N 148) and aptamer 80N 44 demonstrated inhibition of RT outside of subtype B strains. Follow-up assays correlated inhibition of RTs with binding affinity and suppression of viral replication. As our efforts were focused on identification of broad-spectrum inhibitors, we did not attempt to find strain-specific inhibitors nor did we closely examine the role of individual nucleotides and their interactions with specific RTs. The wealth of data provided by the Poly-Target selection approach should, in principle, allow one to look for enrichment exclusive for a particular target, in addition to advancing the broad-spectrum objectives that were the focus of the present study.
MATERIAL AND METHODS
RT Expression and Purification
DNA encoding the p66 and p51 subunits of each RT (Supplementary Table S1) were PCR amplified from previously described pET200/D-TOPO expression vectors (22). Amplicons were then restriction digest cloned into the pRT-Dual plasmid, a derivative of pETDuet-1 vector (a gift from Stefan Sarafianos (56)). Plasmids expressing RTs from HXB2, R277K, 94CY and TAN1B utilized the PpuM1/SacI cloning sites for the p51 subunit, and the SacII/AvrII sites for the p66 subunit (“pRTD” series plasmids). Whereas RTs from 93TH, MVP and EHO utilized the same restriction sites for the p66 subunit but utilized NotI/SalI for the p51 subunit (“pRTD2” series plasmids). Additional plasmid engineering was performed on the ribosome binding site upstream of the p66 subunit in the pRTD2 series plasmids to optimize expression. Plasmid were confirmed by Sanger sequencing (University of Missouri DNA Core Facility) and are available upon request.
RT expression plasmids were heat shock transformed into E. coli BL21(DE3)pLysS competent cells and then incubated overnight at 37° C on LB-Agar plates supplemented with selective antibiotic (50 μg/mL streptomycin). Clonal isolates from each plate were recovered and incubated for approximately 16 hours at 37° C with 250 RPM shaking in 10 mL of 2xYT selective liquid media. Cultures were then diluted into 1 L of fresh media and optical density at 600 nm was monitored until reaching 0.5, whereupon protein overexpression was induced with 1 mM IPTG. Following a 4-hour incubation, samples were centrifuged for 15 minutes at 4,000 RCF. Supernatants were decanted and cell pellets were frozen at -80° C until purification. For purification, cell pellets were resuspended in sonication buffer (25 mM Tris-HCl pH 8.0, 500 mM NaCl, 1 mM PMSF, 0.15 mg/mL lysozyme) and then subjected to four ultrasonication steps of 30 seconds, on ice, with 2 minutes of rest in between steps. Cell lysates were then centrifuged for 15 minutes at 4° C and 12,000 RCF to remove cell debris. Cell extracts were then passed through a 0.45 μm filter to remove insoluble material and applied to a Ni-NTA agarose affinity column (Qiagen) for His tag purification. Purification was performed according to the manufacturer’s protocol, with an added high salt wash (2 M NaCl) to remove any endogenous nucleic acids bound to RTs. Eluted protein samples were pooled and quantified using UV absorbance at 280 nm on a NanoDrop 1000 spectrophotometer (Thermo Fisher) with estimated extinction coefficient and molecular weight information. Purified proteins were concentrated using Amicon Ultra Centrifugal Filters (Millipore Sigma) and dialyzed into 2x storage buffer (100 mM HEPES pH 7.5, 100 mM NaCl) using Slide-A-Lyzer cassettes (Thermo Scientific). RT preps were validated for purity and size by SDS-PAGE, and for activity by comparison of primer extension assays against previous preps. Proteins were stored at -20° C after addition of glycerol to 50% (v/v).
Poly-Target Selection
cDNAs corresponding to the pre-enriched 70N and 80N aptamer libraries (13) were transcribed using T7 RNA polymerase, in vitro transcription buffer (50 mM Tris-HCl pH 7.5, 15 mM MgCl2, 5 mM DTT, and 2 mM spermidine), and 2 mM of each ATP, CTP, GTP and UTP. Reactions were incubated at 37° C for a minimum of 4 hours and halted with the addition of gel loading buffer (95% formamide, 5mM EDTA, with trace amounts of xylene cyanol FF and bromophenol blue). Transcribed RNAs were purified through denaturing polyacrylamide gel electrophoresis (6% TBE-PAGE, 8 M urea) and bands corresponding to the expected size were gel extracted and eluted while tumbling overnight in 300 mM Sodium Acetate pH 5.4. Eluates were ethanol precipitated, resuspended in TE buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA), and stored at -20° C until use. RNA concentrations were determined on a NanoDrop 1000 spectrophotometer (Thermo Fisher). For each trajectory and round of selection, 200 pmol of the transcribed libraries were resuspended in 100 μL binding buffer (150 mM KCl, 10 mM MgCl2, 50 mM Tris-HCl pH 7.5) and renatured by heating to 65 °C and cooling on ice. 40 pmol of the respective RT was then added, and the mixture was incubated on ice for an additional 20 minutes. Separately, a 25 mm nitrocellulose filter (#HAWP02500, Millipore Sigma) was pre-wet with 2 mL binding buffer on a sampling manifold (#XX2702550, Millipore Sigma). Immediately before application of incubated RNA:RT complex, the nitrocellulose filter was washed again with 1 mL of binding buffer under applied vacuum. RNA:RT complex was then applied to filter under vacuum and washed with 1 mL of binding buffer. Suction continued for 10 minutes. The filter was then removed and incubated in extraction buffer (8 M urea, 50 mM NaCl, 10 mM EDTA). RNA was recovered by phenol:chloroform extraction of the filter and ethanol precipitation. Recovered RNA was reverse transcribed using ImProm-II Reverse Transcriptase (Promega) and PCR amplified to repeat the selection process, or for high-throughput sequencing (see below). Prior to round 15, transcribed round 14 libraries were incubated with a thin strip of nitrocellulose for several minutes to subtract non-specific binders and RNA was extracted and purified as described above, with the exclusion of the RT binding and partitioning steps. Nitrocellulose-binding RNAs were recovered and reverse transcribed for HTS, while nitrocellulose-subtracted RNAs were used for independent selections.
High-throughput Sequencing and Bioinformatics Analysis
Libraries were prepared for sequencing using a series of PCR steps to add Illumina adapters and sequencing indices for multiplexing of the 70N and 80N libraries as previously described (28). Sequencing was performed on an Illumina HiSeq2000 (University of Missouri DNA Core Facility). Populations were demultiplexed to identify and parse the 5’ and 3’ constant regions. Data pre-processing was performed using cutadapt (57) to trim 5’ and 3’ constant regions from sequences and to discard any uncut sequences or sequences not within ± 3 of the expected size (70 or 80) after trimming. Trimmed sequences were then filtered for high-quality reads using FASTQ quality filter from the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). Quality filtering eliminated a sequence if a single position had a Phred quality score of less than 20. Trimmed and quality filtered sequences were then processed using the FASTAptamer toolkit (33) to count and normalize sequence reads (FASTAptamer-Count), calculate fold-enrichment from round 14 to round 17 (FASTAptamer-Enrich), compare sequence frequencies across populations (FASTAptamer-Compare), group related sequences into clusters (FASTAptamer-Cluster) and search for known sequence motifs (FASTAptamer-Search). Custom scripts (open source and available at https://github.com/FASTAptamer/PolyTarget) were written in Perl to perform coenrichment analysis, to map sequences onto their round 14 identity, and to recluster the sequences as described in the results. RNA secondary structures were predicted using the Mfold webserver (35) and depicted using VARNA: Visualization Applet for RNA (58).
Aptamer Generation
Oligonucleotides for DNA transcription templates were ordered from Integrated DNA Technologies and ligated as previously described (59). Ligated oligonucleotides were then PCR amplified with Pfu DNA polymerase and primers containing the T7 promoter and the remainder of the constant region that was not included in the oligonucleotides. Amplification products were verified for size using agarose gel electrophoresis. Run-off transcription reactions using T7 RNA polymerase and RNA extraction and purification were performed as described above for transcription of Poly-Target libraries. RNAs were refolded by heating to 65° C and cooling on ice prior to use.
Primer Extension Assays
DNA-dependent DNA polymerase activities of RTs were assayed using a 31 nucleotide DNA template (5’ CCATAGATAGCATTGGTGCTCGAACAGTGAC 3’) and a complementary, 5’-Cy3-labeled 18-nucleotide DNA primer (5’ Cy3-GTCACTGTTCGAGCACCA 3’) as previously described (29). In short, 20 nM of RT and 100 nM of aptamer (omitted for “No Aptamer” control) were pre-incubated on ice for 10 minutes in extension buffer (50 mM Tris-HCl pH 7.5, 50 mM NaCl, 5 mM MgCl2). 10 nM primer, 20 nM template and 100 μM of each dNTP were then added immediately before incubation at 37° C for 10 minutes. Reactions were halted with the addition of gel loading buffer (95% formamide, 5 mM EDTA, with trace amounts of xylene cyanol FF and bromophenol blue) and analyzed by denaturing polyacrylamide gel electrophoresis (10% TBE-PAGE, 8 M urea). Gels were scanned on a Typhoon FLA9000 imager (GE Healthcare) and quantified using Multi Gauge software (Fujifilm) for fraction of primer extended. Primer extension values were then normalized by subtracting the mean “No RT” value for the respective RT and multiplying by a normalization factor that defines 100% extension as the mean extension in the “No Aptamer” control.
Aptamer:RT 3’ Boundary Determination
In vitro transcribed and purified RNA was treated with Antarctic phosphatase (Fermentas) to remove the 5’ terminal phosphate and subsequently labeled with T4 polynucleotide kinase in the presence of Y-32P labeled ATP (Perkin Elmer). Radiolabeled RNA was gel purified by denaturing PAGE as described for transcription of libraries. RNase T1 digestion was performed by incubating thermally renatured RNA (> 106 CPM) with 40 units of RNase T1 (Thermo Fisher) in digestion buffer (25 mM sodium citrate pH 5.0, 6 M urea) for 5 minutes at 55° C. T1 digestion was halted with the addition of gel loading buffer. Alkaline hydrolysis was performed by incubating RNA in 50 mM sodium carbonate pH 9.0 at 90° C for 10 minutes. Hydrolysis was halted upon addition of 300 mM sodium acetate pH 5.0 and were then ethanol precipitated and resuspended in H2O. Aptamer:RT 3’ boundary was determined by incubating 50 pmol RNA with 100 pmol HXB2 RT and partitioning the bound complexes as described above for the Poly-Target selection. RNAs were recovered from the filter and resolved on a denaturing polyacrylamide gel (15% TBE-PAGE, 8 M urea).
Affinity Constant Determination
Affinity constants (Kd’s) were determined by a radiolabeled binding assay. Approximately 15-20k CPM of labeled and refolded RNA was incubated with varying concentrations of RT (0.1 – 1000 nM, or without RT to determine background binding) in binding buffer (50 mM Tris-HCl pH 7.5, 140 mM KCl, 1 mM MgCl2, 0.1 μg/mL bovine serum albumin) and allowed to equilibrate on ice for 15 minutes. Assembled RNA:RT complexes were then partitioned from unbound RNA by passing samples over a nitrocellulose filter (#HAWP02500, Millipore Sigma) under vacuum (#XX2702550, Millipore Sigma) and immediately washing with 500 μL binding buffer. Radioactivity retained on filters was counted by adding 4 mL of scintillation fluid to filters placed inside of scintillation vials and counted using a liquid scintillation counter. Fraction of RNA bound was calculated by determining the fraction of radioactivity bound and were fit to a one site, specific binding curve using Prism GraphPad 6.2.
Viral Infectivity Assay
Cell culture, virus production and evaluation of viral infectivity were carried out as previously described (19, 36). Single-cycle infectivity assays using aptamer-expressing plasmids, pNL4-3-CMV-GFP and pMD-G (VSV-G), were performed by transfecting 293FT cells with polyethylenimine (PEI) in six-well cell culture dishes. All transfections were performed on cells plated the previous day (50% confluence) using PEI at 3 μl/μg DNA, as previously described. Aptamer-expressing plasmids (1000 ng) were co-transfected with a mixture of pNL4-3-CMV-GFP (150 ng) and pMD-G (50 ng). The medium was changed approximately 12 hours after transfection. Viral supernatant was harvested 48 hours after th post-transfection media change and clarified by centrifugation to remove cellular debris. Cell-free vira supernatant (50 μl) was added to fresh 293FT cells to determine infectivity. Infected cells were collectec 24-48 hours post-infection, fixed with 4% paraformaldehyde and analyzed on an Accuri C6 Flow Cytometer (BD Biosciences) to determine the percentage of infected (EGFP-positive) cells. Infectivit> data was normalized to levels of p24 determined by ELISA.
FUNDING
This work was supported by National Institutes of Health grant R01AI074389 to D.H.B. and by a graduate research assistantship to K.K.A. from the University of Missouri Department of Biochemistry.
ACKNOWLEDGEMENTS
The authors would like to thank Katherine N. Wilsdon for technical assistance.