bioRxiv
New Results

Codon usage influences fitness through RNA toxicity

Pragya Mittal, James Brindle, Julie Stephen, Joshua B. Plotkin, Grzegorz Kudla
doi: https://doi.org/10.1101/344002
Pragya Mittal
MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, Scotland, UK
James Brindle
MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, Scotland, UK
Julie Stephen
MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, Scotland, UK
Joshua B. Plotkin
University of Pennsylvania, Philadelphia, USA
Grzegorz Kudla
MRC Human Genetics Unit, IGMM, University of Edinburgh, Edinburgh, Scotland, UK

Abstract

Many organisms are subject to selective pressure that gives rise to unequal usage of synonymous codons, known as codon bias. To experimentally dissect the mechanisms of selection on synonymous sites, we expressed several hundred synonymous variants of the GFP gene in Escherichia coli, and used quantitative growth and viability assays to estimate bacterial fitness. Unexpectedly, we found many synonymous variants whose expression was toxic to E. coli. Unlike previously studied effects of synonymous mutations, the effect that we discovered is independent of translation, but it depends on the production of toxic mRNA molecules. We identified RNA sequence determinants of toxicity, and evolved suppressor strains that can tolerate the expression of toxic GFP variants. Genome sequencing of these suppressor strains revealed a cluster of promoter mutations that prevented toxicity by reducing mRNA levels. We conclude that translation-independent RNA toxicity is a previously unrecognized obstacle in bacterial gene expression.

Significance statement Synonymous mutations in genes do not change protein sequence, but they may affect gene expression and cellular function. Here we describe an unexpected toxic effect of synonymous mutations in Escherichia coli, with potentially large implications for bacterial physiology and evolution. Unlike previously studied effects of synonymous mutations, the effect that we discovered is independent of translation, but it depends on the production of toxic mRNA molecules. We hypothesize that the mechanism we identified influences the evolution of endogenous genes in bacteria, by imposing selective constraints on synonymous mutations that arise in the genome. Of interest for biotechnology and synthetic biology, we identify bacterial strains and growth conditions that alleviate RNA toxicity, thus allowing efficient overexpression of heterologous proteins.

Main text

Although synonymous mutations do not change the encoded protein sequence, they cause a broad range of molecular phenotypes, including changes in transcription1, translation initiation2, 3, translation elongation4, translation accuracy5, 6, RNA stability7, and splicing8. As a result, synonymous mutations are under subtle but non-negligible selective pressure, which manifests itself in the unequal usage of synonymous codons across genes and genomes9-11. Several recent experiments directly measured the effects of synonymous mutations on fitness in bacteria2, 12-17. It has been commonly assumed that fitness depends primarily on the efficiency, accuracy, and yield of translation. Here we show that in the context of heterologous gene expression in E. coli, large effects of synonymous mutations on fitness are translation-independent, and are mediated by RNA toxicity.

To study the effects of synonymous mutations on bacterial fitness, we used an IPTG-inducible, bacteriophage T7 polymerase-driven plasmid to express a collection of synonymous variants of the GFP gene2 in E. coli BL21-Gold(DE3) (henceforth referred to as BL21) cells (see Methods). Without IPTG induction, there were no discernible differences in growth between strains (Figure 1A). When induced with IPTG, the growth rate of GFP-producing strains was reduced, consistent with the metabolic burden conferred by heterologous gene expression. The growth phenotype varied remarkably between strains expressing different synonymous variants of GFP (Figure 1B, Supp Figure 1). “Slow” variants caused a long lag phase post-induction, indicating that at this stage the cells either stopped growing or died, while “fast” variants showed growth rates closer to those of non-induced cells. Several hours after induction, the slow variants appeared to resume growth (Figure 1B): we found that this was related to the emergence of suppressor strains that could tolerate the expression of these variants (Supp Figure 1D, and see below).

Figure 1. GFP variants are toxic in E. coli.

(A-B) Growth curves of BL21 E. coli cells, non-induced (A) or induced with 1 mM IPTG at t=0h (B). Cells carrying GFP_012 (non-toxic variant, blue), GFP_170 (toxic variant, magenta), pGK8 (empty vector control, black) and 29 other variants (grey) are shown. Each curve represents an average of 9 replicates (3 biological × 3 technical). OD, optical density. (C) Numbers of colony forming units (cfu)/ml at specified time points after induction with 1 mM IPTG. Data points represent averages of 4 replicates, +/- SEM. (D) Semi-quantitative estimation of BL21 cell viability by spot assay. (E) Estimated growth rates of cells expressing GFP variants in DH5α and BL21 strains (averages of at least 6 replicates).

We quantified cell viability post-induction by assessing the colony-forming ability of cells (Figure 1C). Fast variants showed the expected increase in cell numbers post-induction, but slow variants caused a 1000-fold decrease in viable cell numbers. Similarly, spotting of non-induced cells onto LB plates with IPTG showed that the slow variants formed markedly fewer colonies than fast variants (Figure 1D). Microscopic analysis of slow variants showed a decrease in cell number, growth arrest, and in some cases massive cell death following IPTG induction. In the case of fast variants, we observed a normal increase in cell numbers and negligible cell death after induction (Supp Figure 2). These results indicate that certain synonymous variants of GFP cause significant growth defects when overexpressed in E. coli cells, and we will henceforth refer to these variants as “toxic”.

To test if toxicity was specific to T7 promoter-driven overexpression, we analysed growth phenotypes following the expression of a subset of GFP variants using a bacterial polymerase (trp/lac) promoter system (Methods). Although the growth phenotypes measured with bacterial promoter constructs were not as dramatic as with T7-based constructs, presumably because of lower GFP expression levels, growth rates with both types of promoters were correlated with each other (Figure 1E). Interestingly, toxicity increased at high temperature, and decreased at low temperature (Supp Figure 1C). Taken together, these results indicate that the toxic GFP variants cause growth defects in two different E. coli strains, with two types of promoters, possibly through a common mechanism.

To understand if toxicity depends on the process of translation, we selected several toxic and nontoxic variants of GFP and mutated their Shine-Dalgarno (SD) sequences from GAAGGA to TTCTCT to prevent ribosome binding and block translation initiation. As expected, mutation of SD sequences completely inhibited the production of functional GFP protein from all tested constructs (Figure 2A). To our surprise, GFP variants without SD sequences remained toxic, and their effects on growth were indistinguishable from variants with a functional SD sequence (Figure 2B). Western blot analysis confirmed that mutation of the SD sequences ablates GFP expression (Supp Figure 3). We considered the possibility that a cryptic SD element within the coding region allowed translation of a truncated fragment of GFP, which would be consistent with loss of GFP fluorescence and translation-dependent toxicity. However, analysis of the coding regions with the RBS Calculator18 revealed no strong SD consensus sequences. These results raise the possibility that toxicity might arise at the RNA level, rather than at translation or protein level.
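The Shine-Dalgarno substitution described above can be illustrated with a minimal sketch. The two motifs (GAAGGA and TTCTCT) come from the text; the function name `disable_shine_dalgarno` and the toy leader sequence are hypothetical and are not taken from the authors' constructs.

```python
SD_WILD_TYPE = "GAAGGA"  # functional Shine-Dalgarno motif (from the text)
SD_MUTANT = "TTCTCT"     # substitution used to block ribosome binding (from the text)


def disable_shine_dalgarno(construct: str) -> str:
    """Replace the single SD motif in a construct with the mutant motif.

    Raises ValueError unless the motif occurs exactly once, so that a
    second, unintended match is never edited silently.
    """
    seq = construct.upper()
    count = seq.count(SD_WILD_TYPE)
    if count != 1:
        raise ValueError(f"expected exactly one SD motif, found {count}")
    return seq.replace(SD_WILD_TYPE, SD_MUTANT)


# Hypothetical toy construct: leader + SD + spacer + start of a reading frame.
toy_construct = "CCC" + SD_WILD_TYPE + "AAATATACAT" + "ATGGCTAGC"
mutant_construct = disable_shine_dalgarno(toy_construct)
```

Only the motif is changed; the length of the construct and the downstream coding sequence are untouched, which mirrors the logic of the experiment: the transcript is still made, but ribosomes should no longer initiate on it.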

Figure 2. Toxicity of GFP variants is independent of translation.

(A-B) Fluorescence (A) and growth rate (B) of BL21 cells expressing GFP variants with functional and non-functional ribosome-binding sites (RBS). (C-D) Fluorescence (C) and growth rate (D) of cells expressing full-length GFP variants, truncated variants, and variants containing internal stop codons or transcription terminators. Inset in (C) shows location of toxic sequence element in GFP_170 which was calculated based on an analysis of growth rates of 36 shuffled constructs. The Y-axis shows the statistical significance of the association of particular positions with slow growth. Variants derived from non-toxic GFP_012 are shown in blue, and variants derived from toxic GFP_170 are shown in magenta. Full-length constructs, truncated constructs and constructs with internal stop codons have similar growth rates, suggesting that the element of toxicity resides within the truncated fragment and that the mechanism of toxicity is independent of translation. FL, full-length construct; TT, T7 transcription terminator. All data are averages of 9 replicates, +/- SEM.

To identify sequence elements required for toxicity, we selected one of the toxic variants (GFP_170), and a nontoxic variant (GFP_012), and performed DNA shuffling19 to generate constructs that consisted of random fragments of GFP_170 and GFP_012. All the shuffled and non-shuffled constructs we generated encoded the same GFP protein sequence. Analysis of growth rate phenotypes of these shuffled constructs revealed a fragment near the 3′ end of the GFP_170 coding sequence (nt 514-645) that was sufficient to elicit the toxic phenotype (Figure 2C, Supp Figure 4A, B). Some mutations outside of the toxic region partially improved fitness, which might be explained by interactions of the RNA secondary structure between the toxic region and the mutated regions. The GFP_170 mRNA is predicted to have a very low translation initiation rate, due to strong RNA secondary structure near the mRNA 5′ end2. Nevertheless, replacement of the strongly structured 5′ region with an unstructured fragment did not affect toxicity (Supp Figure 4A, B).
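As an illustration of how a chimeric construct between two synonymous parents can be assembled and checked, here is a minimal sketch. It assumes the two parents are the same length and aligned position by position; the toy sequences, the function names, and the single-breakpoint model are assumptions for illustration, not the DNA shuffling protocol itself.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 1, ignoring a trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def chimera(left: str, right: str, breakpoint: int) -> str:
    """Take nt 1..breakpoint from `left` and the remainder from `right`."""
    if len(left) != len(right):
        raise ValueError("parents must have the same length")
    return left[:breakpoint] + right[breakpoint:]


# Hypothetical toy parents, both encoding the peptide MSKGE.
parent_a = "ATGAGCAAAGGCGAA"
parent_b = "ATGTCTAAGGGTGAG"

ok = chimera(parent_a, parent_b, 7)    # breakpoint inside the Lys codon
bad = chimera(parent_a, parent_b, 4)   # breakpoint inside the Ser codon
same_protein = translate(ok) == translate(parent_a)
changed_protein = translate(bad) != translate(parent_a)
```

In the toy data, the breakpoint at nt 4 joins AGC and TCT into ACT, which changes serine to threonine. A translation check of this kind is one simple way to confirm that every chimera still encodes the parental protein.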

The above results led us to hypothesize that the toxicity associated with GFP expression was independent of translation, but depended on the presence of a specific fragment of RNA. To test this hypothesis, we performed growth rate measurements with a series of constructs. First, we isolated the 132-nt toxic region identified in the DNA shuffling experiment, and expressed it on its own, with or without start and stop codons. The expression of the 132-nt fragment of GFP_170 was sufficient for toxicity, whereas the corresponding fragment of GFP_012 did not cause toxicity. The effect of the 132-nt fragments on growth did not depend on the presence of translation start and stop codons (Figures 2C, D), the fragments contained no cryptic translation initiation signals, and FLAG tag fusions showed no detectable protein expression from the GFP_170 fragment in any of the three reading frames (Supp Figure 3B). Second, we introduced stop codons upstream of the toxic fragment in the GFP_170 coding sequence, and in the corresponding positions of GFP_012. This placement of stop codons ensures that ribosomes terminate translation before reaching the putative toxic region of the RNA, while still allowing a full-length transcript to be produced. As expected, internal stop codons abrogated GFP protein production (Figure 2C), but despite the presence of premature stop codons, GFP_170_Stop still caused toxicity to bacterial cells while GFP_012_Stop remained non-toxic (Figure 2D). To remove possible out-of-frame translation, we inserted stop codons into GFP_170 in all three frames, before and after the toxic region, and toxicity remained the same in all cases (Supp Figure 4C). Third, we introduced an efficient synthetic T7 transcription terminator20 upstream of the toxic region in GFP_170 and in the corresponding location in GFP_012. 
Notably, we found that both variants with internal transcription terminators became nontoxic, and GFP_170_TT grew slightly faster than GFP_012_TT (Figure 2D). The GFP_170 fragment also caused toxicity when fused to FLAG tags (in any of the three reading frames), and when fused to fluorescent protein mKate2, it caused toxicity and reduced expression of mKate2 by 50-fold (Supp Figure 4D, E, F). Overall, these data suggest that toxicity is caused by the RNA itself, rather than the process of translation or by the protein produced.

To investigate the sequence determinants of RNA-mediated toxicity, we measured the growth phenotypes of single synonymous mutations within the 132-nt region of GFP_170. Close to half of these mutations reduced or abolished the toxic phenotype, whereas the remaining mutations had no effect (Figure 3A). There was no clear relationship between the position of mutations within the region and their effect on growth, nor was there any relationship between the type of nucleotide introduced and growth. RNA toxicity associated with triplet repeats has been described in eukaryotes21, but we found no triplet repeats in the toxic GFP mRNAs. Consistent with our observation that the toxic effect does not require translation, codon adaptation index was not associated with toxicity (Figure 3B). RNA folding energy, measured either in the immediate vicinity of each mutation, or for the entire 132-nt mutagenized region, was not correlated with toxicity, and we were unable to identify any RNA structural elements associated with the toxic phenotype (data not shown). We further probed the effects of sets of several mutations within the 132-nt toxic region. Of the 98 sets of mutations we introduced within the region, 75 reduced or abolished toxicity, whereas 23 had no effect (Supp Figure 5). In almost all cases, the phenotypes of sets could be deduced from the effects of individual mutations in a simple way: if any mutation in a set abolished toxicity, then the set also did. Four sets did not conform to this rule, indicating potential epistatic interactions between mutations (not shown). Mutations near the 3′ end of the 132-nt fragment had no effect on toxicity, identifying a minimal toxicity-determining region of about a hundred nucleotides that either consists of a single functional element or contains multiple elements whose cooperative action causes toxicity.
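The rule for predicting the phenotype of a mutation set from its individual mutations can be written as a short sketch. The mutation labels, set names, and observed phenotypes below are hypothetical toy values chosen only to show the logic, including one set that breaks the rule.

```python
def predict_set_toxic(mutation_set, abolishing_mutations):
    """Predict toxicity of a set of mutations from single-mutant phenotypes.

    Rule from the text: if any mutation in the set abolishes toxicity on
    its own, the set is predicted to be non-toxic; otherwise it stays toxic.
    """
    return not any(m in abolishing_mutations for m in mutation_set)


# Hypothetical single mutations that abolish toxicity on their own.
abolishing = {"G540A", "C561T"}

# Hypothetical sets: name -> (mutations in the set, observed toxic phenotype).
observed_sets = {
    "set_A": ({"G540A", "T600C"}, False),
    "set_B": ({"T600C", "A630G"}, True),
    "set_C": ({"C561T", "A630G"}, True),  # deliberately violates the rule
}

# Sets whose observed phenotype disagrees with the prediction are
# candidates for epistatic interactions between mutations.
exceptions = [
    name
    for name, (mutations, observed_toxic) in observed_sets.items()
    if predict_set_toxic(mutations, abolishing) != observed_toxic
]
```

Under this simple model, a disagreement between prediction and observation is what would flag a set as a possible case of epistasis, as for the four nonconforming sets mentioned above.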

Figure 3. Multiple sequence elements determine RNA-mediated toxicity.

(A) Growth rates of single synonymous mutants of GFP_170, measured in BL21 strain (averages of 9 replicates). Mutations located throughout the toxic region reduce or abolish toxicity. (B) Relationship between Codon Adaptation Index (CAI) and the growth rate of GFP mutants. Asterisk-marked codons represent the original codon in GFP_170. (C) Growth estimate (optical density) of BL21 cells expressing GFP variants containing fragments: GFP_155 nt 490-720 (N=16, red), GFP_170 nt 514-645 (N=6, green), and other variants (N=163, blue). (D) Spearman correlation analysis of phenotypes measured in BL21 cells and sequence covariates in a set of 190 GFP variants. The size and colour of circles represent the correlation coefficient; crosses indicate non-significant correlations.

Several recent studies examined the effects of synonymous mutations on fitness in bacteria, either in endogenous genes, or in overexpressed heterologous genes2, 12-16. Fitness had been found to correlate with the codon adaptation index (CAI), GC content, RNA folding, protein expression level, a codon ramp near the start codon, and measured or predicted translation initiation rates. We quantified these variables in a set of 190 synonymous variants of GFP, and analysed their impact on fitness. We also considered two candidate toxic RNA fragments (GFP_170, nt 514-645, and GFP_155, nt 490-720), both of which were common to several constructs and appeared to negatively influence fitness (Figures 3C, D). High protein expression was previously shown to correlate with slow growth14, whereas we found positive correlations of fitness with total protein yield or protein yield per cell. These correlations presumably reflect reduced protein yields and cell growth after the induction of toxic RNAs. As seen previously, growth rate and optical density were positively correlated with CAI, and GC content was correlated with optical density2, 16. However, in a multiple regression analysis aimed to disentangle the effects of these covariates, we found that the presence of candidate toxic RNA fragments predicted slow growth in both BL21 and DH5α cells, whereas CAI and GC3 did not (Methods). This suggests that the apparent correlation of CAI or GC content with fitness, observed in this and previous studies2, 16, might result from the confounding effect of toxic RNA fragments (Supp Figure 6A, B). Consistently, an experiment with 22 new, unrelated synonymous GFP constructs spanning a wider range of GC content showed no correlation between GC content and bacterial growth (Supp Figure 6C, D). 
To further test whether toxicity could be explained by unusually high expression of certain GFP variants, we measured the mRNA abundance of 79 toxic and non-toxic RNAs by Northern blots, and correlated GFP mRNA abundance per cell with OD. Although we observed differences in mRNA abundance, mostly related to mRNA folding2, we find no significant correlation between RNA abundance and toxicity (Spearman rho=0.12, p=0.29). Furthermore, we detected no consistent differences in plasmid abundance between toxic and nontoxic variants.
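For readers unfamiliar with the statistic, a Spearman rank correlation of the kind reported above can be computed as in the following sketch. It is a plain-Python illustration that handles ties with average ranks and does not compute a p-value; the toy abundance and OD values are hypothetical and are not the measurements from this study.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: mRNA abundance per cell and optical density.
toy_mrna = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_od = [0.2, 0.1, 0.4, 0.3, 0.5]
rho = spearman_rho(toy_mrna, toy_od)
```

In practice a statistics library such as SciPy (`scipy.stats.spearmanr`) would normally be used, since it also returns the p-value; the sketch only shows what the coefficient measures.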

To study the molecular mechanisms of toxicity caused by mRNA overexpression, we aimed to evolve genetic suppressors of this phenotype. We selected several GFP constructs that showed both strong toxicity and moderate or high GFP fluorescence, and plated bacteria containing these constructs on LB agar plates with IPTG and ampicillin. We observed a number of large white colonies that apparently expressed no GFP, and smaller bright green colonies producing high amounts of the GFP protein (Figure 4A). We hypothesized that the green colonies have acquired a genomic mutation that allowed cells to survive while expressing toxic RNAs. To support this, we cured the evolved strains of their respective plasmids and re-transformed the cured strains with the same plasmid. The re-transformed strains readily formed bright green colonies on IPTG+ampicillin plates, and exhibited faster growth rates in IPTG medium compared to the parental strain. This supported our hypothesis that the mutations were located on the chromosome and not the plasmid. We therefore selected 22 evolved strains and the parental strain for genome sequencing, and used the GATK pipeline for calling variants (Methods).

Figure 4. Isolation and characterization of genetic suppressors of toxicity.

(A) Fluorescence image of LB+Amp+IPTG Petri dish with BL21 cells expressing GFP_003 variant. (B) Genetic organization of lac and DE3 loci in BL21 cells. Dashed lines indicate homologous recombination between the loci in suppressor strains. (C) Sequence variation between the three types of promoters found in the suppressor strains. Substitutions are marked in red. (D) Growth curves and fluorescence of strains carrying the GFP_003 variant: parental BL21 strain (red), suppressor strains (N=7, green), C41 and C43 strains (blue). (E) Growth rates of C41 and C43 cells expressing several GFP variants. GFP_003, GFP_100 and GFP_170 are toxic in the BL21 strain, GFP_012 and GFP_183 are not. Growth curves are averages of 3 replicates.

In all green suppressor strains, we found a single cluster of mutations in the Plac promoter of the T7 polymerase gene that explains the suppressor phenotype (Figure 4B, C, Supp Table 1). The parental BL21 strain contains two alleles of the Plac promoter: the wild-type allele PlacWT controls the lac operon, and a stronger derivative allele PlacUV5 controls T7 RNA polymerase. In the suppressor strains, recombination between these two loci associates PlacWT promoter with T7 polymerase, leading to reduced levels of polymerase and presumably to reduced transcription of GFP. The same Plac promoter mutations were recently observed in the C41(DE3) and C43(DE3) strains of E. coli (the ‘Walker strains’), and were responsible for the reduced T7 RNA polymerase expression, high-level recombinant protein production, and improved growth characteristics of those strains22-24. Similar to our suppressor strains, C41(DE3) and C43(DE3) allowed high protein expression of toxic GFP variants, and little toxicity was observed in these strains (Figure 4D). Taken together, these results support our conclusion that high levels of RNA, rather than RNA translation or protein, are responsible for toxicity.

To test whether translation-independent RNA toxicity might affect genes other than GFP, we turned to the ogcp gene, which encodes the membrane protein oxoglutarate-malate transport protein (OGCP), believed to be toxic for E. coli. OGCP overexpression was originally used to derive the C41(DE3) strain, now commonly used for recombinant protein expression22. As expected, we found that expression of OGCP was toxic to BL21 but not to C41(DE3) cells. In agreement with our observations for GFP, a translation-incompetent variant of OGCP lacking the Shine-Dalgarno sequence was just as toxic to BL21 cells as a translation-competent variant (Supp Figure 7). A translation-competent, codon-optimized variant of OGCP retained toxicity in BL21 cells. These experiments suggest that translation-independent RNA toxicity might be a widespread phenomenon associated with heterologous gene expression in E. coli.

Heterologous protein expression is known to inhibit growth of E. coli. Toxicity is typically attributed to the foreign protein itself, and it is often remedied by lowering expression, reducing growth temperature, or using special strains of E. coli such as C41(DE3). Here we demonstrate that the same strategies and strains also prevent toxicity when RNA, rather than protein, is the toxic molecule. We speculate that other cases of toxicity, previously attributed to proteins, may in fact be caused by RNA. Although the molecular mechanisms of RNA toxicity are presently unclear, we identified several GFP and OGCP variants with similar phenotypes, suggesting that the phenomenon may be common. Interestingly, induction of wild-type APE_0230.1 in E. coli inhibits growth, but a codon-optimized variant does not inhibit growth despite increased protein yield25. In addition, several recent high-throughput studies found unexplained cases of slow growth or toxicity upon the expression of various random sequences in E. coli14, 26, 27. Our results point to RNA toxicity as a possible cause of these observations.

Our results are relevant to the phenomenon of synonymous site selection in microorganisms. Synonymous mutations can influence fitness directly (in cis), by changing the expression of the gene in which the mutation occurs12, 13, 15, or indirectly (in trans), by influencing the global metabolic cost of expression2, 14, 16, 28. Experiments with essential bacterial genes predominately uncover cis-effects, most of them mediated by changes of RNA structure or other properties that influence translation yield. For example, mutations in Salmonella enterica rpsT downregulated the gene, and could be compensated by additional mutations in or around rpsT or by increase of the gene copy number13. Similarly, mutations that disrupted mRNA structure of the E. coli infA gene, through local or long-range effects, explained much variation in fitness across a large collection of mutants12. Protein abundance and RNA structure contribute to the observed trans-effect of mutations14. Although our results are broadly consistent with a role of RNA structure, the specific structure is unknown, and the effects we uncovered are translation-independent, suggesting that a novel mechanism is involved. Toxic RNAs might interact with an essential cellular component, either nucleic acid or protein, and interfere with its normal function. Such interactions might be uncovered by pulldowns of toxic RNAs combined with sequencing or mass spectrometry. Alternatively, RNA phase transitions may be involved; such transitions have been shown to contribute to the pathogenicity of CAG-expansion disorders in Eukaryotes, providing a mechanistic explanation for this phenomenon29. Further studies will address the mechanisms, biotechnology applications, and evolutionary consequences of RNA toxicity in bacteria.

Supplementary Figure 1. Growth phenotypes of GFP variants.

(A-B) Growth rates in BL21 cells (A) and DH5α (B) in the presence of IPTG, sorted from minimum to maximum growth rate in each strain. (C) Growth curves of DH5α cells at different temperatures (23°C, 37°C and 42°C) in presence of IPTG. At 23°C there are minor variations in growth of cells expressing GFP variants, at 37°C there are large variations, and at 42°C, some of the GFP variants fail to grow altogether. GFP_012 (non-toxic, blue), GFP_170 (toxic, magenta), other variants (grey). The growth curves represent averages of at least 6 replicates. (D) Growth curve of BL21 cells expressing GFP_170 (magenta); suppressor isolated after back-diluting cells expressing GFP_170 in presence (red) and absence (grey) of IPTG. The suppressor strain has similar growth phenotypes both in presence and absence of IPTG.

Supplementary Figure 2. Microscopic analysis of cell viability.

Cell viability was estimated for BL21 cells expressing GFP_012 (non-toxic variant) and GFP_170 (toxic variant). Brightfield images give an estimate of cell morphology and densities. GFP and RFP channels were used to determine the number of cells expressing GFP and the number of dead cells stained by propidium iodide (PI), respectively. At 0 min (just before IPTG induction), GFP_012 and GFP_170 cultures have similar cell densities and morphology. For cells expressing GFP_012, we see a steady increase in cell number after induction, and GFP expression appears 30 min after induction. There is no significant cell death (PI-stained cells) at any time point. For cells expressing GFP_170, cell densities do not increase rapidly and most cells lose their morphology. We see a rapid increase in the number of dead cells, and the severity of the phenotype is apparent at the 240 min time point, when PI staining shows only dead cells or debris from dead cells. GFP expression is not seen for GFP_170 because a strong mRNA secondary structure at its 5′ end impedes its translation. The scale bar is 5 µm.

Supplementary Figure 3. Measurement of GFP expression by Western blotting.

(A) Expression of four toxic variants of GFP in the presence and absence of RBS. UI, uninduced control; M, marker. GFP expression was analysed by probing with anti-GFP polyclonal antibody (abcam 290). Ponceau stained blot shows equal loading. (B) GFP_170 toxic fragment (nt 514-645) expression fused to FLAG tag in all three reading frames (S1, S2, and S3) was analysed by probing with monoclonal Anti-FLAG (F3165 sigma). UI, uninduced control; M, marker; C, control sample expressing two Flag-tagged proteins of size 116 and 90 kDa. No FLAG expression was detected from S1, S2 or S3 constructs.

Supplementary Figure 4. The toxic element resides near the 3′ end of GFP_170 and toxicity is independent of translation.

(A) Growth curve for BL21 cells expressing constructs GFP_012, GFP_170 and their shuffled variants JB_015 and JB_016. JB_015 consists of GFP_170 (nt 1-497) and GFP_012 (nt 498-720); JB_016 consists of GFP_012 (nt 1-449) and GFP_170 (nt 450-720). (B) Fluorescence of the shuffled constructs. JB_015 is non-toxic and shows a low level of fluorescence; JB_016 and GFP_170 are toxic and almost non-fluorescent. (C) Growth rate of cells expressing GFP_170 constructs with internal stop codons before and after the toxic fragment (nt 514-645) in all three reading frames. TAA stop codons were inserted at nucleotide positions 469 (stop2_frame1), 470 (stop2_frame2) and 471 (stop2_frame3) upstream of the toxic fragment and 643 (stop3_frame1), 644 (stop3_frame2) and 645 (stop3_frame3) downstream of the toxic fragment. (D) Growth curves of constructs with the toxic fragment from GFP_170 fused to a FLAG tag at the 3′ end in all three reading frames. All three constructs retain toxicity. (E) Growth curves of mKate2 and of the toxic GFP_170 fragment fused to mKate2 at the 5′ end. The fusion construct retains toxicity. (F) Expression of mKate2. No fluorescence is detected when mKate2 is fused with the toxic fragment from GFP_170.

Supplementary Figure 5. Growth analysis of GFP constructs generated by shuffling and multiple synonymous mutations.

(A) 36 constructs were generated by DNA shuffling of GFP_012 (blue) and GFP_170 (orange). All constructs encode full length GFP. Constructs are colour coded according to the sequence identity with GFP_012 and GFP_170. The constructs from top to bottom are arranged in ascending order of their growth (OD 595nm). The highlighted region shows that most constructs having sequence identical to GFP_170 (orange) in 520-620 nt region are toxic. (B) An inset of the highlighted area from Panel A summarizes the results of multiple synonymous mutations that were generated in the toxic region. Each row represents a particular mutated variant and each column represents the nucleotide position. Columns highlighted orange and black represent nucleotides identical to GFP_170 and synonymous substitutions respectively. Each construct has 2-9 substitutions. Synonymous mutations in the region 534-624 nt reduce or abolish the toxicity of GFP_170 but any number of synonymous mutations in 627-642 nt region had no effect on toxicity. All data are averages of 9 replicates, +/- SEM.

Supplementary Figure 6. No correlation between GC3 content and growth rate of GFP variants.

(A-B) The correlation between GC3 content and growth (OD 595nm) of GFP variants in BL21 cells is driven by two toxic RNA fragments shared between a number of variants: GFP_155 nt 490-720, and GFP_170 nt 514-645, marked in orange. After removal of these variants (panel B), we no longer see any relationship between GC3 content and growth. (C-D) There is no relationship between GC3 content and growth in an independent set of 22 GFP constructs, either in DH5α (C) or BL21 (D) strains. All data are averages of 9 replicates, +/- SEM.

Supplementary Figure 7. Spot assay for semi-quantitative estimation of cell viability of BL21 cells expressing OGCP variants.

OGCP-WT (wild-type OGCP), OGCP_noRBS (OGCP lacking a functional RBS) and OGCP_CO (codon-optimized OGCP) variants were cloned into the pGK8 plasmid and transformed into BL21 and C43 strains. In the absence of IPTG, there is no difference in viability between strains or constructs; in the presence of IPTG, the three constructs are toxic in BL21 cells but not in C43 cells.

Supplementary Table 1. Analysis of suppressor genotypes.

15/18 green suppressors showed a complete replacement of the PlacUV5 promoter with PlacWT, and 3/18 showed replacement of PlacUV5 with PlacWeak. 3/4 white suppressors had no changes in the promoter of T7 RNA polymerase, while for 1/4 we could not definitively assign the promoter type.

Posted June 11, 2018.

Subject Area

  • Synthetic Biology