Design, construction and optimization of a synthetic RNA polymerase operon in Escherichia coli

Prokaryotic genes encoding functionally related proteins are often clustered in operons. The compact structure of operons allows for co-transcription of the genes, and for co-translation of the polycistronic messenger RNA to the corresponding proteins. This leads to reduced regulatory complexity and enhanced gene expression efficiency, and as such to an overall metabolic benefit for the protein production process in bacteria and archaea. Interestingly, the genes encoding the subunits of one of the most conserved and ubiquitous protein complexes, the RNA polymerase, are not clustered in a single operon. Rather, its genes are scattered in all known prokaryotic genomes, generally integrated in different ribosomal operons. To analyze the impact of this genetic organization on the fitness of Escherichia coli, we constructed a bacterial artificial chromosome harboring the genes encoding the RNA polymerase complex in a single operon. Subsequent deletion of the native chromosomal genes led to reduced growth on minimal medium. However, by using adaptive laboratory evolution, the growth rate was restored to the wild-type level. Hence, we show that a highly conserved genetic organization of core genes in a bacterium can be reorganized by a combination of design, construction and optimization, yielding a well-functioning synthetic genetic architecture.


Operons were first described in the 1960s by Jacob and Monod as "a group of genes regulated by a single operator" (1,2). At present, operons are generally defined as clusters of genes that are co-transcribed as a single polycistronic mRNA. As first observed in the lactose (lac) operon of E. coli (1,

Ever since the discovery of the operon organization, the potential evolutionary forces that drive operon formation have been discussed. Several hypotheses have been suggested to explain how operons could potentially contribute to a selective advantage over individual genes: (i) operons contribute to reduction of genome size and to simplification of gene expression control (5), (ii) operons avoid energy loss through appropriate co-transcription and co-translation of functionally related genes (6), and (iii) operons improve functional horizontal gene transfer (7-9).

In some metabolic pathways and protein complexes, uneven stoichiometries are required. In these cases it has been demonstrated that differential transcription occurs by using multiple promoters (4, 10), while differential translation of the cistrons within the operons is achieved in

The bacterial RNAP core complex consists of five subunits: two copies of the α subunit and single copies of the β, β' and ω subunits (14). The rpoA-encoded α subunit dimer plays a key role in assembly of the RNAP complex, acting as a scaffold for assembling the β and β' subunits (15).

The rpoB and rpoC genes encode the structurally related β and β' subunits, respectively, that make up the hetero-dimeric core of the RNAP complex, of which the β' subunit harbors the catalytic polymerase center (18,19). Most likely the rpoB and rpoC genes are the result of a gene duplication (18,19). In line with this model, a single orthologous RNAP gene still exists in some phages, probably encoding a homo-dimer (19). The bacterial rpoB and rpoC genes are always clustered, often overlapping, and occasionally fused (20). In addition, a functional synthetic rpoB-rpoC fusion protein has been reported (21). In several bacteria the rpoB and rpoC genes reside in an operon with the ribosomal genes rplK, rplA, rplJ and rplL (Fig. 1B, 2B). In E. coli, this operon has a complex regulation:

The only non-essential subunit of the bacterial RNAP core is the ω subunit, which upon knockout leads to growth retardation, but not to cell death (14, 23). The ω subunit is encoded by rpoZ, which in E. coli resides in an operon with trmH, recG and spoT (Fig. 2C). TrmH is a tRNA methyltransferase, and RecG is an ATP-dependent DNA helicase which plays a critical role in DNA repair and DNA recombination (24, 25). SpoT is responsible for the synthesis and degradation of ppGpp, the effector molecule for stringent response, which enables bacterial cells to react to stress conditions by altering expression of many genes (26-29). Interestingly, the primary ppGpp binding site of the E. coli RNAP is located at the interface of the β' and the ω subunits. The ω subunit plays a role in regulating ppGpp-dependent control of RNAP activity (30), and it has been reported to act as a chaperone for the RNAP subunits (31). The ω subunit binds mainly to the β' subunit, close to the active polymerase site, indicating a role in controlling the RNAP catalytic activity (18).

The RNAP α2ββ'ω core forms a holoenzyme with a σ factor to initiate transcription. Bacteria

In this study, we set out to use a synthetic biology approach to test whether this evolutionarily conserved scattering of the RNAP genes can be reorganized into a single operon, and how such a different architecture would impact cellular fitness. For this, we designed and constructed an operon of the RNAP core genes in E. coli. This synthetic operon was introduced on a bacterial artificial chromosome (BAC) and expressed in E. coli. Subsequently, the native RNA polymerase genes on the E. coli chromosome were knocked out to assess the function of the RNA polymerase operon. This led to a slightly lower growth rate on rich medium compared to wild-type E. coli, but to an almost complete loss of growth on minimal medium. However, by adaptive laboratory evolution (ALE) on minimal medium, we could restore growth and even improve the yield of the strain with the synthetic RNAP operon. Overall, this study demonstrates that an evolutionarily conserved operon organization of a core protein complex can be successfully reorganized, suggesting plasticity of genome organization and regulation.

Strains and growth conditions

The E. coli DH10B strain (Invitrogen, suppl. table 2) was used for expression of the synthetic operon.

The provided protocol was adjusted by initially centrifuging the cell cultures at 4700 rpm for 10 min,

Single colonies were picked from these plates and re-streaked on LB agar and on LB agar/Cam30.

CmS colonies were picked and re-suspended in ddH2O to perform colony PCR, to confirm the successful excision of the chloramphenicol resistance gene.

Growth assays of knock-out strain and data analysis

After knock-out was confirmed by colony PCR, growth assays were performed to assess growth rates of the mutant strains. Precultures were prepared on LB for each strain and for wild-type DH10B.

The plate was then incubated at 37 °C in a BioTek ELx800 absorbance microplate reader (Fisher Scientific). The provided reader control software, Gen5, was used to set a measuring protocol consisting of a cycle of 5 min of linear shaking followed by absorbance measurement at 600 nm, for at least 24 h. The data were then exported to an Excel spreadsheet. An in-house MATLAB script was used to process the data, yielding strain-specific growth graphs and doubling times.
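The in-house script itself is not shown here. As an illustration only, the sketch below shows one common way to obtain a doubling time from plate-reader data: a least-squares fit of ln(OD600) against time over an assumed exponential window. The function name `doubling_time`, the window limits `od_min` and `od_max`, and the synthetic data are assumptions made for this example and are not taken from the study; blank subtraction is assumed to have been done beforehand.

```python
import math

def doubling_time(times_h, od600, od_min=0.05, od_max=0.4):
    """Estimate the doubling time (h) from a linear fit of ln(OD600) vs time.

    Only points with od_min <= OD600 <= od_max are used. This window is an
    assumed stand-in for the exponential phase, not a value from the study.
    """
    pts = [(t, math.log(od)) for t, od in zip(times_h, od600)
           if od_min <= od <= od_max]
    if len(pts) < 2:
        raise ValueError("need at least two points in the exponential window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    mu = sxy / sxx  # slope of the fit = specific growth rate (per hour)
    return math.log(2) / mu

# Synthetic (made-up) curve: a culture doubling every 1.6 h, read every 5 min.
times = [i * 5 / 60 for i in range(120)]
ods = [0.01 * 2 ** (t / 1.6) for t in times]
print(round(doubling_time(times, ods), 2))  # prints 1.6
```

With real data the choice of the exponential window strongly affects the result, so the limits should be checked against the plotted curve for each strain.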

Designing and building a synthetic RNAP operon
We rationally designed a synthetic RNAP operon in the order rpoABCZ. First, rpoA was introduced by for rpoC as well, but several attempts to introduce rpoC in the operon on the BAC were not successful.

To introduce rpoC we aimed to switch the antibiotic resistance gene in the BAC to tetR, to select for BAC-rpoABC after transformation into the double knockout strain. Unexpectedly, we did not obtain any transformants harboring rpoABC. This prevented us from following the planned approach, to properly introduce, delete and optimize expression for each RNAP gene one by one.

Therefore, an alternative approach was designed. As we could not continue with the one-by-one optimization of the rpo genes, we decided to assemble the operon at once. Making a combinatorial library of the synthetic rpo operon would lead to a large collection of E. coli strains from which all genomic rpo genes would first need to be deleted and confirmed, leading to a major experimental effort. Hence, we decided to use the previously identified well-performing strong promoter in combination with RBS80 for rpoA, and the native RBS variant for rpoB. In addition, without prior knowledge, we tested the RBS80 variant upstream of rpoC as well as upstream of rpoZ. Downstream of rpoZ, we included an in-house designed synthetic terminator consisting of a stem-loop and a T-stretch (5'-ccccgcttcggcggggttttttt) (Fig. 3). To efficiently assemble all these parts in the relatively large BAC construct at once, we chose to further construct the BAC in Saccharomyces cerevisiae because of its highly efficient recombination system. For that purpose, a BAC-yeast artificial chromosome (BAC-YAC) shuttle vector was constructed. First, we PCR-amplified the bacterial replication system (sopA, sopB, sopC and repE) from a BAC (pBeloBAC11 (39)), as well as the yeast centromere region from a YAC (pHLUM (40)) with a his3 and a ura3 selection marker.
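The stem-loop and T-stretch structure of the terminator sequence quoted above can be checked with a few lines of Python. This is a minimal sketch: the split into a 6-nt stem, a 4-nt loop and a 7-nt T-stretch is our reading of the 23-nt sequence, not a partition stated in the text, and the helper names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    comp = {"a": "t", "t": "a", "c": "g", "g": "c"}
    return "".join(comp[b] for b in reversed(seq.lower()))

def split_terminator(seq, stem_len, loop_len):
    """Split seq into (5' stem arm, loop, 3' stem arm, tail)."""
    stem5 = seq[:stem_len]
    loop = seq[stem_len:stem_len + loop_len]
    stem3 = seq[stem_len + loop_len:2 * stem_len + loop_len]
    tail = seq[2 * stem_len + loop_len:]
    return stem5, loop, stem3, tail

terminator = "ccccgcttcggcggggttttttt"  # sequence as given in the text
# stem_len=6 and loop_len=4 are assumed values chosen by inspection.
stem5, loop, stem3, tail = split_terminator(terminator, stem_len=6, loop_len=4)
print(stem5, loop, stem3, tail)            # ccccgc ttcg gcgggg ttttttt
print(stem3 == reverse_complement(stem5))  # True: the two arms can base-pair
```

Under this reading, the two GC-rich arms are perfect reverse complements of each other and are followed by seven T residues, consistent with the description of a stem-loop followed by a T-stretch.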

Furthermore, the RNAP genes rpoA, rpoB, rpoC and rpoZ (and the aforementioned RBS variants) were PCR-amplified from E. coli DH10B. To allow for eventually knocking out the native rpo genes, the genes encoding the λ-red system (gam, bet, exo) and the Cre recombinase (cre) were PCR-amplified by using plasmid pSC020 (41) as a template. All PCR amplifications were carried out using extended primers that generated specific 50 base pair overhangs to allow for efficient homologous recombination in S. cerevisiae. Next, we transformed S. cerevisiae CEN.PK2-1D (histidine, leucine, tryptophan, uracil auxotroph) with the 8 fragments as described above for homologous recombination and selected for correct assemblies using medium lacking histidine and uracil (Fig. 3).
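The idea of extended primers that create 50 bp overlaps between neighboring fragments can be sketched as follows. This is a hypothetical illustration, not the primer design used in the study: it assumes a circular assembly, places the 50-nt homology arm on the forward primer only, uses an assumed 20-nt template-binding region, and runs on random toy sequences rather than the actual fragments.

```python
import random

OVERHANG = 50  # homology length stated in the text
ANNEAL = 20    # assumed template-binding length; not from the study

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    comp = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(comp[b] for b in reversed(seq))

def assembly_primers(fragments):
    """Return {name: (forward, reverse)} for fragments in assembly order.

    Each forward primer carries the last OVERHANG nt of the upstream
    fragment, so every PCR product overlaps its upstream neighbor by
    OVERHANG nt. The first fragment pairs with the last one (circular).
    """
    primers = {}
    names = list(fragments)
    for i, name in enumerate(names):
        seq = fragments[name]
        upstream = fragments[names[i - 1]]  # index -1 wraps to the last fragment
        forward = upstream[-OVERHANG:] + seq[:ANNEAL]
        reverse = reverse_complement(seq[-ANNEAL:])
        primers[name] = (forward, reverse)
    return primers

# Toy sequences (random, seeded) standing in for real fragments.
rng = random.Random(0)
toy = {name: "".join(rng.choice("ACGT") for _ in range(200))
       for name in ("frag1", "frag2", "frag3")}
primers = assembly_primers(toy)
print(len(primers["frag2"][0]))  # 70 = 50 nt overhang + 20 nt annealing
```

Other layouts are equally possible, for example splitting the homology between the forward and reverse primers of adjacent fragments; the sketch only illustrates how an overhang ties each fragment to its neighbor.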

the native genes were knocked out using λ-red recombination (Fig. 3). For this, a repair template containing a chloramphenicol resistance gene flanked by lox66 and lox72 sites was used (35). The repair template was PCR-amplified using primers harboring overhangs homologous to the knockout location. Using this approach, the chromosomal rpoA, rpoB-rpoC and rpoZ genes of the BAC-YAC-containing E. coli strain were deleted consecutively (Fig. 3). The successful genomic deletions were confirmed initially by PCR, and finally by genome sequencing. This confirmed that the synthetic operon could fully replace the scattered genomic rpo genes.

To assess the growth of this newly created strain, named strain JH10B, growth assays were performed on rich medium (LB) and minimal medium (M9+glucose), respectively (suppl. fig. 1). It was found that on rich medium the growth rate of JH10B was slightly lower (approximately 7%) than that of the wild-type E. coli DH10B. On minimal medium, however, no growth was observed within the first two days for the engineered strain harboring the RNAP operon (suppl. fig. 1).

In an attempt to recover the ability of strain JH10B to grow on minimal medium, we decided to perform adaptive laboratory evolution (ALE) on M9+glucose. For this, two colonies (biological replicates a & b) were randomly selected and grown in 10 mL M9+glucose until the OD600 was at least 0.4. After this, 10 μL (0.1%) was transferred to a fresh tube with 10 mL M9+glucose, and after each passage a sample was taken for storage. After the first passage the obtained strain JH10B-ALE-1 already started growing on M9+glucose, and after 12 passages (strain JH10B-ALE-12) a plate reader experiment was done to assess growth of selected generations of the adapted strains (JH10B-ALE-1, -2, -3, -4, -8, -12) in minimal and rich medium, compared to the wild-type strain (Fig. 4, suppl. fig. 2). Although the lag phase of the wild-type is shorter than the lag phases of the ALE strains, the growth rate of the evolved strains is higher (up to 23%). Already after one round of ALE, the doubling time of both biological ALE-1 replicates (strains JH10B-ALE-1a/b; 2.2 and 1.8 h, respectively) was comparable to the wild-type strain (1.6 h), whereas after the second round of ALE the evolved strain (JH10B-ALE-2b) grows faster than the wild-type on M9+glucose medium (1.4 h). Additionally, the yield (final OD600) of some of the evolved strains (JH10B-ALE3b; 1.39) is substantially higher than the yield of the wild-type strain (0.66) (suppl. fig. 2). While this experiment enhanced the growth rate of JH10B on M9+glucose, growth on LB did decrease (approximately 42% for JH10B-ALEa and 35% for JH10B-ALEb) during the experiment (suppl. fig. 3).

rpoA (Y68C) was found not only in both ALE1 replicates, but also in passage 0, before ALE (Table 1).

Sequencing results indicate that this mutation appeared either during PCR amplification or during

glucose led to faster growth in this condition (Fig. 4), but to reduced growth on LB medium (suppl. fig. 3). The obtained amino acid substitutions can have different effects on the subunit, and on the RNAP

transcription, translation, glycolysis). Then the genome was split in eight segments, and for each segment separately all genes belonging to one category were co-localized within this segment. However, this meant that some original operon structures had to be rearranged. To this end, a relatively random approach was followed, in which promoters/RBSs were manually selected without any optimization to regulate the reorganized genes. This highly randomized approach still led, maybe surprisingly, to viable cells after modularization of one out of eight segments. However, the other seven reorganized segments did not lead to viable cells. The latter result strongly suggests that a rational approach should probably be combined with a random/combinatorial approach as described in this study, to allow for selecting appropriate combinations of promoters and RBSs.

In conclusion, the current study reveals how modularization in bacteria can be performed via a step-wise introduction of genes with synthetic control elements (promoters, RBSs) and subsequent deletion of the native genes. This approach could be the basis to modularize larger parts of the genome. After introduction of a synthetic gene or operon with a small library of synthetic promoters or RBSs, the native gene(s) can be deleted to identify viable clones, which could optionally be further improved by ALE. This strategy will take relatively long, but could be sped up by lab automation. Eventually this may lead to a fully reorganized modular genome, which could be highly beneficial for easy 'swapping' of modules when engineering cells towards desired applications.

All data included in this study are available upon request from the corresponding author.

We thank dr. Sjoerd Creutzburg for supplying plasmid pSC020 and the synthetic terminator, and dr.

Authors state no conflict of interest.