pOPIN-GG: A resource for modular assembly in protein expression vectors

The ability to recombinantly produce target proteins is essential to many biochemical, structural, and biophysical assays that allow for interrogation of molecular mechanisms behind protein function. Purification and solubility tags are routinely used to maximise the yield and ease of protein expression and purification from E. coli. A major hurdle in high-throughput protein expression trials is the cloning required to produce multiple constructs with different solubility tags. Here we report a modification of the well-established pOPIN expression vector suite to be compatible with modular cloning via Type IIS restriction enzymes. This allows users to rapidly generate multiple constructs with any desired tag, introducing modularity in the system and delivering compatibility with other modular cloning vector systems, for example streamlining the process of moving between expression hosts. We demonstrate that these constructs maintain the expression capability of the original pOPIN vector suite and can also be used to efficiently express and purify protein complexes, making these vectors an excellent resource for high-throughput protein expression trials.

Highlights

pOPIN-GG expression vectors allow for modular cloning, enabling rapid screening of purification and solubility tags at no loss of expression compared to previous vectors.

Cloning into the pOPIN-GG vectors can be performed from PCR products or from level 0 vectors containing the required parts.

Several vectors with different resistances and origins of replication have been generated, allowing the effective co-expression and purification of protein complexes.

All pOPIN-GG vectors generated here are available on Addgene, as well as level 0 acceptors and tags.


Understanding protein function is key to answering many biological questions. Biochemical, structural, and biophysical techniques that probe the molecular mechanisms behind protein function are reliant on the production of purified protein for use in these assays. Procedures for protein expression and purification from Escherichia coli have been advanced by methodologies which allow for the high-throughput generation of constructs (Rosano and Ceccarelli, 2014). Intrinsic to generating soluble protein of the more difficult targets in E. coli is the capacity to test multiple solubility tags, such as the Small Ubiquitin-like modifier (SUMO) or the Maltose Binding Protein (MBP) tags, which allow for the production of proteins that would be otherwise insoluble (di Guana et al., 1988; Malakhov et al., 2004). Further, the use of purification tags frequently allows for rapid capture of proteins of interest from cell lysates.

However, purification and solubility tags are often vector-linked, encoded in the expression vector upstream of the cloning site of the target gene. As such, the lack of modularity of purification and solubility tags in expression vectors presents a bottleneck in high-throughput expression screens as the user is limited to the solubility tags encoded in the vectors available to them. Therefore, tackling the problem of modularity represents an opportunity to increase the efficiency of expression trials, and readily allows for the incorporation of novel purification and solubility tags as they are developed.

The pOPIN vector suite, generated by the Oxford Protein Production Facility (OPPF), is a set of expression vectors encoding various purification and solubility tags at either the N- or C-terminus of the gene of interest (GOI) (Berrow et al., 2007). These vectors allow for a straightforward cloning method via ligation independent cloning (LIC) and rapid generation of constructs. Furthermore, the pOPIN vectors are compatible with multiple hosts, with many being able to be used in bacterial, insect and mammalian cell hosts (Berrow et al., 2007). One shortcoming of these excellent vectors is a lack of modularity, meaning users are restricted to the solubility tags provided in the vector suite.

Golden Gate cloning (also known as Greengate cloning or MoClo) represents a fast and efficient method of cloning genes through the use of Type IIS restriction endonucleases that cut outside their recognition site to reveal user defined four nucleotide overhangs at both the 5' and 3' ends of the DNA (Engler et al., 2008). These overhangs can be exploited to allow scarless cloning, as well as for design and assembly of multiple DNA fragments in a single ligation reaction (Padgett and Sorge, 1996). They also allow for efficient subcloning between vectors (Engler et al., 2008). Golden Gate cloning allows for the generation of diverse "level 0" parts, which can be promoters, the GOI, terminators, epitope tags etc. These can then be assembled into "level 1" expression cassettes, which themselves can be further cloned along with other level 1 expression cassettes to give rise to "level 2" multi-gene assemblies.
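To illustrate how a Type IIS enzyme reveals overhangs outside its recognition site, the following is a minimal sketch in Python, not part of the original work. It assumes BsaI (recognition site GGTCTC, cutting one nucleotide downstream on the top strand and five on the bottom strand) and a fragment flanked by one forward and one reverse site. The function name `bsai_overhangs` and the example sequence are hypothetical and chosen only for illustration.

```python
BSAI_SITE = "GGTCTC"     # BsaI recognition site read on the top strand
BSAI_SITE_RC = "GAGACC"  # the same site when it lies on the bottom strand


def bsai_overhangs(seq):
    """Return the (5', 3') four-nucleotide overhangs, in top-strand notation,
    of the fragment released between a forward and a reverse BsaI site."""
    seq = seq.upper()
    fwd = seq.find(BSAI_SITE)      # first forward-oriented site
    rev = seq.rfind(BSAI_SITE_RC)  # last reverse-oriented site
    # Both sites must exist and leave room for two non-overlapping overhangs.
    if fwd == -1 or rev == -1 or rev - 5 < fwd + 11:
        raise ValueError("sequence lacks a usable pair of flanking BsaI sites")
    # Forward site: 6 nt site + 1 spacer nt, then the 4 nt overhang.
    five_prime = seq[fwd + 7:fwd + 11]
    # Reverse site: the 4 nt overhang, then 1 spacer nt before the site.
    three_prime = seq[rev - 5:rev - 1]
    return five_prime, three_prime


# Hypothetical fragment: pad, site, spacer, overhang, insert, overhang, spacer, site, pad
example = "TT" + "GGTCTC" + "A" + "AATG" + "GCCAAA" + "TTCG" + "T" + "GAGACC" + "AA"
print(bsai_overhangs(example))  # ('AATG', 'TTCG')
```

Because the recognition sites sit outside the released fragment, they are lost after digestion and ligation, which is what makes the resulting junctions scarless.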

Moreover, the scarless nature of Golden Gate cloning makes the technique excellent for synthetic design approaches such as assembling chimeric proteins, allowing the generation of protein domain-swaps, or tagging proteins. Due to its high efficiency, modularity, and well-established sequential cloning strategy, Golden Gate cloning has been incorporated in multiple vector systems for eukaryotic expression, such as plants or insect cells (Engler et al., 2014, 2009; Neuhold et al., 2020).

Here, we present a modified pOPIN vector suite we call pOPIN-GG that takes advantage of Golden Gate cloning (Engler et al., 2009, 2008) to incorporate modularity into the pOPIN vectors without disrupting efficacy of expression. These vectors also allow cross-compatibility with other Golden Gate systems, such as plant expression binary vectors (Engler et al., 2014; Patron et al., 2015), and simple one-pot reactions which allow for the bespoke construction of expression vectors containing the GOI and desired purification or solubility tags. We

Cloning into the pOPIN-GG vectors follows standard Golden Gate protocols where matching overhangs between acceptor vector and inserts, revealed by digestion with a Type IIS endonuclease, BsaI, are ligated with T4 ligase (Engler et al., 2009, 2008). Here, we have

For cloning into pPGC, the GOI requires the overhangs 5' AATG and 3' TTCG and the C-terminal tag must have the overhangs 5' TTCG and 3' GCTT. In addition, a GOI encoding an untagged protein can be cloned by using the 5' AATG and 3' GCTT overhangs and one of the C-terminal tagging compatible pPGC vectors, which contain the 5' AATG and 3' GCTT overhangs. Figure 2 visualises the compatible overhangs between vector, tag and insert for cloning to produce N-terminally tagged, C-terminally tagged, and untagged proteins.
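The overhang rules for pPGC described above can be expressed as a simple chaining check. This is a minimal sketch in Python and not a tool provided with the vectors. The overhang values are those stated in the text, while the names `PPGC_ACCEPTOR`, `assembles`, and the part variables are hypothetical.

```python
# (5' overhang, 3' overhang) pairs, as stated in the text for pPGC cloning.
PPGC_ACCEPTOR = ("AATG", "GCTT")  # overhangs presented by the pPGC vector
GOI_FOR_C_TAG = ("AATG", "TTCG")  # GOI to be fused to a C-terminal tag
C_TERMINAL_TAG = ("TTCG", "GCTT")
GOI_UNTAGGED = ("AATG", "GCTT")   # GOI cloned without any tag


def assembles(acceptor, parts):
    """Return True if the parts, taken in order, chain from the acceptor's
    5' overhang to its 3' overhang with every junction matching."""
    expected = acceptor[0]
    for five_prime, three_prime in parts:
        if five_prime != expected:
            return False  # this junction would not ligate
        expected = three_prime
    return expected == acceptor[1]


print(assembles(PPGC_ACCEPTOR, [GOI_FOR_C_TAG, C_TERMINAL_TAG]))  # True
print(assembles(PPGC_ACCEPTOR, [GOI_UNTAGGED]))                   # True
print(assembles(PPGC_ACCEPTOR, [GOI_FOR_C_TAG]))                  # False
```

The last call fails because a GOI ending in TTCG cannot close onto the vector's GCTT overhang without the C-terminal tag part.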

Overhangs can be introduced into a desired sequence through either PCR with primers containing overhangs and a BsaI site (Table 1)

The ability to test multiple purification and solubility tags in E. coli expression trials is important for determining the best conditions for protein production. High-throughput expression trials allow the user to explore potential avenues for successful protein production, and a key step

Ultimately, we developed four pOPIN-GG acceptor vectors, pPGN-C (CarbR) and pPGN-K (KanR) for N-terminal tagging, and pPGC-C (CarbR) and pPGC-K (KanR) for C-terminal tagging (Table 1). Further, to assist with positive clone identification, we introduced a visible red fluorescent protein (RFP) negative selection marker, allowing users to select positive white colonies after transformation. Importantly, the CarbR and KanR versions of the acceptors also contain different origins of replication to allow for efficient co-expression in addition to co-

The AVR-PikF coding sequence (encoding residues 21-113, starting from the end of the signal peptide) was cloned into pOPIN and pOPIN-GG vectors via Infusion cloning and Golden Gate, respectively. Infusion cloning of AVR-PikF in the pOPIN vectors, pOPIN-F, pOPIN-S3C, pOPIN-M and pOPIN-E was performed as described by Berrow et al. (2007). To clone AVR-PikF into the pOPIN-GG constructs, the AVR-PikF sequence was amplified with primers that introduced overhangs, each containing a BsaI Type IIS endonuclease site, at the 5' and 3' ends of the sequence (Table 1)

Expression and purification of the AVR-PikF and OsHIPP19 complex

A 6xHIS-GB1-tagged OsHIPP19 in pPGN-C was co-transformed with an untagged AVR-PikF in pPGC-K into E. coli SHuffle cells and plated on carbenicillin + kanamycin LB agar plates.

GOI can then be combined with one of the pPGC acceptor and C-terminal tag level 0 vectors