Cotranslational folding of a periplasmic protein domain in Escherichia coli

In Gram-negative bacteria, periplasmic domains in inner membrane proteins are cotranslationally translocated across the inner membrane through the SecYEG translocon. To what degree such domains also start to fold cotranslationally is generally difficult to determine using currently available methods. Here, we apply Force Profile Analysis (FPA) – a method where a translational arrest peptide is used to detect folding-induced forces acting on the nascent polypeptide – to follow the cotranslational translocation and folding of the large periplasmic domain of the E. coli inner membrane protease LepB in vivo. Membrane insertion of LepB’s two N-terminal transmembrane helices is initiated when their respective N-terminal ends reach 45-50 residues away from the peptidyl transferase center (PTC) in the ribosome. The main folding transition in the periplasmic domain involves all but the ~15 most C-terminal residues of the protein and happens when the C-terminal end of the folded part is ~70 residues away from the PTC; a smaller putative folding intermediate is also detected. This implies that wildtype LepB folds post-translationally in vivo, and shows that FPA can be used to study both co- and post-translational protein folding in the periplasm.


Introduction
Most secreted proteins are translocated through Sec-type translocons in the bacterial inner membrane or the eukaryotic endoplasmic reticulum membrane in an unfolded state, and fold once they emerge on the trans side of the membrane [1]. The folding of proteins that are cotranslationally translocated across the membrane can be followed by tracking the formation of specific disulfide bonds or by assaying the appearance of enzymatically active domains as the protein emerges from the bacterial SecYEG or the eukaryotic Sec61 translocation channel [2][3][4][5], but this provides only a gross indication of the formation of folded structure.
Among these methods, FPA is unique in that it can be applied both in vitro and in vivo, and in that it can be used to analyze any kind of event that results in a pulling force on the nascent polypeptide chain, including protein folding. Here, we show that FPA can be used to follow the cotranslational folding of the ~250-residue periplasmic domain of the E. coli inner membrane protein LepB, and that the ~15 C-terminal residues of LepB are not involved in the main folding transition.

The force profile analysis (FPA) assay
Translational arrest peptides (APs) are short stretches of polypeptide that interact with the ribosome exit tunnel to pause translation when the last codon in the portion of the mRNA that codes for the AP is located in the ribosomal A-site [32]. The stalling efficiency of APs is sensitive to pulling forces acting on the nascent chain [26,33,34], and APs can therefore be used as force sensors to report on cotranslational events that generate pulling forces, such as protein folding or insertion of transmembrane segments into the membrane.
In FPA, a force-generating domain in a polypeptide is placed at increasing distances upstream of an AP, and the degree of translational stalling is measured for the corresponding series of protein constructs. A plot of the stalling efficiency vs. chain length, a force profile (FP), provides a map of cotranslational force-generating events with up to single-residue resolution. Further, by using APs of different stalling strengths, FPs can be fine-tuned to optimally reflect different kinds of force-generating events [35,36].
Here, we explore the possibility of using FPA to follow the cotranslational folding of a periplasmic domain in an E. coli inner membrane protein as it emerges from the SecYEG translocon into the periplasm. As illustrated in Fig. 1a, in constructs where the nascent chain is long enough that it can be stretched to the point that a protein domain (or a part of a domain that can form a stable folding intermediate) can reach sufficiently far into the periplasm to start to fold, some of the free energy gained upon folding will be converted to tension in the nascent chain, generating a pulling force on the AP and thereby reducing the stalling efficiency. As shown below, we find that protein folding in the periplasm is readily amenable to analysis by FPA.

Cotranslational membrane insertion and folding of LepB
E. coli LepB is a 324-residue inner membrane protease. It is anchored in the inner membrane by two N-terminal transmembrane helices (TMH1, TMH2) and has a large C-terminal periplasmic domain that contains the active site [37,38]. The signal recognition particle interacts with TMH1 as it emerges from the ribosome exit tunnel at a nascent chain length of ~45 residues [39][40][41], targeting the ribosome-nascent chain complex (RNC) to the SecYEG translocon. TMH1 then inserts into the inner membrane in the Nout-Cin orientation with its N-terminus facing the periplasm. Once TMH2 emerges, it inserts into the membrane in the opposite Nin-Cout orientation and initiates cotranslational, SecA-dependent translocation of the C-terminal periplasmic domain through the SecYEG translocon [42].
To follow the cotranslational insertion of the transmembrane helices and the folding of the periplasmic domain, we made a series of constructs composed of progressively longer N-terminal parts of the LepB protein followed by a 9-residue HA-tag, an 8-residue SecM(Ms) AP (a relatively strong AP [33]), and a 78-residue C-terminal tail, Fig. 1b (see Supplementary Table 1 for sequences of all constructs). We also made longer constructs where full-length LepB is followed by linkers composed of additional HA-tags, the SecM(Ms) AP, and the C-terminal tail [...] Fig. 2b. In a previous study using the SecM(Ms) AP [33], we found that the main peak in a force profile representing the insertion of an artificial TMH of composition 6L/13A into the E. coli inner membrane reaches half-maximal amplitude when the N-terminal end of the TMH is ~50 residues away from the PTC, suggesting that peak I is generated by the insertion of LepB TMH1 into the inner membrane.
A double substitution [F5, L7 → R5, R7] that makes the N-terminus of TMH1 less hydrophobic results in a reduced fFL value at N = 55 (green data point), as expected. The assignment of TMH1 to peak I is further corroborated by previous crosslinking studies [43,44] and by a recent continuous-translation in vitro study in which a FRET signal between an acceptor attached to the N-terminal Met residue of LepB and a donor placed at the cytoplasmic entry to the SecYEG channel is seen when the LepB nascent chain reaches a length of ~50 residues [45].
Peak II reaches half-maximal amplitude at N = 120 residues, a nascent-chain length at which the N-terminal end of the weakly hydrophobic TMH2 is 52 residues away from the PTC, Fig. 2b. Peak II thus represents the membrane insertion of TMH2. Nstart values obtained with the SecM(Ms) AP are typically ~5 residues larger than those obtained with the weaker SecM(Ec) AP [33], and the two TMHs thus probably reach the SecYEG translocon when their N-termini are ~45 residues away from the PTC, as seen for other E. coli inner membrane proteins [46].
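The peaks above are characterized by the chain length at which they reach half-maximal amplitude. The text does not say how this value was extracted from the data, so the following is only a minimal sketch, assuming linear interpolation between neighbouring data points on the rising flank of a peak. The function name `half_max_chain_length` and the data points are hypothetical and are not taken from Fig. 2.

```python
def half_max_chain_length(points):
    """Chain length at which the rising flank of a peak first reaches
    half-maximal amplitude, by linear interpolation.

    points: (N, fFL) pairs covering the rising flank of one peak only.
    Returns None if the half-maximal level is never crossed.
    """
    pts = sorted(points)
    values = [f for _, f in pts]
    baseline, top = min(values), max(values)
    half = baseline + 0.5 * (top - baseline)
    for (n0, f0), (n1, f1) in zip(pts, pts[1:]):
        if f0 < half <= f1:
            # Interpolate linearly between the two flanking constructs.
            return n0 + (half - f0) * (n1 - n0) / (f1 - f0)
    return None


# Hypothetical flank, for illustration only (not measured values).
example_flank = [(40, 0.0), (50, 0.0), (60, 0.25), (70, 0.75), (80, 1.0)]
n_half = half_max_chain_length(example_flank)
```

Taking the baseline as the lowest fFL value on the flank is a simplification; for overlapping peaks a different baseline choice may be more appropriate.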
The main peak in the FP is peak V. It has a much higher amplitude than the other peaks, and is also wider. It reaches its half-maximal amplitude at Nstart ≈ 377 residues. Full-length LepB is 324 residues long, hence the C-terminal end of LepB is ~53 residues away from the PTC at this point, suggesting that peak V may represent the folding of the periplasmic domain. If this indeed is the case, the peak should disappear if the folded state is destabilized by mutation.
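The distance estimate above is a simple subtraction, and the same arithmetic links it to the ~70-residue figure quoted for the folded part once the ~15 C-terminal residues that do not take part in the main folding transition are set aside. The sketch below assumes that N counts residues up to the PTC, a convention inferred from the numbers in the text; the function and constant names are illustrative.

```python
# Sketch only: the convention distance = N - position is inferred from the text.
LEPB_LENGTH = 324           # residues in full-length LepB (from the text)
N_START_PEAK_V = 377        # chain length at half-maximal amplitude of peak V
C_TERMINAL_NOT_FOLDED = 15  # approximate; residues outside the main transition


def residues_from_ptc(chain_length: int, residue_position: int) -> int:
    """Number of residues separating a given LepB residue from the PTC."""
    if residue_position > chain_length:
        raise ValueError("residue has not been synthesized at this chain length")
    return chain_length - residue_position


c_terminus_distance = residues_from_ptc(N_START_PEAK_V, LEPB_LENGTH)
folded_part_distance = residues_from_ptc(
    N_START_PEAK_V, LEPB_LENGTH - C_TERMINAL_NOT_FOLDED
)
print(c_terminus_distance, folded_part_distance)  # prints: 53 68
```

Because the count of C-terminal residues is itself approximate, the second value should be read as roughly 70 rather than as an exact number.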
We therefore deleted residues 80-105 that include a mostly buried segment in the core of the [...] Given that the distance between the PTC in the ribosome and the periplasmic end of the SecYEG translocon channel is ~160 Å [33], a distance that can be bridged by a largely extended nascent chain (~3.2 Å per residue) of ~50 residues in length, we further conclude that the periplasmic domain has exited the translocon channel before it folds, and therefore that wildtype LepB folds post-translationally when not artificially tethered to the ribosome by a C-terminal linker.
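The ~50-residue estimate follows directly from the two numbers quoted above (~160 Å and ~3.2 Å per residue). A minimal sketch of that calculation, with illustrative names and the ~53-residue value reported for peak V used for comparison:

```python
import math

PTC_TO_PERIPLASM_ANGSTROM = 160.0  # PTC to periplasmic end of SecYEG (from the text)
EXTENDED_RISE_ANGSTROM = 3.2       # per residue, largely extended chain (from the text)


def residues_to_span(distance_angstrom: float, rise_angstrom: float) -> float:
    """Residues needed for an extended chain to bridge a given distance."""
    if rise_angstrom <= 0:
        raise ValueError("rise per residue must be positive")
    return distance_angstrom / rise_angstrom


minimum_linker = residues_to_span(PTC_TO_PERIPLASM_ANGSTROM, EXTENDED_RISE_ANGSTROM)
observed_c_terminus_distance = 53  # residues from the PTC at peak V half-maximum

# The observed distance exceeds the minimum needed to clear the channel.
print(round(minimum_linker), observed_c_terminus_distance >= minimum_linker)
# prints: 50 True
```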
Peaks III and IV are of lower amplitude and may signal the presence of folding intermediates in the P2 domain. Indeed, the deletion of residues 80-105 reduced the amplitude of peak IV [...] Peak III, finally, is not reduced in amplitude by the residue 80-105 deletion, Fig. 2a (light blue data point at N = 209). The two narrow "spikes" at N = 213 and 223 seem to be caused by interactions between the nascent chain and the ribosome exit tunnel, as mutation of bulky and charged residues in the segments YSNVEPSDF (located 21-29 residues from the PTC in the N = 213 construct) to ASNVEASAA and DFVQTFSRRNGGE (located 20-32 residues from the PTC in the N = 223 construct) to AAVQTASAANGGA markedly reduces the amplitude of the spikes (orange data points). We also considered the possibility that peak III may at least in part reflect the formation of the Cys171-Cys177 disulfide in the periplasmic domain (as we recently found for the periplasmic protein PhoA [50]); indeed, mutation of either Cys residue to Ala significantly reduces fFL at N = 223 (p < 0.05, yellow data points).
Because peak III does not seem to represent a major tertiary structure folding intermediate, we did not analyze it further.

Discussion
Taken together with our previous analysis of the cotranslational folding of the periplasmic E. coli enzyme PhoA [50], the data presented here establish FPA as a generally applicable method to study the folding of periplasmic proteins and of periplasmic domains in inner membrane proteins. Given that a ~50 residues long linker segment in an extended conformation should be able to reach from the PTC to the periplasmic end of the SecYEG channel, it is notable that the main folding transition in the periplasmic domain happens only at a linker length of ~70 residues.
We speculate that this may be because the periplasmic domain is anchored to the inner membrane via the two N-terminal TMHs and by additional hydrophobic residues in the periplasmic domain itself, meaning that it will by necessity be located some distance away from the mouth of the SecYEG channel when it folds, as illustrated in Fig. 4. Finally, it is quite remarkable that an AP located at the PTC, deep in the ribosome exit tunnel, can sense folding events taking place in the periplasm via a ~70 residues long linker that passes through both the ribosome and the SecYEG translocon. It will be interesting to test whether other events that take place in the periplasm, such as interactions between nascent polypeptides and periplasmic chaperones or steps in the assembly of outer membrane proteins, can be probed in a similar way.

Enzymes and chemicals
All enzymes used in this study were purchased from Thermo Fisher Scientific and New England Biolabs, with the exception of PfuUltra II Fusion HS DNA Polymerase, which was procured from Stratagene, Sweden. Primers for site-directed mutagenesis, the partially overlapping inverse primers, and the primers used for Gibson Assembly® were designed in silico and ordered from Eurofins Genomics. All gene fragments used for Gibson Assembly® cloning were designed in silico and ordered using the Invitrogen GeneArt Gene Synthesis service, Thermo Fisher Scientific. Plasmid isolation, PCR purification, gel extraction kits, and precast NuPAGE Bis-Tris polyacrylamide gels were from Thermo Fisher Scientific. L-[35S]methionine was obtained from PerkinElmer. Mouse monoclonal antibody against the HA antigen was purchased from BioLegend. Protein-G agarose beads were manufactured by Roche. All other reagents were from Sigma-Aldrich.

Cloning and Mutagenesis
Starting with the previously described pING plasmid carrying a variant of the lepB gene with the M. succiniciproducens SecM(Ms) arrest peptide (AP) [33], the coding sequences for the 9-residue hemagglutinin (HA) tag and a 78-residue C-terminal tail were engineered upstream and downstream of the AP, respectively. In order to generate a force profile for LepB, the lepB [...] After centrifugation for 1 min at 7,000 g, immunoprecipitates were first washed with 10 mM Tris-Cl pH 7.5, 150 mM NaCl, 2 mM EDTA, 0.2% (v/v) Triton X-100 and subsequently with 10 mM Tris-Cl pH 7.5. Samples were spun down again and pellets were solubilized in SDS sample buffer (67 mM Tris, 33% (w/v) SDS, 0.012% (w/v) bromophenol blue, 10 mM EDTA-KOH pH 8.0, 6.75% (v/v) glycerol, 100 mM DTT) for 10 min while shaking at 1,000 rpm.
Samples were incubated with 0.25 mg/ml RNase I for 30 min at 37°C to hydrolyze the tRNA, and subsequently separated by SDS-PAGE. Gels were fixed in 30% (v/v) methanol and 10% (v/v) acetic acid and dried using a Bio-Rad gel dryer model 583.
Radiolabeled proteins were detected by exposing dried gels to phosphorimaging plates, which were scanned in a Fuji FLA-3000 scanner. Band intensity profiles were obtained using the ImageGauge V4.23 software and quantified with our in-house software EasyQuant to determine the fraction of full-length protein, fFL = IFL/(IFL + IA), where IA and IFL are the intensities of the A and FL bands, respectively. Data were collected from three to six independent biological replicates (see Supplementary Data), and averages and standard errors of the mean (SEM) were calculated. A two-sided t-test was used to calculate statistical significance when comparing fFL values for different constructs.
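The quantification described above can be sketched in a few lines. This is not the EasyQuant software, whose internals are not described here, but a minimal illustration of the fFL formula and of the mean and SEM over replicates. It assumes that the SEM is computed from the sample standard deviation (n - 1 denominator); the function names and band intensities are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fraction_full_length(i_fl: float, i_a: float) -> float:
    """fFL = IFL / (IFL + IA) for one gel lane."""
    total = i_fl + i_a
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return i_fl / total


def mean_and_sem(values):
    """Average and standard error of the mean over replicates.

    Assumption: SEM = sample standard deviation / sqrt(n).
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical (IFL, IA) band intensities from three replicates.
replicates = [(30.0, 10.0), (20.0, 20.0), (10.0, 30.0)]
ffl_values = [fraction_full_length(i_fl, i_a) for i_fl, i_a in replicates]
ffl_mean, ffl_sem = mean_and_sem(ffl_values)
```

The comparison between constructs is left out of the sketch because the text does not state which variant of the two-sided t-test was used; with SciPy available, `scipy.stats.ttest_ind` would be one way to run it on two lists of fFL values.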