Cryo-EM Structure of an Atypical Proton-Coupled Peptide Transporter: Di- and Tripeptide Permease C

Proton-coupled Oligopeptide Transporters (POTs) of the Major Facilitator Superfamily (MFS) mediate the uptake of short di- and tripeptides in all phyla of life. POTs are thought to constitute the most promiscuous class of MFS transporters, with the potential to transport more than 8400 unique substrates. Over the past two decades, transport assays and biophysical studies have shown that various orthologues and paralogues display differences in substrate selectivity. The E. coli genome codes for four different POTs, known as Di- and tripeptide permeases A-D (DtpA-D). DtpC was shown previously to favor positively charged peptides as substrates. In this study, we describe how we determined the structure of the 53 kDa DtpC by cryogenic electron microscopy (cryo-EM), and provide structural insights into the ligand specificity of this atypical POT. We collected and analyzed data on the transporter fused to split superfolder GFP (split sfGFP), in complex with a 52 kDa Pro-macrobody and with a 13 kDa nanobody. The latter sample was more stable and rigid, and a significant fraction of it was dimeric, allowing us to reconstruct a 3D volume of DtpC at a resolution of 2.7 Å. This work provides a molecular explanation for the selectivity of DtpC, and highlights the value of small and rigid fiducial markers such as nanobodies for structure determination of low molecular weight integral membrane proteins lacking soluble domains.
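The figure of 8400 potential substrates can be reproduced by simple counting. The line below is a sketch that assumes only the 20 proteinogenic amino acids are considered and that modified or non-standard residues are ignored; the symbols N_di and N_tri are introduced here for illustration only.

```latex
N_{\mathrm{di}} + N_{\mathrm{tri}} = 20^{2} + 20^{3} = 400 + 8000 = 8400
```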


INTRODUCTION
Membranes of cells compartmentalize metabolic processes and present a selective barrier for permeation. To preserve the characteristic intracellular milieu, membrane transporters with specialized functions have evolved to maintain the nutrient homeostasis of cells (Zhang et al., 2019). Many of those are energized by an electrochemical proton gradient, providing a powerful driving force for transport and accumulation of nutrients above extracellular concentrations. Proton-dependent oligopeptide transporters (POTs) of the Solute Carrier 15 family (SLC15) are representatives of such secondary active transport systems and occur in all living organisms except Archaea. They allow an efficient uptake of peptides and amino acids in bulk quantities (Daniel et al., 2006; Thwaites and Anderson, 2007). The best characterized members are the two mammalian transporters PepT1 and PepT2, which are known to play crucial roles in human health, being responsible for the uptake and distribution of nutrients such as di- and tripeptides (Brandsch et al., 2004; Smith et al., 2013; Spanier and Rohm, 2018; Viennois et al., 2018). They also play key roles in human diseases, and impact the pharmacokinetic profiles of orally administered drug molecules (Daniel, 2004; Brandsch, 2009; Ingersoll et al., 2012; Hillgren et al., 2013; Colas et al., 2017; Heinz et al., 2020). SLC15 transporters belong to the Major Facilitator Superfamily (MFS). MFS transporters share a well-characterized fold, consisting of twelve transmembrane helices organized in two six-helix bundles, and are expected to function according to the alternate access mechanism (Jardetzky, 1966), in which the substrate binding site is alternately exposed to either side of the membrane.
Therefore, substantial conformational changes are required to complete an entire transport cycle with at least three postulated states: (i) inward-open, (ii) occluded, and (iii) outward-open (Yan, 2015; Drew and Boudker, 2016; Quistgaard et al., 2016; Bartels et al., 2021; Drew et al., 2021). POTs have been intensively studied on a structural and biochemical level over the last 30 years. More than 50 entries for this transporter class can be found in the Protein Data Bank, representing ten different bacterial homologues and the mammalian PepT1 and PepT2 transporters, bound to a limited set of substrates and drugs (Newstead et al., 2011; Solcan et al., 2012; Doki et al., 2013; Guettou et al., 2013; Guettou et al., 2014; Lyons et al., 2014; Zhao et al., 2014; Quistgaard et al., 2017; Martinez Molledo et al., 2018a; Martinez Molledo et al., 2018b; Minhas et al., 2018; Nagamura et al., 2019; Ural-Blimke et al., 2019; Killer et al., 2021; Parker et al., 2021; Shen et al., 2022; Stauffer et al., 2022). Although bacterial and eukaryotic POTs share an overall conserved binding site, individual amino acid changes in or in close vicinity of the binding site are likely responsible for observed differences in affinities and selectivity for particular peptides and drugs among the studied POT homologues. Here, structural biology studies are particularly crucial to understand substrate promiscuity and drug coordination on a molecular level. While bacterial POT structures, determined mainly by X-ray crystallography, represent exclusively the inward-open or inward-open partially occluded state, the mammalian PepT1 and PepT2 transporters were recently captured in various conformations by single particle cryo-EM, advancing the mechanistic understanding of the entire transport cycle (Killer et al., 2021; Parker et al., 2021). Despite their small size of typically only 50 kDa for an individual transporter unit, these systems are becoming increasingly accessible to single-particle cryo-EM approaches.
Indeed, in 2021, more MFS transporter structures were determined by single-particle cryo-EM (17 PDB entries; resolution range 3.0-4.2 Å) than by X-ray crystallography (14 PDB entries; resolution range 1.8-3.6 Å).
Here we describe the structure determination of the bacterial POT DtpC by single particle cryo-EM. Because the transporter displays no characteristic cytoplasmic or periplasmic features that could help drive particle alignment, we applied different strategies previously described in the literature to increase its overall size and overcome this limitation. We i) fused the transporter to split-sfGFP (Liu et al., 2020; Liu et al., 2022), ii) raised different nanobodies against DtpC (Pardon et al., 2014) and iii) extended the nanobody to a Pro-macrobody (Brunner et al., 2020; Botte et al., 2022). The various samples were subsequently imaged by cryo-EM and analysed. DtpC in complex with the conformation-specific nanobody 26 turned out to be more rigid, and a significant fraction of this sample was dimeric, allowing us to reconstruct DtpC to 2.7 Å resolution. The DtpC structure now provides molecular insights into how selectivity within this transporter family is achieved.

Different Fiducial Marker Strategies for Structure Determination
Since MFS transporters typically lack additional domains outside their transport unit, which is a major impediment for accurate particle alignment in single particle cryo-EM approaches, we assessed three fiducial marker strategies introducing additional density outside of detergent micelles containing DtpC, by analyzing the quality of 2D class averages (Figure 1). To obtain conformation-specific binders against DtpC, we first immunized llamas with recombinant DtpC and selected nanobodies (Nbs) following standard procedures (Pardon et al., 2014). Three out of five selected binders (Nb17, Nb26, and Nb38) co-eluted with DtpC on gel filtration (Supplementary Figure S1) and increased the melting temperature of the respective DtpC-Nb complex by 20°C, 16°C, and 12°C, respectively (Figures 2A,B). DtpC in complex with Nb17 and Nb26 yielded crystals in various conditions, but despite extensive optimization efforts, the crystals of the DtpC-Nb26 complex did not diffract X-rays better than 5 Å resolution. In a second step, we decided to increase the size of Nb26, which formed a tight complex with DtpC, by fusing one copy of the maltose binding protein (MBP) to its C-terminus as described previously (Botte et al., 2022). This resulted in a 52 kDa Pro-macrobody (short Mb26), and we expected it to bind to the periplasmic side of the transporter as seen in other MFS transporter-Nb complexes (Figure 1). In a third approach, we fused the two self-assembling parts of split-sfGFP to DtpC, with β1-6 on the N-terminus and β7-11 on the C-terminus. We named this construct split sfGFP-DtpC FL . In order to minimize the mobility between the membrane protein and the split sfGFP fiducial, we also generated two additional constructs where the last five (split sfGFP-DtpC 1-475 ), or ten residues (split sfGFP-DtpC 1-470 ) of the transporter were deleted. We then assessed proper folding and complementation by monitoring the fluorescence of the chromophore on an HPLC system (Figure 2C). 
All constructs eluted at similar retention times and the fluorescence was highest in the non-truncated construct (split sfGFP-DtpC FL ) and lowest in the most truncated version (split sfGFP-DtpC 1-470 ). In order to extend this observation to other MFS transporters, we repeated this experiment with the human POT homologue PepT1, and noticed a similar trend upon shortening of the termini. Yet, since the decrease of fluorescence was only minor in split sfGFP-DtpC 1-475 in comparison to split sfGFP-DtpC FL , we proceeded to imaging with the shorter construct in the presence of Nb26.
The particle density and distribution in the vitrified solution was similar in the three imaged samples. However, DtpC-Nb26 produced the best 2D class averages considering the sharpness of secondary structure elements inside the micelle, as judged by visual inspection (Figure 1, Supplementary Figure S2). The Pro-macrobody Mb26 fiducial was clearly visible in 2D class averages, but it adopted various positions in relation to the transporter, therefore making accurate alignment of the particles more difficult than with its shorter but more rigid and stable nanobody version (Figure 1, Figure 3A, B, Supplementary Figure S2). The split sfGFP-DtpC 1-475 -Nb26 sample allowed clear visualization of the transmembrane helices after clustering a small subset of particles, but the majority of particles clustered in classes with blurry density for the split sfGFP fiducial, or with the two complementary parts β1-6 and β7-11 not assembled (Supplementary Figure S2). AlphaFold2 predictions on the imaged construct, as well as on the full-length construct, later suggested a destabilization of the beta-barrel upon increasing termini restraints, resulting in partial unfolding of β7 and exposure of the chromophore to solvent quenching. Interestingly, in silico data indicate that this effect could partially be reverted by adding a linker of five glycine residues between the C-bundle and β7. We conclude that termini restraining using the split-sfGFP approach is a promising fiducial strategy for structural studies of MFS transporters, in addition to the previously demonstrated cases on small membrane proteins (2, 4 and 6 TMs) (Liu et al., 2020; Liu et al., 2022). However, the amount of restraining in larger membrane proteins such as MFS transporters, where both termini are placed far from each other, needs to be optimized experimentally or in silico to produce a stable and rigid fiducial; these are two crucial aspects for high resolution structure determination of MFS transporters by single particle cryo-EM.
As we obtained the best 2D class averages for DtpC with the fiducial marker Nb26, we proceeded to a large data collection (Table 1) and could cluster a subset of dimers within this data set (Figure 3). The presence of different oligomeric species was already expected based on the peak shape of the gel filtration run (Figure 4A). The large mass of the dimer, and the stable and rigid signal of the Nb26 fiducial, allowed us to reconstruct the DtpC-Nb26 dimer to 3.0 Å resolution and model this assembly (Figures 3, 4, Supplementary Figure S3). The quaternary structure consists of a non-symmetrical inverted dimer mediated by interactions through a large hydrophobic interface between the HA-HB helices of DtpC (Supplementary Figure S3). Although other inverted dimers were reported in homologous POT structures (Quistgaard et al., 2017), the source of such arrangements is likely to be artificial. We also investigated the oligomer heterogeneity in solution with small angle X-ray scattering and obtained a good fit at low angles (corresponding to the overall shape of particles in solution) for the cryo-EM volume of the dimer (Supplementary Figure S4). The fit to a monomeric cryo-EM volume was poor, indicating that in detergent solution a significant fraction of DtpC-Nb26 is dimeric. As for the interaction between the membrane protein and the fiducial marker, the CDR3 loop of Nb26 accounts for the strongest interactions with the periplasmic surface of the transporter with two salt bridges, while CDR1 and CDR2 contribute via hydrogen bonding (Figure 5). 3D variability analysis (Punjani and Fleet, 2021) revealed a small degree of flexibility between the two DtpC-Nb26 copies. Therefore, we performed a local refinement, focused on one copy of the membrane protein, which extended the resolution of the reconstruction to 2.7 Å and improved the accuracy of the atomic model for subsequent structural analysis (Figure 4).

FIGURE 1 | Utilization of different fiducial markers to improve particle alignment and 2D averaging from cryo-EM images. From left to right: DtpC-Mb26, split-sfGFP-DtpC 1-475 -Nb26, and DtpC-Nb26 were purified, vitrified on grids and imaged. Single particles were identified, clustered and averaged. The best average from each sample is shown under a representative raw micrograph.

Structural Basis for Ligand Selectivity in DtpC
FIGURE 2 (partial caption) | Below, structure predictions were generated for split-sfGFP-DtpC +5Gly , split-sfGFP-DtpC FL , sfGFP-DtpC 1-475 , and overlaid with sfGFP (PDB accession number 2B3P). The dark-violet coloring corresponds to the fraction of β7 which is properly folded in sfGFP while unfolded in the restrained chimeric construct. The right panel shows HPLC chromatogram profiles monitoring the fluorescence of the chromophore of split sfGFP in the context of the indicated constructs, using 480 nm as excitation wavelength and recording the emitted light at 510 nm.

The DtpC structure revealed the expected and well-known MFS transporter fold, with twelve transmembrane helices (TMs) organized in two helical bundles and two additional TMs specific for the POT family (known as HA and HB domains). It is highly similar to the previously determined DtpD structure (Zhao et al., 2014), with an overall RMSD value of 1.06 Å between the two (for 335 out of 436 Cα atoms). The peptide binding site of DtpC is exposed to the cytoplasmic side (Figures 3, 4). Almost all bacterial POT structures described so far were determined by X-ray methods in a similar inward facing (IF) conformation. The extent to which the central cavity is open to the cytosol is regulated by a mechanism of occlusion mediated by TM4, TM5, TM10, and TM11, as supported by structures in IF occluded, partially occluded, and open states. In the case of the DtpC structure described here, the IF state is open (Figure 3).
Molecules from the periplasmic side, on the contrary, cannot enter the central cavity. Tight closure of both bundles above the binding site is mediated by a salt bridge between D43 (TM2, N-bundle) and R294 (TM7, C-bundle) and hydrogen bonds between H37 (TM1, N-bundle) and D293 (TM7) as well as R28 (TM1) and N421 (TM11, C-bundle) (Figure 6A). We also analyzed previously determined POT structures with clearly resolved side chain densities, to understand how the IF state is generally maintained in this transporter family. Except for human PepT2 and the POT from Shewanella oneidensis (PepTSo), where the inter-bundle periplasmic salt bridge is formed between TM5 and TM7, the IF state is stabilized in all other analyzed structures by a salt bridge between the tips of TM2 and TM7 (Figure 6B). Additional hydrogen bonding networks, as described in other studies, can occur but vary greatly among different homologues. This analysis highlights that the alternate access mechanisms of canonical and so-called 'atypical' POTs share similarities, such as electrostatic clamping by formation and disruption of salt bridges. The differences in hydrogen bonding patterns, however, could account for the various turnover rates seen among POT homologues. Canonical POTs are characterized by i) the presence of the E1XXE2R motif on TM1 involved in proton coupling and ligand binding, and ii) the ability to accommodate dipeptides, tripeptides, and peptidomimetics, which relies on a set of conserved residues located in the central binding cavity. In DtpC, the E1XXE2R motif has evolved to Q1XXE2Y (where Q1 = N17, E2 = E20, Y = Y21). In all high resolution X-ray structures of canonical POTs, R is in salt-bridge distance to E2 and the C-terminus of substrate peptides. Mutation of either E1 or E2 in the conventional E1XXE2R motif to glutamine residues abolishes uptake (Aduri et al., 2015). 
A reverse mutation in DtpC, from Q1XXE2Y to E1XXE2Y or to E1XXQ2Y, preserves high transport rates, while a Q1XXQ2Y motif significantly decreases them (Aduri et al., 2015). In addition, based on previous molecular dynamics experiments, a salt bridge switching mechanism from R-E2 to R-E1, upon protonation of E2 in the E1XXE2R motif, was proposed (Aduri et al., 2015). These biochemical and in silico data strongly support a dual role of the E1XXE2R motif for both proton and peptide transport, where R can form a salt bridge interaction with the C-terminus of peptides or with E1 when E2 is protonated, and where the deprotonation event of the latter is required to disrupt the R-peptide interaction.
In DtpC, we now observe that the side chain pocket has a different architecture and character in comparison with that of canonical POTs. It displays an overall more acidic groove caused by the presence of the aspartate residue 392, conserved among atypical POTs. This residue has been predicted to be involved in substrate coordination, and mutation of this residue in DtpC and in the homologue DtpD (where the corresponding residue is D395) abolished transport activity (Jensen et al., 2012b; Zhao et al., 2014). Canonical POTs have a conserved serine residue instead, yielding a slightly changed hydrophobicity pattern in the binding site (Figures 7A-D). A structural overlay of DtpC with a canonical POT structure bound to the dipeptide Ala-Phe allows us to position the peptide in the binding site. By replacing the phenyl group with a lysine side chain (generating the known DtpC dipeptide substrate Ala-Lys instead of Ala-Phe), we postulate a putative salt bridge between the carboxyl group of D392 and the ε-amino group of the lysine side chain. This observation, together with previous biochemical work (Jensen et al., 2012b; Aduri et al., 2015), allows us to hypothesize that the selectivity of DtpC for dipeptides with C-terminal lysine or arginine residues is caused by swapping a salt bridge between the recurrent carboxyl group of the peptide terminus and the transporter (R21Y mutation) for a side chain specific salt bridge with D392. Since the R-peptide interaction is lost in DtpC, there is no requirement for E1 to destabilize R-peptide for release, which would explain the presence of a Q1XXE2Y motif instead of E1XXE2R.
In summary, our work provides new insights into promiscuous versus selective substrate recognition in POTs and constitutes a step towards completing the set of E. coli POT structures. Lastly, it illustrates some of the challenges related to high resolution cryo-EM structure determination of MFS transporters devoid of soluble domains, and demonstrates once again the benefit of fiducial markers in overcoming them.

MATERIAL AND METHODS
Expression and Purification of Membrane Protein Constructs: DtpC; split sfGFP-DtpC (full length split sfGFP-DtpC FL , and truncated constructs split sfGFP-DtpC 1-475 and split sfGFP-DtpC 1-470 ); split sfGFP-HsPepT1 (full length split sfGFP-HsPepT1 FL , and truncated constructs split sfGFP-HsPepT1 1-672 and split sfGFP-HsPepT1 10-672 )
The full-length cDNA of DtpC wild type (WT) was amplified from the Escherichia coli genome, and cloned into a pNIC-CTHF vector by ligation-independent cloning (LIC). This vector contains a C-terminal His-tag, a Tobacco Etch Virus (TEV) cleavage site, and a kanamycin resistance gene as selectable marker. The first 6 N-terminal beta strands of sfGFP were fused to the N-terminus of DtpC, and the beta strands 7 to 11 were fused to the C-terminus. We named this construct split sfGFP-DtpC FL . Two additional constructs were cloned with truncations of 5 (split sfGFP-DtpC 1-475 ), and 10 residues (split sfGFP-DtpC 1-470 ), on the C-terminal side of DtpC. HsPepT1 was previously cloned into a pXLG vector containing an expression cassette composed of an N-terminal Twin-Streptavidin tag followed by the HRV-3C protease recognition sequence (Killer et al., 2021). As for DtpC, the two self-assembling parts of split-sfGFP were first fused to the N- and C-termini of the full-length version of HsPepT1. Two further versions were then cloned, with i) a C-terminal truncation of 36 residues (split sfGFP-HsPepT1 1-672 ), and ii) a C-terminal truncation of 36 residues and an N-terminal truncation of 10 residues (split sfGFP-HsPepT1 10-672 ).

FIGURE 4 | High resolution structure determination of DtpC-Nb26. (A) Gel filtration was performed on a preparative column (left) before concentrating the sample to 60 mg/ml and rerunning it on an analytic column on an HPLC system (right), in order to obtain a highly concentrated sample, free of empty detergent micelles. Peak shape already indicates a mixture of different oligomeric species. (B) Representative raw micrograph of the acquired dataset. The applied defocus is -1.5 µm. (C) Summary of the image analysis. The angular assignments from the dimeric reconstruction were used as prior to perform a local focused refinement with reduced angular and translational searches on the masked region illustrated in blue. (D) The Fourier transforms over different shells on frequency space, of two independent volumes (half maps) were compared (FSC) and plotted as a function of spatial frequency, to estimate the overall resolution using the 0.143 cutoff threshold. (E) The two half maps were used as inputs to assess various post-processing strategies.

FIGURE 6 (partial caption) | Homo sapiens (HsPepT2), were all previously captured in the IF state. Here they were analyzed to identify the strongest interaction stabilizing their common conformation. The structures are colored from blue to red, from their N- to C-termini, and the respective PDB accession numbers are indicated. Conserved salt bridges are labelled and highlighted by red dashed lines.
Recombinant DtpC and the three split sfGFP-DtpC constructs were expressed in E. coli C41(DE3) cells grown in terrific broth (TB) media supplemented with 30 μg/ml kanamycin according to established procedures (Löw et al., 2012; Löw et al., 2013). Cultures were grown at 37°C and protein expression was induced with 0.2 mM IPTG at an OD600 of 0.6-0.8. After induction, culture growth continued at 18°C for 16-18 h. Cells were harvested by centrifugation (10,000 × g, 15 min, 4°C), and the pellet was stored at -20°C until further use. Cell pellets were resuspended in lysis buffer (20 mM NaPi at pH 7.5, 300 mM NaCl, 5% (v/v) glycerol, 15 mM imidazole, with 3 ml of lysis buffer per gram of wet weight pellet), supplemented with lysozyme, DNase and 0.5 mM tris(2-carboxyethyl)phosphine (TCEP). The cells were lysed by three cycles using an Avestin Emulsiflex homogenizer at 10,000-15,000 psi. Recovered material was centrifuged to remove non-lysed cells (10,000 × g, 15 min, 4°C) and the supernatant was subjected to ultracentrifugation to separate the membrane fraction (100,000 × g, 1 h, 4°C using an Optima XE-90, Beckman Coulter centrifuge). Membranes were resuspended in lysis buffer supplemented with cOmplete EDTA-free protease inhibitors (Roche), and solubilized by adding 1% n-Dodecyl-β-D-Maltoside (DDM) detergent (Anatrace). The sample was centrifuged for 50 min at 90,000 × g, and the supernatant was applied to Ni-NTA beads for immobilized-metal affinity chromatography (IMAC) on a gravity column. The beads were pre-equilibrated in lysis buffer and incubated with the solubilized membrane proteins for one hour at 4°C on a rotating wheel. Loaded beads were washed with buffer with increasing imidazole concentrations (20 mM NaPi at pH 7.5, 300 mM NaCl, 5% glycerol, 15-30 mM imidazole, 0.5 mM TCEP, 0.03% DDM). 
The proteins were eluted from the column with a buffer containing high imidazole concentration (20 mM NaPi at pH 7.5, 150 mM NaCl, 5% glycerol, 250 mM imidazole, 0.5 mM TCEP, 0.03% DDM) and combined with 1 mg of TEV protease to perform the His-tag cleavage during dialysis overnight at 4°C. The dialysis buffer contained 20 mM HEPES at pH 7.5, 150 mM NaCl, 5% glycerol, 0.5 mM TCEP, 0.03% DDM. The cleaved protein was recovered by negative IMAC, concentrated to 4 ml using a 50 kDa concentrator (Corning® Spin-X® UF concentrators) and run on an ÄKTA Pure system (GE Healthcare Life Sciences), using a HiLoad 16/600 Superdex 200 column for DtpC, and a Superdex 200 Increase 10/300 column for the split sfGFP-DtpC constructs. Fractions containing the protein were pooled, concentrated, flash frozen and stored at -80°C until further use.

Selection, Expression and Purification of Nanobodies Against DtpC
To generate DtpC-specific nanobodies, two non-inbred llamas were injected six times at weekly intervals with a mixture of 94 different proteins including DtpC purified in the detergent DDM (50 µg of each antigen weekly). After 6 weeks of immunization, two separate phage display libraries were constructed, one from each animal, in the pMESy2 vector, which is a derivative of pMESy4 that contains a C-terminal EPEA-tag for affinity purification. After pooling both libraries, nanobodies were selected against individual antigens in two rounds of parallel panning in 96-well plates containing one immobilized antigen in each well. After two selection rounds on DtpC, 60 clones were picked for sequence analysis. Of these, 13 clones encoded antigen-specific nanobodies as tested in ELISA, and they grouped into 5 different sequence families. A nanobody family is defined as a group of nanobodies with a high similarity in their CDR3 sequence (identical length and >80% sequence identity). Nanobodies from the same family derive from the same B-cell lineage and likely bind to the same epitope on the target. Immunizations, library construction, selection by panning and nanobody characterization were performed according to standard procedures (Pardon et al., 2014). Five nanobodies were further characterized.
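The family definition above (identical CDR3 length and >80% sequence identity) can be sketched in a few lines of Python. This is only an illustration of the stated criterion, not the pipeline used in the study: the single-linkage grouping, the function names, and the CDR3 sequences are all hypothetical choices made for this example.

```python
from itertools import combinations


def cdr3_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length CDR3 sequences."""
    if len(a) != len(b):
        raise ValueError("CDR3 sequences must have identical length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_into_families(cdr3_by_clone, min_identity=0.80):
    """Group clones whose CDR3s have identical length and identity > min_identity.

    Assumption of this sketch: families are built by single linkage, i.e. two
    clones end up together if a chain of pairwise-similar clones connects them.
    """
    parent = {name: name for name in cdr3_by_clone}

    def find(name):
        # Follow parent pointers to the representative of the family.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(cdr3_by_clone, 2):
        seq_a, seq_b = cdr3_by_clone[a], cdr3_by_clone[b]
        if len(seq_a) == len(seq_b) and cdr3_identity(seq_a, seq_b) > min_identity:
            parent[find(a)] = find(b)

    families = {}
    for name in cdr3_by_clone:
        families.setdefault(find(name), set()).add(name)
    return list(families.values())


# Hypothetical CDR3 sequences, invented purely for illustration.
example = {
    "cloneA": "AARDYGSSWYEY",
    "cloneB": "AARDYGSSWYDY",  # one mismatch to cloneA (11/12 identical)
    "cloneC": "AAKPLTTRGQHF",  # same length, low identity
    "cloneD": "AARDYGSSWY",    # different length
}
families = group_into_families(example)
```

With these made-up sequences, cloneA and cloneB fall into one family, while cloneC and cloneD each form their own.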
The nanobodies were expressed in E. coli WK6 cells and purified following standard procedures. Specifically, the cell pellet was resuspended in TES buffer (0.2 M TRIS, pH 8, 0.5 mM EDTA, 0.5 M sucrose) supplemented with one protease inhibitor tablet (Roche). Osmotic shock was performed by the addition of diluted TES buffer to release the periplasmic proteins. The solution was first centrifuged for 20 min at 10,000 × g and additionally for 30 min at 100,000 × g. The supernatant was applied to CaptureSelect beads (Thermo Fisher Scientific), which were equilibrated with wash buffer (20 mM NaPi, pH 7.5, 20 mM NaCl). After three column volumes of washing, the nanobody was eluted with 20 mM HEPES, pH 7.5, 1.5 M MgCl2. The nanobodies were further purified on a HiLoad 16/600 Superdex 75 pg column in 20 mM HEPES, pH 7.5, 150 mM NaCl, 5% glycerol, concentrated with a 5 kDa cut-off concentrator, flash-frozen and stored at -80°C until further use.

Expression and Purification of Macrobody 26
The nanobody 26 (Nb26) was first inserted into a pBXNPH3 vector containing a C-terminal penta-histidine tag preceded by an HRV-3C protease recognition sequence. The maltose binding protein (MBP) was then inserted in frame with the 3' end of the nanobody, with two proline residues as a linker between the two genes, similar to what was described in (Botte et al., 2021). The resulting Pro-macrobody 26 (Mb26) was expressed in E. coli WK6 cells as above. The cell pellet was resuspended in TES buffer (0.2 M TRIS, pH 8, 0.5 mM EDTA, 0.5 M sucrose) supplemented with one protease inhibitor tablet (Roche). Osmotic shock was performed by the addition of diluted TES buffer to release the periplasmic proteins. The solution was first centrifuged for 20 min at 10,000 × g and additionally for 30 min at 142,000 × g. The supernatant was further purified by immobilized-metal affinity chromatography (IMAC) on a gravity column. The beads were pre-equilibrated in 20 mM NaPi at pH 7.5, 300 mM NaCl, 5% glycerol, 15-30 mM imidazole, 0.5 mM TCEP and incubated with the supernatant. Loaded beads were washed with increasing imidazole concentrations (20 mM NaPi at pH 7.5, 300 mM NaCl, 5% glycerol, 15-30 mM imidazole, 0.5 mM TCEP, 0.03% DDM). The proteins were eluted from the column with a buffer containing high imidazole concentration (20 mM NaPi at pH 7.5, 150 mM NaCl, 5% glycerol, 250 mM imidazole, 0.5 mM TCEP, 0.03% DDM) and combined with 1 mg of 3C protease to perform the His-tag cleavage. The cleaved protein was recovered by negative IMAC, concentrated to 0.5 ml using a 30 kDa concentrator (Corning® Spin-X® UF concentrators) and run on an ÄKTA Pure system (GE Healthcare Life Sciences), using a Superdex 75 Increase 10/300 column. Fractions containing the protein were pooled, concentrated, flash frozen and stored at -80°C until further use.

AlphaFold2 Predictions
Structures with the following sequences were used as input for AlphaFold2 structure prediction (Jumper et al., 2021), and AMBER relaxation. The best ranked models were used for visualization.

Cryo-EM Sample Preparation, Data Collection, Image Analysis, and Atomic Modelling
One hour before vitrification, the purified protein complexes were thawed on ice and run on a Superdex 200 Increase 5/150 column in 0.015% DDM, 100 mM NaCl, 10 mM HEPES (pH 7.5), 0.5 mM TCEP in order to remove the excess of empty detergent micelles generated earlier upon sample concentration. The top fraction reached a concentration ranging between 3 and 6 mg/ml, and for each sample, 3.6 μl were applied to glow-discharged gold holey carbon 2/1 300-mesh grids (Quantifoil). Grids were blotted for 4 s at blot force 0 after a 1 s wait time, before being vitrified in liquid propane using a Mark IV Vitrobot (Thermo Fisher Scientific). The blotting chamber was maintained at 4°C and 100% humidity during freezing.
All movies were collected using a Titan Krios (Thermo Fisher Scientific) outfitted with a K3 camera and BioQuantum energy filter (Gatan) set to 10 eV. Automated data acquisitions were set using EPU (Thermo Fisher Scientific). The applied defocus ranged between -0.9 µm and -1.8 µm in all datasets.
For DtpC-Nb26 and DtpC-Mb26, movies were collected at a nominal magnification of ×105,000 and a physical pixel size of 0.85 Å, with a 70-μm C2 aperture and 100-μm objective aperture at a dose rate of 19.5 e−/pixel per second. A total dose of 75 e−/Å² was used with 2.8 s exposure time, fractionated in 50 frames. For split sfGFP-DtpC 1-475 -Nb26, movies were collected at a nominal magnification of ×130,000 and a physical pixel size of 0.67 Å, with a 50-μm C2 aperture and 100-μm objective aperture at a dose rate of 19.0 e−/pixel per second. A total dose of 57 e−/Å² was used with 3 s exposure time, fractionated in 40 frames.
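The relation between dose rate, exposure time, pixel size and total dose can be checked with a short calculation. The sketch below assumes that the quoted dose rate refers to physical detector pixels at the stated pixel size and that no other correction applies; the function name is hypothetical.

```python
def total_dose_per_square_angstrom(dose_rate_e_per_px_s, exposure_s, pixel_size_angstrom):
    """Convert a dose rate in e-/pixel/s into a total dose in e-/Å^2.

    Assumes one physical pixel covers pixel_size_angstrom ** 2 of the specimen.
    """
    electrons_per_pixel = dose_rate_e_per_px_s * exposure_s
    return electrons_per_pixel / pixel_size_angstrom ** 2


# Settings reported for DtpC-Nb26 and DtpC-Mb26.
total_dose = total_dose_per_square_angstrom(19.5, 2.8, 0.85)
dose_per_frame = total_dose / 50  # 50 frames per movie

print(f"total dose: {total_dose:.1f} e-/A^2, per frame: {dose_per_frame:.2f} e-/A^2")
```

For the DtpC-Nb26 and DtpC-Mb26 settings this gives about 75.6 e−/Å², in line with the stated 75 e−/Å². Applied to the split sfGFP-DtpC 1-475 -Nb26 settings, the same simple relation gives about 127 e−/Å² rather than the stated 57 e−/Å², so those values may follow a different convention.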
For DtpC-Mb26, 13,257 movies were collected, and 3,062,337 coordinates were picked and used for 2D averaging and clustering. For split sfGFP-DtpC 1-475 -Nb26, 7,602 movies were collected, and 1,049,399 coordinates were picked and used for 2D averaging and clustering. For DtpC-Nb26, 24,333 movies were collected, 6,464,070 coordinates were picked and used for 2D averaging and clustering, and 878,428 particles were used in the final 3D reconstruction. Briefly, the dimeric DtpC-Nb26 population was clustered using 3D class averaging in Relion3.1 (Scheres, 2012). Particle trajectories and cumulative beam damage were further corrected by Bayesian polishing in Relion3.1 (Zivanov et al., 2019), and the resulting shiny particles were exported to cryoSPARCv3 (Punjani et al., 2017) for further 3D clustering via successive heterogeneous refinement cycles using "bad" and "good" volumes as references to denoise the dataset. Non-uniform refinement (Punjani et al., 2020), followed by a local refinement using a soft mask around one transporter unit, resulted in a 2.7 Å reconstruction of DtpC. The overall resolution was estimated in cryoSPARCv3 using the FSC = 0.143 cutoff. Local resolution estimations were also calculated in cryoSPARCv3 using the 0.5 FSC cutoff. The two half maps were used as inputs to assess various post-processing strategies such as cryoSPARC's sharpening tool, DeepEMhancer (Sanchez-Garcia et al., 2021), and Resolve_cryo-em (Terwilliger et al., 2020). The latter led to a slightly better defined contour of the atoms, and was subsequently used for the last atomic-model refinement of DtpC. The initial models of DtpC and Nb26 were generated using AlphaFold2, and refined against the experimental maps; first in Isolde (Croll, 2018), and last in Phenix (Afonine et al., 2018), principally to refine atomic displacement parameters (B-factors) and perform a slight energy minimization while keeping restraints from Isolde's reference model. 
Half-maps and postprocessed maps of the dimeric arrangement and of the focused refinement, as well as the atomic model of DtpC, were deposited in the EMDB and PDB under accession numbers EMD-14618 and 7ZC2, respectively. The atomic model of the dimeric DtpC-Nb26 is available upon request.
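To illustrate how a resolution value is read from an FSC curve at a fixed cutoff such as 0.143, the following minimal sketch interpolates linearly between the two shells that bracket the threshold. It is not the cryoSPARC implementation; the function name and the example curve are hypothetical and do not correspond to the DtpC data.

```python
def resolution_at_fsc_threshold(spatial_freqs, fsc_values, threshold=0.143):
    """Return the resolution (in Å) at which the FSC first drops below threshold.

    spatial_freqs: shell frequencies in 1/Å, in ascending order.
    fsc_values:    FSC value for each shell.
    Returns None if the curve never crosses the threshold.
    """
    if len(spatial_freqs) != len(fsc_values):
        raise ValueError("frequency and FSC lists must have the same length")
    for i in range(1, len(fsc_values)):
        f0, f1 = spatial_freqs[i - 1], spatial_freqs[i]
        c0, c1 = fsc_values[i - 1], fsc_values[i]
        if c0 >= threshold > c1:
            # Linear interpolation between the two bracketing shells.
            f_cross = f0 + (c0 - threshold) * (f1 - f0) / (c0 - c1)
            return 1.0 / f_cross
    return None


# Hypothetical FSC curve, invented for illustration only.
freqs = [0.1, 0.2, 0.3, 0.4]   # 1/Å
fsc = [1.0, 0.9, 0.5, 0.05]
resolution = resolution_at_fsc_threshold(freqs, fsc)
```

The same function can be called with `threshold=0.5` to mimic the cutoff quoted for the local resolution estimates.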

Small-Angle X-Ray Scattering Data Collection and Analysis
Synchrotron SAXS data from solutions of DtpC-Nb26 in β-DDM micelles (SEC-SAXS) were collected on the EMBL P12 beamline (Blanchet et al., 2015) at the PETRA III storage ring (Hamburg, Germany), in a buffer consisting of 0.015% DDM, 100 mM NaCl, 10 mM HEPES (pH 7.5), and 0.5 mM TCEP. Sample (10 mg/ml) was injected onto a Superdex 200 Increase 10/300 column (Cytiva) and run at 0.5 ml/min at 20°C. 3000 successive 1 s frames were collected using a Pilatus 2M detector at a sample-detector distance of 3.1 m and at a wavelength of λ = 0.124 nm (I(s) vs. s, where s = 4πsinθ/λ, and 2θ is the scattering angle). The data were normalized to the intensity of the transmitted beam and radially averaged; the scattering of the solvent-blank was subtracted using CHROMIXS (Panjkovich and Svergun, 2018). Cryo-EM volume maps of DtpC-Nb26 were fit to the scattering data across the low-angle range (shape region only) using EM2DAM (Franke et al., 2017) at a density threshold of 0.1.
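The definition s = 4πsinθ/λ can be made concrete with a small helper that converts a scattering angle into momentum transfer and the corresponding real-space distance. This is a sketch using only the wavelength stated above; the function names and the example angle of 1° are hypothetical and chosen for illustration.

```python
import math


def momentum_transfer(two_theta_deg, wavelength_nm=0.124):
    """Momentum transfer s = 4*pi*sin(theta)/lambda in nm^-1, for scattering angle 2*theta."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm


def real_space_distance_nm(s):
    """Real-space distance d = 2*pi/s (in nm) probed at momentum transfer s (nm^-1)."""
    return 2.0 * math.pi / s


# Example: a scattering angle 2*theta of 1 degree (hypothetical value).
s_example = momentum_transfer(1.0)
d_example = real_space_distance_nm(s_example)
print(f"s = {s_example:.3f} nm^-1, d = {d_example:.2f} nm")
```

Small values of s correspond to large distances, which is why the low-angle range reports on the overall shape of the particles.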

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: http://www.wwpdb.org/, 7ZC2; https://www.ebi.ac.uk/pdbe/emdb/, EMD-14618.

ACKNOWLEDGMENTS
We thank the Sample Preparation and Characterization facility of EMBL Hamburg for support in this project and the beamlines P13 and P14 at EMBL Hamburg for regular access. We acknowledge Instruct-ERIC and the FWO for their support to the Nb discovery and Saif Saifuzzaman for the technical assistance during Nb discovery. All past and current group members are acknowledged for their input to this manuscript and their efforts to crystallize DtpC over the years.