ABSTRACT
RNA binding proteins (RBPs) often engage multiple RNA binding domains (RBDs) to increase target specificity and affinity. However, the complexity of target recognition of multiple RBDs remains largely unexplored. Here we use Upstream of N-Ras (Unr), a multidomain RBP, to demonstrate how multiple RBDs orchestrate target specificity. A crystal structure of the three C-terminal RNA binding cold-shock domains (CSD) of Unr bound to a poly(A) sequence exemplifies how recognition goes beyond the classical π-π-stacking in CSDs. Further structural studies reveal several interaction surfaces between the N-terminal and C-terminal part of Unr with the poly(A)-binding protein (pAbp). This provides first atomistic details towards understanding regulation of translation initiation that is mediated by the interplay of these two proteins with each other and RNA.
INTRODUCTION
RNA binding proteins (RBPs) interact with coding and non-coding RNAs as constitutive partners in ribonucleoprotein (RNP) complexes. The structural and mechanistic knowledge about RNP assemblies is scarce and mainly limited to large molecular machines, like the ribosome(1, 2), RNA polymerases(3–5) or the spliceosome(6–8). These machines are often highly abundant in cells and their target interaction is strong and constitutive, which is advantageous for mechanistic studies. However, many RBPs function as regulatory units, requiring transient and versatile interactions with their binding partners along with fluctuating abundance(9). Thereby these RBPs can respond to environmental changes or developmental cues quickly. The dynamic nature of RBPs is an advantage in their involvement in many regulatory pathways of the cell, including gene expression at all levels ranging from transcription, splicing, polyadenylation, localization, stabilization, and degradation to protein synthesis via their diverse roles in translation(10–13). This rather transient binding nature makes structural studies difficult and explains why RNA binding properties of most RBPs remain unexplored(12).
To ensure specific regulation through RBPs in the many different cellular processes, a certain RNA target specificity of the protein is a prerequisite. RBPs employ a set of RNA binding domains (RBDs) to engage their target RNAs. RBDs are often small and highly conserved domains, with specificities towards single stranded (ssRNA) or double stranded RNA (dsRNA)(14). Although RBDs are the main drivers of protein-RNA interactions, single domains are often not enough to discriminate target from non-target RNAs within the complex transcriptome of the cell(11). Most classical RBDs are around 10 kDa in size and can accommodate three to five contiguous RNA bases specifically(15), which is often insufficient for RNA target recognition(11). Thus, a combination of multiple RBDs within one protein often increases specificity(16, 17). The majority of RBPs are composed of multiple RBDs, either of the same or of different domain types(11). This results in a large combinatorial variety of domain classes, and the diversity of architectures influences the binding mode to the specific target RNA sequence.
One prerequisite for understanding the mechanistic details of the different binding modes by which RBPs achieve target specificity is the determination of RNP structures at an atomistic level. Over the years, there have been several efforts to examine structural features that dictate RNP binding specificity(18, 19). Knowledge of how single RBDs engage their target sequences has increased, and in some cases these studies offered insights into the role of multidomain arrangements in the recognition process(20, 21). However, the interplay of multiple RBDs in a single RBP is far from being understood and may differ from case to case.
Here we use Drosophila Unr, a multidomain RBP, to demonstrate the complexity of RNA binding. Unr is highly conserved among metazoans and contains five ssRNA binding canonical cold-shock domains (CSD) and four additional non-canonical CSDs (ncCSDs), which lack the conserved RNA binding residues and are therefore incapable of independent RNA binding(22). Hundreds of transcripts were identified in previous co-immunoprecipitation studies with Unr(22, 23), reflecting its widespread biological function in diverse cellular processes such as cell migration, differentiation, and apoptosis, which it influences by regulating RNA stability and translation(24–27). A peculiarity of Drosophila Unr is its dual sex specific function during dosage compensation. In males, it promotes the assembly of the MSL complex and is thus at least indirectly part of transcription regulation(28). In female flies, by contrast, it inhibits formation of the same complex via translation repression of msl2 mRNA, which encodes the rate limiting factor of the MSL complex(29, 30). Together with sex-lethal (Sxl), heterogeneous nuclear ribonucleoprotein 48 (Hrp48) and pAbp, Unr binds to the 3’ UTR and thereby inhibits the recruitment of the 43S pre-initiation complex(31–34). Structural details about assembly and action of this translation initiation repressor complex are scarce.
To carry out these widespread biological functions, Unr must interact with other RBPs, acting as a protein-RNA hub that brings binding partners together and stabilizes their interaction(35). One example is its ternary interaction with Sxl and msl2 mRNA during translation repression (29, 30, 36). Another well-established interactor is the poly(A)-binding protein (pAbp) (34, 37), which promotes translation upregulation and protection against mRNA decay through “closed-loop” formation of target mRNAs(38–42). However, in complex with Unr the fate of pAbp target mRNAs depends on the further composition of the RNP complex. On the one hand, both proteins increase the mRNA stability of c-fos in complex with PAIP-1, hnRNP D and NSAP1 (43), whereas on the other hand they downregulate translation when accompanied by Imp1 on pAbp mRNA (44, 45).
Previous structural work on Unr focused on the first CSD and its RNA interaction or on single or multidomain constructs in unbound protein states (22, 36, 46). However, the role of the C-terminal CSDs has not been studied structurally, apart from showing that the last three CSDs bind RNA (22). Here we provide a high-resolution structure of a multidomain construct comprising these last three CSDs, showing for the first time the complexity of Unr-RNA interactions, which goes beyond the classical π-π-stacking of canonical CSDs. The relevance of the interaction was validated in several experiments and increases the knowledge about synergistic RNA binding which may occur in many additional multidomain RBPs. Moreover, we could identify interactions between a surface within the same C-terminal region and additional N-terminal regions of Unr with the Drosophila poly(A)-binding protein (pAbp). This gives atomistic insights into possible arrangements of Unr in a larger RNP translation regulatory complex, where it could rearrange the mRNA conformation and act as a hub for mRNP complex assembly.
MATERIAL AND METHODS
Plasmids
A pETM11 derived plasmid with a His6-affinity tag connected via a tobacco etch virus protease (TEV)-cleavage site to the protein constructs (derived from pBR322; G. Stier) was used for all protein expressions. Constructs of Drosophila Unr full length, CSD789 (A756-D990), CSD78 (A756-K922), ncCSD8 (P840-K922) and CSD9 (G911-D990) were used as described earlier(22). The different pAbp constructs, pAbp full length, pAbp RRM1 (A2-L84), pAbp RRM2 (G90-G176), pAbp RRM3 (G176-A263), pAbp RRM4 (L276-A362), pAbp linker (A362-N561) and pAbp PABC domain (K550-N634), were cloned from SL2 cDNA using the restriction free cloning approach. Point mutations for mutational analyses were inserted using site directed mutagenesis(47).
Protein expression and purification
Protein expression and purification were done as described earlier(22). In brief, the proteins were expressed in E. coli BL21 (DE3) cells (E. coli B dcm ompT hsdS(rB-mB-) gal) using TB or isotope labeled M9 minimal medium, supplemented with 15NH4Cl and, when needed, 13C-D-glucose as sole nitrogen and carbon sources (purchased from Cambridge Isotope Laboratories). The cells were grown at 37°C before they were induced with 0.2 mM IPTG at an OD600 of 1.2 for TB and 0.8 for M9 minimal medium and incubated at 17°C for 16 h.
For protein purification the harvested cells were resuspended in 50 mM Hepes/NaOH pH 8.0, 500 mM NaCl, 30 mM imidazole, 1.4 mM β-mercaptoethanol and 1 M urea and lysed using a French press. Affinity chromatography of the cleared lysate was done using 3 ml Ni-NTA gravity flow columns and the protein was eluted with 500 mM imidazole after an extensive wash with the lysis buffer. The proteins were cleaved using His6-TEV-protease and dialyzed against a low salt buffer (50 mM NaCl) without imidazole overnight at 4°C. Cleavage of pAbp RRM2 was not successful, so the purification was continued and the samples were measured including the His6-tag. After a second nickel affinity purification, which removes the protease and the cleaved tag, the protein was concentrated and injected on an S75 gel-filtration column (GE) for further purification and buffer exchange (20 mM NaP (pH 6.5), 50 mM NaCl, 1 mM DTT). Protein quality and purity were assessed by Coomassie staining and protein quantity by using a NanoDrop or BCA assay kit for the different protein constructs.
NMR spectroscopy
All samples for NMR were measured in presence of 10% D2O and 0.01% NaN3 at 298 K on Bruker Avance III NMR spectrometers with magnetic field strengths corresponding to proton Larmor frequencies of 600 MHz, 700 MHz or 800 MHz equipped with triple resonance gradient cryogenic probe heads (600 and 800 MHz), a room temperature triple resonance probe head (700 MHz) or a room temperature quadrupole resonance probe head (600 MHz).
Experiments for backbone assignments were acquired on 13C and 15N labeled samples using conventional triple resonance experiments (backbone: HNCO, HNCA, CBCA(CO)NH and HNCACB(48)). For pAbp RRM1 0.03 mM (due to decreased solubility during the purification), pAbp RRM2 0.3 mM, pAbp RRM3 0.5 mM, pAbp RRM4 0.7 mM, pAbp linker 0.05 mM and pAbp PABC 1 mM samples were measured. Apodization weighted sampling was used for the acquisition of all spectra(49). These were processed using NMRPipe(50) and assigned with the program Cara(51). The backbone assignments of Unr CSD78, CSD789 and CSD9 were taken from Hollmann et al. (BMRB codes: 34492, 28086 and 34498)(22).
For NMR-based titrations 15N (or 15N2H for CSD789) labeled protein at a concentration of 0.1 mM (0.06 mM for the competitive interaction study) was titrated with various ratios against purchased RNA oligonucleotides (A5, A7, A8, A9, A15, C8 and U8; IDT) or unlabeled protein. A 1H,15N-HSQC spectrum was recorded for each titration point. The RNA stocks were highly concentrated (10 mM) to keep dilution effects as small as possible. The dilution was considered for peak intensity analysis. For protein titrations two samples were prepared to avoid dilution effects. For titration analysis, Sparky(52) was used and the chemical shift perturbations (CSP, in ppm) at a ratio of 1:2 were calculated as described(53). Shifts with a CSP greater than the average plus the standard deviation of all measured shifts were considered significant. The binding affinity reflected by the dissociation constant (Kd) was obtained from a least square fit of the chemical shift changes for different residues during the titration, using the equation given in (53), where x is the total ligand concentration and y is the corresponding CSP, P describes the protein concentration and A is the maximum CSP on saturation obtained as part of the fitting routine.
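The fitting equations themselves are not reproduced in this text. As an illustration only, the sketch below assumes the commonly used 1:1 binding isotherm, which is consistent with the variables defined above (x, y, P, A), and applies the mean-plus-standard-deviation criterion for significant CSPs. All function names are hypothetical, the use of the population standard deviation is an assumption, and the golden-section search is a stand-in for whatever least-squares routine was actually used.

```python
import math
import statistics


def binding_isotherm(x, P, Kd, A):
    """CSP expected at total ligand concentration x (assumed 1:1 binding model)."""
    b = P + x + Kd
    return A * (b - math.sqrt(b * b - 4.0 * P * x)) / (2.0 * P)


def fit_kd(xs, ys, P, kd_lo=1e-3, kd_hi=1e4, iters=100):
    """Least-squares estimate of (Kd, A); Kd is in the units of xs and P."""

    def best_a_and_sse(Kd):
        # For a fixed Kd the model is linear in A, so A has a closed form.
        g = [binding_isotherm(x, P, Kd, 1.0) for x in xs]
        A = sum(gi * yi for gi, yi in zip(g, ys)) / sum(gi * gi for gi in g)
        sse = sum((A * gi - yi) ** 2 for gi, yi in zip(g, ys))
        return A, sse

    # Golden-section search over log10(Kd); assumes a single minimum.
    lo, hi = math.log10(kd_lo), math.log10(kd_hi)
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        m1 = hi - phi * (hi - lo)
        m2 = lo + phi * (hi - lo)
        if best_a_and_sse(10.0 ** m1)[1] < best_a_and_sse(10.0 ** m2)[1]:
            hi = m2
        else:
            lo = m1
    Kd = 10.0 ** ((lo + hi) / 2.0)
    return Kd, best_a_and_sse(Kd)[0]


def significant_residues(csps):
    """Indices of residues whose CSP exceeds mean + one standard deviation."""
    threshold = statistics.mean(csps) + statistics.pstdev(csps)
    return [i for i, c in enumerate(csps) if c > threshold]
```

With protein at 100 μM and ligand concentrations in μM, `fit_kd(xs, ys, P=100.0)` returns Kd in μM together with the fitted maximum CSP.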
Standard pulse sequences were used for the acquisition of R1, R2 and 1H-15N heteronuclear NOE experiments(54, 55) on a 0.1 mM 15N labeled deuterated sample. The relaxation delays were kept constant between all measurements (R1: 1600, 20, 50, 800, 100, 500, 150, 650, 1000, 400, 150 and 20 ms and R2: 25, 12.5, 50, 62.5, 100, 37.5, 75 and 25 ms). Peak integration and data fitting to derive spin relaxation parameters were done using PINT(56, 57). These parameters were used to calculate the rotational correlation time (tc) according to Kay et al.(54).
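The expression for tc is not reproduced in this text. A minimal sketch follows, assuming the widely used approximation derived from Kay et al., tc ≈ sqrt(6·R2/R1 − 7) / (4π·νN), which holds for slowly tumbling molecules with negligible internal motion and exchange. The function name and the example 15N Larmor frequency (about 60.8 MHz on a 600 MHz spectrometer) are illustrative assumptions.

```python
import math


def rotational_correlation_time(r1, r2, nu_n_hz):
    """Estimate tc in seconds from 15N R1 and R2 (both in s^-1).

    nu_n_hz is the 15N Larmor frequency in Hz. The formula is an
    approximation for the slow-tumbling limit, assumed here.
    """
    arg = 6.0 * (r2 / r1) - 7.0
    if arg <= 0.0:
        raise ValueError("R2/R1 too small for the slow-tumbling approximation")
    return math.sqrt(arg) / (4.0 * math.pi * nu_n_hz)


# Hypothetical rates for one residue, measured at 600 MHz (1H).
tc_ns = rotational_correlation_time(r1=0.8, r2=30.0, nu_n_hz=60.8e6) * 1e9
```

In practice the estimate would be averaged over residues in rigid secondary structure, per domain, to compare values such as those reported for CSD78 and CSD9.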
Crystallography
A previously measured NMR sample of CSD789 bound to an A15-mer RNA was dialyzed against 10 liters of crystallization buffer (20 mM Hepes/NaOH pH 7.5, 50 mM NaCl and 1 mM DTT) to reduce the amount of phosphate, before being concentrated to 5 and 10 mg/ml. Several crystallization screens were set up at 7°C and 20°C, and plate-like crystals grew in 0.1 M Tris-Cl (pH 7.0), 0.2 M lithium sulfate and 2 M ammonium sulfate at 20°C. With a final size of 0.1×0.1×0.01 mm, these crystals were frozen in the mother liquor supplemented with 30 % glycerol and measured at the beamline P13 operated by EMBL Hamburg at the PETRA III storage ring (DESY Hamburg, Germany)(58). The crystal diffracted up to 1.2 Å and the data were processed in XDS(59). Molecular replacement was performed with CSD1 as search model (PDB: 4qqb)(36) using Phaser from the Phenix suite(60, 61). Several rounds of model building in COOT(62) and refinement in the Phenix suite were done to further refine the structure.
For the crystal structure of Unr CSD789 bound to pAbp RRM3, crystals started to grow after two days in an equimolar solution of both proteins and an A9-mer RNA oligonucleotide in 0.2 M sodium formate, 0.1 M bis-tris propane pH 6.5 and 20 % (w/v) PEG 3350. After three weeks the rectangular crystals were frozen in the mother liquor supplemented with 40 % glycerol and measured at ID23-2 at the ESRF Grenoble. The crystal diffracted up to 2.9 Å. Data processing was done as described above, except that the structure of CSD789 without RNA was used as search model for the molecular replacement. Data collection, processing, and refinement statistics for both structures are listed in Table S1.
SAXS data acquisition and analysis
Small-angle X-ray scattering (SAXS) data were collected at the BioSAXS beamline BM29 at the ESRF, Grenoble(63), using an X-ray wavelength of 0.992 Å and at beamline P12, operated by EMBL Hamburg at the PETRA III storage ring (DESY Hamburg, Germany)(64), using an X-ray wavelength of 1.24 Å. 30 µl protein samples or buffer were purged through a quartz capillary for the measurements. Data acquisition details and statistics are listed according to community guidelines(65) in Table S2.
Prior to analysis, frames were checked for radiation damage, merged, and buffer subtracted. The data quality was checked by Guinier approximation. The whole analysis was done using the ATSAS 2.7.1 software package(66). CRYSOL(67) and EOM(68) calculations were done using the default settings to derive and fit theoretical scattering curves.
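For readers unfamiliar with the Guinier check, the sketch below illustrates the idea: at low q the scattering follows ln I(q) ≈ ln I0 − (Rg²/3)·q², so a straight-line fit of ln I against q² yields the radius of gyration. This is a plain least-squares illustration and not the ATSAS implementation; the function name is hypothetical, and the choice of the q-range (commonly q·Rg below roughly 1.3 for globular particles) is left to the caller.

```python
import math


def guinier_fit(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) q^2 and return (Rg, I0).

    q and intensity must cover only the low-q Guinier region;
    Rg is returned in the inverse units of q.
    """
    xs = [qi * qi for qi in q]
    ys = [math.log(i) for i in intensity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0.0:
        raise ValueError("non-negative slope: no valid Guinier region")
    return math.sqrt(-3.0 * slope), math.exp(intercept)
```

A curved ln I versus q² plot at low q, or a positive slope, would point to aggregation or interparticle effects, which is why the check serves as a data quality test.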
Structure modeling
For the modeling of CSD789, high resolution structures of CSD78 (PDB: 6Y4H) and CSD9 (PDB: 6Y96)(22) were taken. The modeled structures were calculated using CNS (1.2)(69, 70) in an ARIA framework(71, 72). The structures were generated as described earlier(73). In brief, the single domains were connected to a single molecule, by implementing the missing linker residues. The linker region between CSD8 and CSD9 was randomized during the structure calculations. 5000 structures were calculated and fitted against a SAXS curve of CSD789.
To generate the model for the monomeric CSD789 15-mer poly(A)-RNA complex we first generated a pdb-file with the protein monomer and two 6-mer poly(A)-RNAs, where the first RNA is the molecule in the asymmetric unit and the coordinates for the second RNA were taken from the asymmetric unit adjacent to CSD9 in the unit cell. The residue numbering of the first RNA was kept fixed (1-6) while the second RNA was renumbered to follow the last residue of the first RNA (7-12) or with a gap of one to three additional adenosines (8-13, 9-14, 10-15). All missing residues of the protein including its hydrogen atoms and the RNA 15-mer were then generated in CNS-1.2(69) followed by an energy minimization step with fixed coordinates for the protein and the interacting RNA residues. Only the calculation where one adenosine was inserted in between the two RNA molecules and A6 was left flexible during the minimization gave a 15-mer RNA with proper geometry.
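The renumbering step described above can be sketched as a simple edit of the residue-number field in fixed-column PDB records. This is an illustration under stated assumptions: the function name, the chain identifier of the second RNA, and the input handling are hypothetical and are not taken from the authors' workflow.

```python
def renumber_second_rna(pdb_lines, chain_id, n_first=6, gap=1):
    """Shift residue numbers of the second RNA chain.

    The second RNA is assumed to be numbered 1..6 in chain `chain_id`.
    With n_first=6 and gap=0 it becomes 7-12; gap=1 gives 8-13, and so on.
    Residue numbers occupy columns 23-26 of ATOM/HETATM records.
    """
    offset = n_first + gap
    out = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM")) and len(line) >= 26
        if is_atom and line[21] == chain_id:
            new_number = int(line[22:26]) + offset
            # Rewrite only the residue-number field, keeping all other columns.
            line = f"{line[:22]}{new_number:>4d}{line[26:]}"
        out.append(line)
    return out
```

Calling the function with gap values of 0 to 3 reproduces the four numbering schemes tested in the text (7-12, 8-13, 9-14 and 10-15); the missing nucleotides would then be built by the modeling software.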
To generate the structural model in Figure 7b, we superimposed all available experimental high-resolution structures using PyMOL(74) (PDB accession codes: 4qqb, 6r5k, 6y6m, 6y6e, 6y4h, 6y96(22, 36, 75), and the structures determined in this study, PDB: 7zhh and 7zhr). The model of Unr-CSD1-9 was chosen from the SAXS-MD-based ensemble based on the proximity of ncCSD2 and ncCSD8 to allow interaction of both with pAbp-RRM12 and RRM3, respectively. A structural model of the Unr-ncCSD2-pAbp-RRM12 complex was derived by modelling Drosophila pAbp-RRM12 in the RNA bound state using homology modelling (SWISS-MODEL(76)) based on the human pAbp-RRM12-RNA complex (PDB: 4F02(77)). To dock Unr-ncCSD2 with the pAbp-RRM12 model we used HADDOCK(78). Ambiguous interaction restraints to drive the docking were derived from chemical shift perturbations observed during NMR titrations between both proteins (Figure 5) and from their surface accessibility. These were for Unr-ncCSD2: T268, E269, T284, T285, R287, S289, C296 and F298; and for pAbp-RRM2: D102, K104, Y107, D108, S111, A112, G114 and N115. The default docking settings of the webserver were used. The resulting structure ensemble after water refinement, consisting of 200 structures, was clustered into 9 clusters, of which clusters 1 and 2 were by far the largest with 56 and 51 structures, respectively. Cluster 2 had the best statistics, with the best HADDOCK score and an average RMSD to the lowest energy structure of 0.63 ± 0.5 Å. The energy terms and buried surface area were also best for cluster 2. To incorporate this model into the structural model of the translation repression complex in Figure 7b, the pAbp-RRM domain containing structures (Unr-CSD789-pAbp-RRM3 and Unr-ncCSD2-pAbp-RRM12-RNA) were superimposed with the structure of yeast full-length PABP from PDB: 6R5K(79) to estimate the distance of the RRM domains from each other in the RNA bound state.
All-atom MD simulations of CSD789 and CSD789/RNA
The model of the CSD789/RNA complex was taken as starting conformation for all-atom molecular dynamics (MD) simulations. Simulations were carried out with the Gromacs software(80), version 2020.3. Interactions of the protein and the RNA were described with the Charmm36(81) force field, version March 2019, and the original TIP3P(82) water model was used. The protein and the protein-RNA complex were each placed in a dodecahedral box, where the distance between the protein and the box edges was at least 2.0 nm. The boxes were subsequently filled with water molecules and the systems were neutralized by adding sodium ions. In total, the systems contained at least 157761 atoms. The energy of the two systems was minimized within 400 ps with the steepest descent algorithm.
Subsequently, the systems were equilibrated for 100 ps with harmonic position restraints applied to the backbone atoms of the proteins (force constant 1000 kJ mol⁻¹ nm⁻²). Finally, each system was simulated for 230 ns. The temperature was kept at 293 K using velocity rescaling (τ = 0.1 ps)(83). The pressure was controlled at 1 bar with the Berendsen barostat (τ = 1 ps)(84) during equilibration and with the Parrinello-Rahman barostat (τ = 5 ps)(85) during production simulations. The geometry of water molecules was constrained with the SETTLE algorithm(86), and LINCS(87) was used to constrain all other bond lengths. Hydrogen atoms were modeled as virtual sites, allowing a 4 fs time step. Lennard-Jones potentials with a cut-off at 1 nm were used to describe dispersive interactions and short-range repulsions. The pressure and the energy were corrected for missing dispersion interactions beyond the cut-off. Electrostatic interactions were computed with the smooth particle-mesh Ewald method(88, 89). Visual inspection of the simulations revealed that the RNA-protein contacts were stable throughout the simulation.
To ensure agreement with NOE signals, distance restraints were applied in all simulations. In simulations of CSD789 with RNA, 20 distance restraints were applied. In simulations of CSD789 alone, 53 restraints were applied involving 258 atom pairs. Below a distance of 0.5 nm, no restraining potential was applied. Between 0.5 and 0.6 nm, a quadratic restraining potential with a force constant of 1000 kJ mol⁻¹ nm⁻² was applied. Above 0.6 nm, a linear restraining potential was applied. The distance restraints are listed in Table S3. Distances that can contribute to a single NOE signal were treated simultaneously, implemented by defining these distances with the same restraint index in the Gromacs topology file.
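The piecewise restraint described above can be written out as a short function. This is a sketch rather than the simulation code: it assumes that the linear branch continues the quadratic branch with matching value and slope at 0.6 nm, as in the piecewise form documented for GROMACS distance restraints, since the text does not spell out the linear branch. The function and parameter names are hypothetical.

```python
def restraint_energy(r, r_flat=0.5, r_lin=0.6, k=1000.0):
    """Restraint energy in kJ/mol for a distance r in nm.

    r_flat: distance below which no potential acts (0.5 nm in the text)
    r_lin:  distance above which the potential becomes linear (0.6 nm)
    k:      force constant in kJ mol^-1 nm^-2
    """
    if r < r_flat:
        return 0.0  # flat bottom: restraint satisfied
    if r < r_lin:
        return 0.5 * k * (r - r_flat) ** 2  # quadratic region
    # Linear region, continuous in value and slope at r_lin (assumption).
    return 0.5 * k * (r_lin - r_flat) * (2.0 * r - r_lin - r_flat)
```

The linear tail limits the force on strongly violated restraints to k·(r_lin − r_flat), here 100 kJ mol⁻¹ nm⁻¹, which avoids very large forces when a starting model is far from satisfying an NOE.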
SAXS-driven MD simulation and SAXS calculations of CSD789 and CSD789/RNA
The SAXS-driven MD simulations and the subsequent SAXS calculations were performed with an in-house modification of Gromacs 2018.8, as also implemented by our webserver WAXSiS(90–92) for the SAXS calculations. The source code and documentation are available on GitLab at https://gitlab.com/cbjh/gromacs-swaxs and https://cbjh.gitlab.io/gromacs-swaxs-docs/, respectively. The simulation parameters were identical in MD and SAXS-driven MD simulations. Starting structures for the SAXS-driven simulations were taken from the last frame of the MD simulations. The SAXS restraints were turned on gradually over 15 ns and a force constant of 10 was used during the simulations. SAXS-restrained simulations were carried out for 50 ns. Simulation frames were saved every 2 ps for later analysis. Simulation frames from the time interval between 15 and 50 ns were used for the SAXS calculations. A spatial envelope was built around all solute frames of the protein and protein-RNA complex at a distance of 1.0 nm from all solute atoms. Because solvent atoms inside the envelope contributed to the SAXS calculations, the computed SAXS curves include effects from the hydration layer. The buffer subtraction was carried out using at least 351 simulation frames of a pure-water simulation box, which was simulated for 150 ns and which was large enough to enclose the envelope. The orientational average was carried out using 550 q-vectors for each absolute value of q, and the solvent electron density was corrected to the experimental value of 334 e/nm³, as described previously [12]. To compare the experimental with the calculated SAXS curve, we fitted the experimental curve via I_exp,fit(q) = f·I_exp(q) + c, by minimizing chi-square with respect to the calculated curve. Here the factor f accounts for the overall scale, and the offset c accounts for uncertainties from the buffer subtraction. No fitting parameters owing to the hydration layer or excluded solvent were used, implying that the radius of gyration was also not adjusted by the fitting parameters.
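The two-parameter fit of the experimental curve can be illustrated with a weighted linear least-squares sketch. It assumes that chi-square is the sum of squared residuals between f·I_exp + c and the calculated intensities, weighted by the experimental errors, and that those errors are held fixed rather than rescaled by f; the actual implementation may differ. All names are hypothetical.

```python
def fit_scale_offset(i_exp, i_calc, sigma):
    """Find f and c minimizing sum(((f*I_exp + c - I_calc) / sigma)^2).

    Returns (f, c, chi2_per_point). Only scale and offset are fitted;
    nothing related to hydration layer or excluded solvent is adjusted.
    """
    w = [1.0 / (s * s) for s in sigma]
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, i_exp)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, i_calc)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, i_exp))
    sxy = sum(wi * (x - mean_x) * (y - mean_y)
              for wi, x, y in zip(w, i_exp, i_calc))
    f = sxy / sxx
    c = mean_y - f * mean_x
    chi2 = sum(wi * (f * x + c - y) ** 2
               for wi, x, y in zip(w, i_exp, i_calc)) / len(i_exp)
    return f, c, chi2
```

Because neither f nor c changes the shape of the curve at low q, a fit of this kind cannot compensate for a wrong radius of gyration, which is the point made in the text.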
MD simulations of CSD1–6 and CSD4–9
Ten different conformations each for CSD1–6 and CSD4–9 were taken from the rigid-body modelling and used as starting conformations for MD simulation. MD simulations were carried out with the Gromacs software, version 2019.6. Interactions of the protein were described with the Charmm36(81) force field, version March 2019, and the original TIP3P(82) water model was used. Each of the ten initial conformations was placed in a dodecahedral simulation box, where the distance between the protein and the box edges was at least 1.5 nm. The boxes with CSD1–6 were filled with 384,629 water molecules, and 1 chloride and 10 sodium ions were added to neutralize the systems. In total, these simulation systems contained 1,162,429 atoms. The boxes with CSD4–9 were filled with 628,071 water molecules, and 1 chloride and 9 sodium ions were added to neutralize the systems. These simulation systems contained 1,893,654 atoms. The energy of each simulation system was minimized and equilibrated as described above for CSD789. Finally, each of the ten replicas was simulated for 230 ns without any restraints. The temperature was kept at 293 K using velocity rescaling (τ = 0.1 ps)(83). All other simulation parameters were set as described above for CSD789.
SAXS calculations of CSD1–6 and CSD4–9
SAXS curves were computed from the free MD simulations using the same modified GROMACS version as described above for CSD789. The distance between the protein and the envelope surface was at least 0.2 nm, such that nearly all water atoms of the hydration shell were included in nearly all frames. The buffer subtraction was carried out using 101 simulation frames taken from a 15 nm simulation of a pure-water simulation system, which was large enough to enclose the envelope. The orientational average was carried out using 14,200 q-vectors for CSD1–6 and 15,680 q-vectors for CSD4–9 for each absolute value of q. All other parameters and the fitting protocol were chosen as described above.
Isothermal Titration Calorimetry
All titrations were done on a Malvern MicroCal PEAQ-ITC at 20°C while stirring at 750 rpm. The protein and RNA (IDT) samples were dialyzed against 20 mM NaP (pH 6.5), 50 mM NaCl and 0.5 mM TCEP. Concentrations of the molecules in each experiment are listed in Table S4. Each sample combination was measured at least in duplicate and the MicroCal PEAQ-ITC analysis software was used to integrate, normalize, and fit the data.
Fluorescence polarization assay
RNA oligonucleotides (AAA AAA AUG and A15-mer; IDT) were labeled at the 3’ end with fluorescein-5-thiosemicarbazide according to Qiu et al.(93). In brief, 5 μM RNA in 0.25 M sodium acetate (pH 5.6) was oxidized with 50 μM sodium periodate at 25°C in the dark for 90 min, before 100 μM sodium sulfite was added. After 15 min at 25°C, 150 μM fluorescein-5-thiosemicarbazide was added and the labeling reaction was performed for 3 h at 37°C. The labeled RNA was precipitated for 3 h at -80°C using one tenth of the reaction volume of 8 M LiCl and 2.5 times the reaction volume of 100 % ethanol. The concentration and labeling efficiency of the washed (70 % ethanol) and resuspended (H2O) pellet were measured using a NanoDrop.
The fluorescence polarization assays were done in 20 mM NaP/NaOH pH 7.5, 50 mM NaCl and 2 mM DTT in a volume of 25 μl. 5 nM (AAA AAA AUG) or 25 nM (A15-mer) labeled RNA was incubated with different concentrations of protein for 30 min at 20°C. For each reaction a technical duplicate was measured in a black 384-well plate on a BioTek Synergy 4 plate reader using the corresponding filters and the automatic gain function. Each measurement was done in triplicate.
Protein melting temperature
Nano differential scanning fluorimetry (nanoDSF; NanoTemper) was used to determine the protein melting temperature. Proteins were loaded into standard capillaries and heated at 1°C/min. The excitation power varied from 10 to 30 % depending on the protein concentration. The provided software was used to analyze the data and to determine the melting temperature, at which 50 % of the protein was unfolded.
Mass Photometry
The mass photometry analysis was performed using a RefeynMP. The photometer was calibrated using filtered buffer (20 mM sodium phosphate, 100 mM NaCl). The proteins were diluted to a final concentration range between 5 and 60 nM. The measurement was conducted immediately after pipetting the protein on the sample carrier slide. To show the difference between immediate measurement and a 10-minute waiting time, the sample was split into two fractions, one of which was kept at room temperature for 10 minutes. Complex assembly was detected by the change of back reflected scattering triggered by protein binding, which was monitored by the mass photometer. Assembled complexes resulted in an increased contrast through the scattering.
RESULTS
The C-terminal domain of Drosophila Unr tumbles independently, but with spatial restriction
Recently, we found that interdomain interactions between non-canonical CSDs and canonical CSDs of Drosophila Unr influence RNA target specificity (22). Robust domain-domain interactions were found for CSD4-5, CSD5-6, and CSD7-8. Detailed information on RNA binding and its consequences for domain arrangements is lacking. As we found that the three C-terminal domains bind to A-rich sequences (22), we aimed here to perform a thorough investigation of their RNA recognition mode. First, we investigated whether there is an inter-domain interaction for the two C-terminal domains (CSD8-9) as found for CSD7-8 and recorded 15N relaxation data of a Unr CSD789 construct to characterize the dynamics of the protein on a residue level (Fig. 1a). These data confirm the joint tumbling of CSD7 and ncCSD8(22), but also indicate that CSD9 tumbles independently of CSD78, given the significant difference between the apparent rotational correlation times for CSD78 (tc = 18.3±1.2 ns) and CSD9 (14.5±1.8 ns) (Fig. 1b). The linker between ncCSD8 and CSD9 is only 4 residues long and, although the domains tumble independently, SAXS-driven structure modeling and MD simulation indicate spatial restrictions for CSD9 with respect to CSD78 (Fig. 1c and Fig. S1). Transient interactions between the two parts are confirmed by NMR data, which show chemical shift perturbations (CSPs) in a region within CSD9 (S970-S980) when compared to the longer construct of CSD789 (Fig. 1d).
Joint tumbling of the three C-terminal domains of Drosophila Unr upon RNA-binding
Next, we assessed RNA binding by CSD789, to examine whether this restricted flexibility between ncCSD8 and CSD9 has a similar importance on RNA binding and protein function as the fixed domain arrangement observed between the other domains. Previously strong binding of Unr CSD789 was shown for a poly(A)-15-mer RNA oligonucleotide(22), which is why poly(A) sequences of different length were used for this study.
Using NMR 1H,15N-HSQC titration experiments with poly(A) RNA sequences of different lengths (5-mer, 7-mer, 8-mer, 9-mer and 15-mer), a change in binding affinity between the longer RNAs and the shortest one (A5-mer) was observed (Fig. 2a-b and Fig. S2a-f). For some residues this change was clearly discernible from the different exchange regimes. For the shorter A5-mer the CSPs induced by RNA binding are in the fast exchange regime(57). In contrast, binding to RNA oligos with 7 adenines and longer results in an intermediate-to-slow exchange regime, indicative of a change in dissociation rates and therefore stronger binding (Fig. 2a and Fig. S2a-f). This observation is supported not only by the change of the exchange regime, but also by the measured dissociation constants. The affinity for the A5-mer is significantly lower than for the longer RNA oligonucleotides (Fig. 2b and Fig. S3a). To overcome the uncertainty in the affinity determination between CSD789 and the longer RNAs caused by the change of the exchange regime, isothermal titration calorimetry (ITC) was used for cross-validation. However, measurable thermal changes for the A5-mer could not be determined using ITC, potentially due to weak binding. Overall, the affinities measured using ITC are in line with the NMR observations (NMR: A5-mer: Kd = 127±16.6 μM vs. ITC: A7-mer: Kd = 17.7±5.2 μM, A8-mer: Kd = 4.4±2.7 μM, A9-mer: Kd = 4.3±1.3 μM and A15-mer: Kd = 4.8±0.8 μM). The non-gradual change in binding affinity and the saturation towards longer oligonucleotides support the previously observed synergistic binding of the domains within CSD789(22). A sufficient length of the RNA allows simultaneous binding of all domains.
Concomitantly, 15N relaxation data measured for the final titration point of the A15-mer RNA titration shows that the rotational correlation time of CSD9 increases to the overall tumbling time of CSD78 (CSD9: tc = 14.5±1.8 ns unbound vs. 18.4±1.3 ns bound), indicating that the three domains tumble together in solution as one entity (Fig. 2d and Fig. S3c). In contrast, binding to the shorter A5-mer retains the independent tumbling of CSD9 towards CSD78, as observed for the unbound protein state (CSD9: tc = 14.5±1.8 ns unbound vs. 13.5±2.4 ns bound) (Fig. 2c and Fig. S3b). The lower tumbling time of the domains CSD78 in the A5-mer bound state might reflect a reduced binding between these domains in an RNA bound form.
CSD789 binding to poly(A) RNA sequence involves both typical and atypical CSD-RNA contacts
Due to the observed rigidification of Unr CSD789 upon binding to the longer RNA, we could successfully co-crystallize the protein with A15-mer RNA. The crystal structure was solved using molecular replacement. Of the 15 adenosines only six were visible in the structure, possibly because the remaining unbound nucleotides are flexible and show no electron density, or because the unbound nucleotides were degraded within the sample drop. Furthermore, one RNA chain is bound by two protein molecules from the same unit cell. CSD9 binds to RNA that is also bound by CSD7 and ncCSD8 of its symmetry mate, resulting in one unit cell containing two protein and two RNA molecules (Fig. 3a). The signal-to-noise ratio of peaks in the NMR titration and the rotational correlation times derived from 15N relaxation experiments show that, despite high concentrations, the complex has a 1:1 (protein:RNA) stoichiometry in solution. The rotational correlation time of around 18 ns fits what can be expected for the protein:RNA complex (around 15.7 ns at 295 K determined using ROTDIF(94)). The tumbling time for a 2:2 complex would be expected to be larger (Fig. 2d and Fig. S3c).
This suggests that the peculiar assembly seen in the crystal structure is a result of crystal packing. A structure that connects the termini of the single RNA strands by one additional nucleotide (A7) was modeled to give a better impression of how the complex may look in solution (Fig. 3b and see methods). The validity of the model in solution is further supported by a SAXS-driven molecular dynamics simulation (Fig. S2g) and by the NMR titrations described above. The modeled nucleotide establishes contacts to residues in ncCSD8 whose NMR resonances shift significantly upon RNA titration (Fig. S2a-e). Interestingly, the protein structures of human and Drosophila Unr CSD789 predicted by AlphaFold2(95) adopt the same domain orientations as the RNA-bound structure described here (Fig. S2h). Of note, the domain arrangement of CSD789 and its interdomain contacts are different from CSD456(22).
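The stoichiometry argument based on tumbling times can be illustrated with a rough Stokes-Einstein-Debye estimate, in which the rotational correlation time of a rigid sphere scales with its volume and hence with its mass. This is only a sketch: the spherical shape, the neglect of a hydration shell, the solvent viscosity, the partial specific volume and the placeholder mass are all assumptions, and it is not the ROTDIF calculation used in the text.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
AVOGADRO_PER_MOL = 6.02214076e23


def tau_c_sphere_ns(mass_kda, temperature_k=295.0, viscosity_pa_s=0.96e-3,
                    specific_volume_cm3_per_g=0.73, hydration_shell_m=0.0):
    """Stokes-Einstein-Debye correlation time (ns) of a rigid sphere."""
    # Molecular volume in m^3 from mass and an assumed partial specific volume.
    volume_m3 = specific_volume_cm3_per_g * 1e-6 * mass_kda * 1e3 / AVOGADRO_PER_MOL
    radius_m = (3.0 * volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0) + hydration_shell_m
    tau_s = (4.0 * math.pi * viscosity_pa_s * radius_m ** 3
             / (3.0 * BOLTZMANN_J_PER_K * temperature_k))
    return tau_s * 1e9


PLACEHOLDER_MASS_KDA = 30.0  # hypothetical mass; the ratio below does not depend on it

# Doubling the mass (1:1 -> 2:2 complex) doubles tau_c for a bare sphere.
ratio = tau_c_sphere_ns(2.0 * PLACEHOLDER_MASS_KDA) / tau_c_sphere_ns(PLACEHOLDER_MASS_KDA)

# Scale the 1:1 value quoted in the text (15.7 ns) to a notional 2:2 assembly.
scaled_2to2_ns = 15.7 * ratio
print(f"mass-doubling ratio: {ratio:.2f}, scaled 2:2 estimate: {scaled_2to2_ns:.1f} ns")
```

Within this crude approximation a 2:2 assembly would be expected to tumble roughly twice as slowly as the 1:1 complex, well above the observed value of around 18 ns.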
As described previously for many other CSD structures(36, 96–102), the known RNA binding motifs (FGF and FFH) are involved in RNA binding of CSD7 and CSD9 (Fig. 3c, d). Surface-exposed aromatic side chains of F777, F788 and H790 of CSD7 and F934, F948 and H950 of CSD9 form π-π-stacking interactions with the bases A3-A5 and A8-A9 of the RNA, resulting in a tightly packed interaction surface between the canonical CSDs and RNA. However, besides these typical RNA binding residues, previously unobserved atypical interactions are formed between N977 and A6 or K979 and A9 in CSD9. Strikingly, these contacts form sequence-specific interactions towards adenine nucleotides (Fig. 3e, h). Positioning of electronegative atoms to contact adenine-C2 would be sterically hindered by the exocyclic amino-N2 in guanine. Additionally, residues located in ncCSD8 that were already identified as RNA binding residues in NMR titration experiments (Fig. S2c-g)(22) form direct contacts with nucleotide A5. The base points into the interaction surface of CSD7 and ncCSD8 and directly interacts with R856 and P860 (Fig. 3f). The arginine is additionally involved in an interaction with E786 of CSD7, and its electron density suggests two different conformations within the structure (Fig. 3f, h). Adenine-specific contacts are formed through hydrogen bonding of amino acids to the adenine N1 or N6(103). Since E786 forms a hydrogen bond between its free oxygen and N6 of A5, the base is sandwiched sequence-specifically between the two domains. To confirm the positional adenine specificity of CSD789, we performed NMR titrations of CSD789 with poly(U), poly(C), and poly(A) 8-mer RNAs (Fig. S4). A8-mer RNA addition induces large CSPs of CSD789 peaks already at substoichiometric concentrations, whereas poly(C) and poly(U) 8-mer RNAs induce weaker CSPs only at higher concentrations.
Although we cannot exclude that bases other than adenine would be preferred at certain positions, we can confirm an overall adenine preference in solution.
The crystal structure further reveals domain-domain interactions, which are formed between glutamines located in ncCSD8 (Q898) and in CSD9 (Q975) that point towards each other and form hydrogen bonds as well as van der Waals contacts (Fig. 3g). The fact that the joint tumbling is strengthened in the presence of RNA may be due to conformational changes upon RNA binding, which bring the two residues closer together. Further, this might also explain the restricted flexibility observed in the absence of RNA (Fig. 1c). Potentially there is a weak interaction between the two residues that is not strong enough to fix the two domains completely in the absence of RNA but keeps them close in space through constant association and dissociation.
Mutational analysis validates the solution model and confirms the importance of atypical RNA contacts and domain-domain interactions for RNA binding
To examine how much each of these interactions contributes to RNA binding and to validate the X-ray structure-based solution model, several mutants were tested for their binding affinity to an A15-mer RNA. Mutations were generated to disturb the atypical RNA binding of CSD9 (N977A and K979A), of ncCSD8 and its interface interactions with CSD7 (E786A, R856A and P860A) and the interactions between ncCSD8 and CSD9 (Q898A, Q975A and R976A). Electron density for the side chain of R976 could not be detected. However, to exclude any kind of interaction between R976 and residues of ncCSD8, we included it in the mutational analysis (Fig. 4a). The mutated residues were in loop regions without secondary structure elements to avoid misfolding of the individual domains. High yield and solubility during the purification process, similar to wild type, peak dispersion in 1H,15N-HSQC spectra and the largely unchanged melting temperature demonstrate that the CSD fold of the mutants is not disrupted (Fig. 4b-c).
15N relaxation measurements of the Q868A/Q975A/R976A mutant reveal lower rotational correlation times compared to the wild-type protein (CSD78: τc = 14.0±2.1 ns mutant vs. 18.3±2.9 ns wild type; CSD9: τc = 10.5±1.0 ns mutant vs. 14.5±1.8 ns wild type) (Fig. 4d and Fig. S5a). This indicates that the previously observed weak interaction between CSD78 and CSD9 is further weakened by these mutations. CSD78 and CSD9 no longer interact transiently via the two glutamines, which increases their independent tumbling.
The effect of mutations on RNA binding was first assessed by NMR titration experiments. As saturation could not be reached, dissociation constants could not be reliably determined. This indicates that all mutants have a lower affinity for RNA than the wild type (Fig. 4e and Fig. S5b-e). The Q868A/Q975A/R976A mutations had the least impact on affinity, since chemical shift perturbations for e.g. G935 stayed in the intermediate exchange regime as observed for the wild type. For the E786A/R856A/P860A and N977A/K979A mutants the exchange regime changed from intermediate to fast exchange, indicative of weaker affinities. Concomitantly, fluorescence polarization (FP) data showed weaker binding for all mutants compared to the wild type.
The binding curve of the wild-type protein results in a Kd of 21.1±3.6 μM for the A15-mer RNA (Fig. 4f). The measured affinity was thus lower (higher Kd) than that determined by ITC (4.8±0.8 μM) or NMR (8.0±1.5 μM). The reason could be that the 3’ Cy5 label on the RNA interferes with binding to CSD789. However, the Kd from our FP assays was similar to a previously reported affinity, which was also obtained with a Cy5 label in an EMSA(22). When comparing the binding affinity between wild type and mutants within the FP assay, all tested mutants show weaker binding to the two different RNAs. The mutant with a complete substitution of the atypical RNA binding residues in ncCSD8 and its interface interactions with CSD7 (E786A/R856A/P860A) showed a more than fourfold weaker binding affinity, with a Kd of around 94.7±32.8 μM, compared to the wild type (Fig. 4f). Single mutations already decreased the binding affinity for A15-mer RNA (E786A: 50.1±6.4 μM; R856A: 72.4±11 μM; P860A: 67.2±7.9 μM) (Fig. S5f), meaning that each mutated amino acid contributes to RNA binding of CSD789.
The binding affinity of the other two interface mutants decreased even more strongly. Here the affinity drops more than tenfold. Exact values cannot be calculated, since binding does not reach saturation within the protein concentration range used (N977A/K979A: >1000 μM and Q868A/Q975A/R976A: >900 μM for both RNAs; Fig. 4f). As observed above, the single mutants also bound the RNA significantly more weakly than the wild-type protein (N977A: 71.2 ± 12.5 μM, K979A: 160.0 ± 57.6 μM, Q868A: 61.0 ± 12.1 μM, Q975A: 38.2 ± 8.7 μM and R976A: 161.0 ± 51.2 μM; Fig. S5f), indicating that all tested single mutations influence RNA binding.
Thus, all described structural peculiarities of CSD789, namely the atypical RNA binding within CSD9 and ncCSD8 and the interface formation between CSD7 and ncCSD8 as well as between ncCSD8 and CSD9, contribute to RNA binding of CSD789. The previously observed synergistic binding of CSD7 and CSD9 within CSD789 (Fig. 2a-d and Fig. S2a-f) can thus be explained by the solution structure presented here and is further validated by mutational analysis(22). Several atypical contacts contribute to RNA binding directly or indirectly via formation of additional domain-domain contacts. This validates our solution model of the CSD789-RNA structure and exemplifies the complexity of RNA binding by a multidomain RBP.
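The relative effects of the mutations can be put on a common scale by converting the FP-derived dissociation constants quoted above into fold changes and approximate binding free energy differences, using ΔΔG = RT ln(Kd,mutant / Kd,wild type). The sketch below is illustrative only: the temperature is an assumption (it is not stated in this section), the reported uncertainties are ignored, and the two mutants for which only lower bounds are given are left out.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal mol^-1 K^-1
TEMPERATURE_K = 298.15           # assumed assay temperature, not from the paper

WILD_TYPE_KD_UM = 21.1           # FP-derived Kd for the A15-mer RNA
MUTANT_KD_UM = {                 # FP-derived Kd values quoted in the text
    "E786A": 50.1, "R856A": 72.4, "P860A": 67.2,
    "E786A/R856A/P860A": 94.7,
    "N977A": 71.2, "K979A": 160.0,
    "Q868A": 61.0, "Q975A": 38.2, "R976A": 161.0,
}


def fold_weaker(kd_mutant_uM, kd_wild_type_uM=WILD_TYPE_KD_UM):
    """How many times weaker the mutant binds (ratio of Kd values)."""
    return kd_mutant_uM / kd_wild_type_uM


def ddg_kcal_per_mol(kd_mutant_uM, kd_wild_type_uM=WILD_TYPE_KD_UM):
    """Binding free energy penalty of the mutation, RT * ln(Kd ratio)."""
    return GAS_CONSTANT_KCAL * TEMPERATURE_K * math.log(kd_mutant_uM / kd_wild_type_uM)


for name, kd in sorted(MUTANT_KD_UM.items(), key=lambda item: item[1]):
    print(f"{name:>18}: {fold_weaker(kd):4.1f}-fold weaker, "
          f"ddG = {ddg_kcal_per_mol(kd):.2f} kcal/mol")
```

On this scale every listed mutation carries a positive free energy penalty, consistent with the statement that each mutated residue contributes to RNA binding.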
Interaction of Unr with the poly(A)-binding protein pAbp
Our observations of sequence-specific binding of Unr CSD789 to adenines and a previously reported RNA-independent interaction of Drosophila Unr with the Drosophila poly(A)-binding protein pAbp(22, 34, 37, 104) prompted us to investigate the interaction between the two proteins in more detail. First, we aimed to reproduce the interaction with recombinantly purified full-length proteins. Due to the low solubility of full-length pAbp, we chose to use mass photometry (Fig. S6a). Measurements of the individual proteins in solution show populations at the expected molecular weights (Unr: ∼90 kDa vs. observed 95 kDa; pAbp: ∼70 kDa vs. observed 60 kDa). The pAbp sample shows additional populations (18% and 37%) with molecular weights of 199 and 440 kDa, likely due to aggregation, which would reflect the low solubility of the full-length protein. However, the complex sample shows an additional peak at a molecular weight of 148 kDa, corresponding to Unr-pAbp complex formation (calculated weight ∼160 kDa).
To map the interaction sites of both proteins in more detail, we performed extensive NMR chemical shift perturbation mapping by titrating a 1.5-fold molar excess of Unr constructs (CSD12, CSD456 and CSD789) to single, isolated 15N-labelled pAbp domains (RRM1, RRM2, RRM3, RRM4 and the PABC domain) and recording 1H,15N-HSQC experiments (Fig. 5a). Due to solubility problems, we could not perform the backbone assignments for RRM1.
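Chemical shift perturbation mapping of this kind is commonly quantified by combining the 1H and 15N shift changes of each residue into a single weighted value and flagging residues above a statistical threshold. The following minimal sketch shows one such scheme; the 15N weighting factor, the threshold rule and the residue labels and shift values are all hypothetical and are not taken from this study.

```python
import math
import statistics


def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=0.15):
    """Weighted combined 1H/15N chemical shift perturbation in ppm."""
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)


def significant_residues(shifts, n_scale=0.15, n_sigma=1.0):
    """Residues whose CSP exceeds mean + n_sigma * population std. dev."""
    csps = {res: combined_csp(dh, dn, n_scale) for res, (dh, dn) in shifts.items()}
    threshold = statistics.mean(csps.values()) + n_sigma * statistics.pstdev(csps.values())
    return [res for res, value in csps.items() if value > threshold]


# Hypothetical (delta 1H, delta 15N) shift changes in ppm for four residues.
example_shifts = {
    "res_a": (0.010, 0.05),
    "res_b": (0.020, 0.10),
    "res_c": (0.120, 0.60),
    "res_d": (0.005, 0.02),
}

flagged = significant_residues(example_shifts)
print("residues above threshold:", flagged)
```

In practice the threshold is often applied together with the peak intensity ratios, since binding in the intermediate-to-slow exchange regime can broaden signals without producing large shifts.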
A clear interaction could be identified between CSD789 and RRM2 and RRM3 (Fig. 5b and Fig. S6b). The CSPs and intensity loss of RRM2 signals are less pronounced, indicating that this interaction is potentially weak and unspecific. In contrast, the CSPs that cluster in two patches within the protein sequence of RRM3 are stronger, and an overall intensity loss of around 50% is visible upon saturation with CSD789. The intensity of peaks that additionally show significant CSPs decreases by more than 50%, indicative of an interaction between both proteins, since the increased size of the observed molecule results in slower molecular tumbling of the complex.
Similarly, a reverse titration of both proteins (15N-labeled CSD789 and unlabeled RRM3) confirms this interaction (Fig. 5c). However, a clear CSP pattern that would enable mapping of the interaction site on CSD789 did not emerge. Therefore, we used shorter CSD constructs to narrow down the interaction site on CSD789. Since no significant shifts were traced for CSD7 or CSD9, this experiment identifies ncCSD8 as the interaction partner of pAbp-RRM3 (Fig. 5c and Fig. S6c).
To get an impression of the interaction sites between pAbp-RRM3 and CSD789, significant shifts of the CSD78 and ncCSD8 titrations were mapped onto the available structures. The corresponding residues are mostly located on one side of the structure, opposite the RNA interaction surface of CSD789, identifying a clear surface for the interaction with pAbp RRM3 (Fig. 5d).
The reverse RNA titration shows that residues of RRM3 similar to those involved in RNA binding (Fig. S6d) contribute to the interaction with CSD789. Significant CSPs of the interaction with CSD789 are located close to this RNA binding surface of the protein (Fig. 5e), suggesting that Unr competes with RNA for pAbp binding.
In addition to the ncCSD8-RRM3 interaction, significant chemical shift perturbations were observed between Unr CSD12 and pAbp RRM2 (Fig. 5f) and between CSD12 and the linker PABC region of pAbp (Fig. 5g), indicative of additional interaction sites between both proteins.
We could then solve a crystal structure of Unr CSD789 bound to pAbp-RRM3 (Fig. 6a), which is consistent with the NMR data. As suspected, pAbp-RRM3 binds to ncCSD8 of CSD789 with its RNA interaction surface. Several amino acids of both interaction partners build up a large interaction platform (CSD789: E844, T845, H847, I871, E874, I880; RRM3: N184, I186, S212 and F227). The NMR data confirm competitive binding between Unr CSD789 and the poly(A) RNA sequence. 1H,15N-HSQC data show large CSPs especially for the RNA-bound state of pAbp RRM3, whereas binding to Unr CSD789 is mostly accompanied by intensity loss, most likely due to peak broadening caused by the increase in molecular weight. The spectrum in the presence of both ligands shows a combination of both: signal loss on the one hand, but also stronger CSPs as associated with RNA binding (Fig. 6b). These data suggest that the RNA and Unr ncCSD8 compete for the same interaction surface on pAbp-RRM3. Although we were not able to obtain a crystal structure of the ternary complex, a superposition of the two presented crystal structures shows that CSD789 would retain its capability to bind to RNA (Fig. 6c). To our knowledge, this is the first high-resolution structure of these two major translation regulatory proteins, Unr and pAbp, in complex.
MD simulations suggest a compact CSD1-9 conformation
RRM2 and RRM3 of pAbp are separated only by a short linker (17 residues), suggesting that Unr CSD12, which interacts with pAbp RRM2, and Unr ncCSD8, which interacts with pAbp RRM3, are in close spatial proximity in the complex despite being spaced by 509 residues. To assess the degree of extension or compactness of Unr that would allow this interaction, SAXS data of longer Unr constructs were obtained. The SAXS data for CSD1-9 were affected by aggregation, so datasets of CSD1-6 and CSD4-9 were used to validate MD simulations. Indeed, free MD simulations of CSD1-6 and CSD4-9 already show good agreement with the SAXS data, suggesting that there is no need to further refine the simulations (Fig. S7).
Next, we modeled a CSD1–9 ensemble with correct bond geometries and without domain overlaps using CNS (1.2)(69, 70). The lack of flexible regions between several domains of Unr allowed us to generate model structures of almost full-length Unr, excluding the flexible N-terminal region, using the published high-resolution structures of CSD12 (PDB: 6y6m), CSD456 (PDB: 6y6e), CSD78 (PDB: 6y4h) and CSD9 (PDB: 6y96)(105) and a homology model of CSD3(76, 106). The known rigid domain distances and orientations were kept fixed during the modelling. The remaining linker regions were randomized and thus allowed to adopt different conformations (Fig. 7a). The resulting structural ensemble covers a large conformational space involving center-of-mass (COM) distances between CSD123 and CSD789 of up to 29 nm (Fig. 7a, red bars). To test whether these conformations are compatible with the experiment-supported MD simulations, we superimposed the domains 4–6 from the CSD1–6 and CSD4–9 MD-generated ensembles, thereby obtaining a plausible MD-based CSD1–9 ensemble (Fig. 7a). However, this MD-based ensemble was remarkably compact. Expanded conformations with COM distances > 15 nm are not supported by the MD simulations (Fig. 7a, black bars).
This suggests that some interdomain or domain-linker interactions restrict the overall flexibility of the full-length protein, which is in accordance with the observed interactions of Unr with pAbp. This observation is of special interest considering that Unr interacts directly with pAbp and Sxl in the female dosage compensation of flies. We could show in this study that Unr not only interacts with the F site region of msl2 mRNA and Sxl(29, 30), but may also directly interact with the poly(A) tail and pAbp.
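The center-of-mass (COM) distance used above to compare the extended and compact ensembles can be computed per conformation as sketched below. The coordinates, masses and distance list are toy values chosen for illustration; the function names are hypothetical and do not correspond to the analysis scripts used in this study.

```python
import math


def center_of_mass(atoms):
    """Mass-weighted mean position; atoms is a list of (mass, x, y, z) in nm."""
    total_mass = sum(mass for mass, _, _, _ in atoms)
    return tuple(
        sum(atom[0] * atom[axis] for atom in atoms) / total_mass
        for axis in (1, 2, 3)
    )


def com_distance_nm(group_a, group_b):
    """Distance between the centers of mass of two atom groups."""
    return math.dist(center_of_mass(group_a), center_of_mass(group_b))


def fraction_extended(distances_nm, cutoff_nm=15.0):
    """Share of conformations whose COM distance exceeds the cutoff."""
    return sum(d > cutoff_nm for d in distances_nm) / len(distances_nm)


# Toy stand-ins for the N-terminal and C-terminal domain groups.
n_terminal_group = [(12.0, 0.0, 0.0, 0.0), (12.0, 2.0, 0.0, 0.0)]
c_terminal_group = [(14.0, 1.0, 2.0, 4.0), (14.0, 1.0, 4.0, 4.0)]

distance = com_distance_nm(n_terminal_group, c_terminal_group)
print(f"COM distance: {distance:.2f} nm")

# Hypothetical per-conformation distances for an ensemble.
print(fraction_extended([5.0, 12.0, 18.0, 29.0]))
```

Applying such a measure to each member of the randomized-linker ensemble and of the MD-based ensemble would yield the kind of distance distributions that are compared in Fig. 7a.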
DISCUSSION
Although the number of RNA-bound multidomain structures has increased within the last years(20, 21), atomistic insights into how multidomain arrangements exceeding two RBDs facilitate target recognition and specificity of RBPs are still scarce. Nevertheless, this knowledge is necessary to understand how full-length RBPs select an RNA binding partner within the transcriptome. This study presents the first RNA-bound multidomain structure of CSDs, indicating that their RNA binding mechanism can be far more complex than the previously shown π-π-stacking of the typical aromatic binding residues between a single CSD and RNA (Fig. 3)(102, 107, 108). Moreover, mutational interaction studies show that several atypical RNA binding residues contribute significantly to the RNA binding affinity of the C-terminal part of Drosophila Unr (Fig. 4). This structure not only increases our knowledge about the complex binding mechanism of multidomain RBD-RNA engagements, but also suggests that the RNA binding of full-length Unr, with its total of five canonical and four ncCSDs, is likely to be of even higher complexity and plasticity.
In an earlier study, we speculated that CSPs in ncCSD8 upon RNA titration were the result of unspecific interactions or allostery, as a single ncCSD8 construct did not show significant CSPs and a positively charged surface patch was located close to the interaction surface of CSD7(22). However, the crystal structure of CSD789 bound to the poly(A) RNA sequence shows that specific RNA-ncCSD8 contacts form and contribute to the overall binding affinity. Binding motifs of Unr identified earlier using SELEX, iCLIP or SHAPE analysis were rich in purine bases and especially adenosines(28, 108, 109), a phenomenon that cannot be sufficiently explained by the classical π-π-stacking of the canonical RNA binding motifs. The bacterial cold-shock proteins CspA and CspB, which harbor only a single CSD, were described as rather promiscuous RNA binders with low sequence specificity(110, 111), and CSD1 of Unr has low sequence specificity in isolation(36). However, our structure shows a more complex interplay of multiple CSDs, where atypical interaction residues may be the main determinants of the specificity towards adenosines. Consequently, together with previous studies, different mechanisms are shown to contribute to the target selectivity of Unr. As shown in this study, multiple CSDs increase the interaction surface between the RNA and the protein, allowing atypical binding residues to contribute to the binding affinity and specificity of the protein. Additionally, a spatial restriction of the full-length protein that is introduced through interdomain contacts between canonical and non-canonical CSDs provides Unr with RNA tertiary structure specificity(22). A third mechanism to increase the target specificity of Unr is the interplay with additional RBPs, as shown in the case of the msl2 mRNA, where interaction with Sxl is necessary to increase the binding affinity and base specificity of Unr CSD1 towards a certain cytosine within its target RNA sequence(36).
Considering this, the biological relevance of the poly(A) binding specificity of CSD789 is strengthened by the identified interaction surface between ncCSD8 of Unr and RRM3 of pAbp. This characterized interaction validates previous pull-down interaction studies that identified RRM3 of pAbp as the main driver of the interaction with Unr(37). The structure shows that the interaction of both proteins blocks the RNA binding capability of RRM3 but keeps the binding site of CSD789 accessible for RNA. Therefore, it is possible that the C-terminal part of Unr binds close to the pAbp binding sites in the poly(A) tail or AU-rich elements. Since the main drivers of the RNA interaction of pAbp are the first two RRMs(112, 113), a displacement of RRM3 from the RNA by CSD789 would not impact direct RNA binding of pAbp. Instead, RRM3 could be an important stabilizer of the Unr-CSD789-RNA interaction, by keeping the C-terminal part of Unr sandwiched between the RNA and itself. This keeps both proteins close together, which could increase the probability of additional interactions between them and potentiate RNA binding. We hypothesize that this interaction is common during translation initiation of most mRNA targets and may promote recruitment of the 43S preinitiation complex. Indeed, it has been shown that a direct Unr-pAbp interaction stimulates translation(104).
In the case of msl2 translation repression, we propose that the presence of Sxl generates a conformational change within the Unr-mRNP complex by binding to CSD1, changing it from a general translation initiation promoter to a repressor complex. Here we present a more detailed model of how Unr could act as a molecular bridge that ‘glues’ together the different components of the female dosage compensation complex in flies (Fig. 7b and c). This model was generated by simple superposition of previously determined structures and structures presented in this study. A full-length Unr model from the SAXS-derived ensemble was chosen based on whether it allows the simultaneous binding of pAbp-RRM3 to ncCSD8 and of pAbp-RRM2 to ncCSD2. This is only possible if both ncCSDs are brought into proximity by an almost circularized Unr. The 3’UTR of msl2-mRNA would wrap around Unr and Hrp48 is positioned just downstream of the F-site, which interacts with the 43S preinitiation complex and prevents it from binding to the 5’UTR.
Both presented high-resolution structures, showing the interaction of CSD789 with poly(A) RNA and with RRM3 of pAbp, contribute to a detailed structural insight into a larger translation regulation complex, which we hope to extend in the future. Unr and pAbp are identified to orchestrate translation initiation of different target mRNAs, whereby additional binding partners determine the specific fate of the mRNA. Future studies will show whether there are more interaction surfaces between both proteins, whether these are functionally relevant and whether the interaction between Unr and pAbp is invariable or dependent on the overall composition of this specific RNP.
AVAILABILITY
The modified GROMACS software for SAXS-calculations and SAXS-driven MD simulations is available at https://gitlab.com/cbjh/gromacs-swaxs.
ACCESSION NUMBERS
Structure coordinates have been deposited in the Protein Data Bank (PDB) under the following accession codes: Unr CSD789 in complex with poly(A) RNA: 7zhh, Unr CSD789 in complex with pAbp RRM3: 7zhr. All NMR data have been deposited in the BMRB under the following accession codes (pAbp RRM2: 51392, pAbp RRM3: 51393, pAbp RRM4: 51394, pAbp PABC: 51395) and the SAXS data have been submitted to SASBDB (CSD789: SASDHH7, CSD1-6: SASDHM7, CSD4-9: SASDHL7).
SUPPLEMENTARY DATA
Supplementary Data is available online.
FUNDING
This work was supported by an EIPOD fellowship to P.K.A.J., cofunded by the EMBL and Marie Curie Actions Cofund grant MSCA-COFUND-FP. J.H. gratefully acknowledges support via an Emmy-Noether Fellowship and the Priority Program SPP1935 of the Deutsche Forschungsgemeinschaft (DFG, grant no. HE 7291/1, HE7291/5-1, EP37/3-1, 3-2). Finally, we thank the EMBL for funding. J.S.H. and J.-B.L. were supported by the DFG (grant no. HU 1971-3/1).
CONFLICT OF INTEREST
The authors have no conflicts of interest to declare.
ACKNOWLEDGEMENT
We thank the ESRF Grenoble (beamlines BM29 and ID-30A) and DESY Hamburg PETRA-3 (P12 beamline) local contacts for support. We gratefully acknowledge Kathryn Perez and Karine Lapouge at the EMBL Protein Expression and Purification Facility for assisting with the ITC measurements.
REFERENCES
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.
- 33.
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.
- 41.
- 42.
- 43.
- 44.
- 45.
- 46.
- 47.
- 48.
- 49.
- 50.
- 51.
- 52.
- 53.
- 54.
- 55.
- 56.
- 57.
- 58.
- 59.
- 60.
- 61.
- 62.
- 63.
- 64.
- 65.
- 66.
- 67.
- 68.
- 69.
- 70.
- 71.
- 72.
- 73.
- 74.
- 75.
- 76.
- 77.
- 78.
- 79.
- 80.
- 81.
- 82.
- 83.
- 84.
- 85.
- 86.
- 87.
- 88.
- 89.
- 90.
- 91.
- 92.
- 93.
- 94.
- 95.
- 96.
- 97.
- 98.
- 99.
- 100.
- 101.
- 102.
- 103.
- 104.
- 105.
- 106.
- 107.
- 108.
- 109.
- 110.
- 111.
- 112.
- 113.