Fip1 is a multivalent interaction scaffold for processing factors in human mRNA 3′ end biogenesis

3′ end formation of most eukaryotic mRNAs is dependent on the assembly of a ~1.5 MDa multiprotein complex that catalyzes the coupled reaction of pre-mRNA cleavage and polyadenylation. In mammals, the cleavage and polyadenylation specificity factor (CPSF) constitutes the core of the 3′ end processing machinery onto which the remaining factors, including cleavage stimulation factor (CstF) and poly(A) polymerase (PAP), assemble. These interactions are mediated by Fip1, a CPSF subunit characterized by a high degree of intrinsic disorder. Here, we report two crystal structures revealing the interactions of human Fip1 (hFip1) with CPSF30 and CstF77. We demonstrate that CPSF contains two copies of hFip1, each binding to the zinc finger (ZF) domains 4 and 5 of CPSF30. Using polyadenylation assays, we show that the two hFip1 copies are functionally redundant in recruiting one copy of PAP, thereby increasing the processivity of RNA polyadenylation. We further show that the interaction between hFip1 and CstF77 is mediated via a short motif in the N-terminal ‘acidic’ region of hFip1. In turn, CstF77 competitively inhibits CPSF-dependent PAP recruitment and 3′ polyadenylation. Taken together, these results provide a structural basis for the multivalent scaffolding and regulatory functions of hFip1 in 3′ end processing.


INTRODUCTION
3' end polyadenylation is a fundamental process in eukaryotic messenger RNA (mRNA) biogenesis, essential for the maturation of non-histone precursor mRNAs (pre-mRNAs) prior to their export into the cytoplasm. Poly(A) tails possess key functions in mRNA metabolism, governing mRNA export, translational efficiency, and stability (Nicholson and [...] between mammals and yeast, underlining the fundamental nature of this process (Xiang et al., 2014). The cleavage site, typically within a CA dinucleotide, is defined by the polyadenylation signal (PAS), a conserved hexanucleotide motif (predominantly AAUAAA) located approximately 10-30 nucleotides upstream (Proudfoot and Brownlee, 1976; Proudfoot, 2011).

The PAS is specifically recognized by CPSF (Chan et [...] Yth1 (Barabino et al., 1997) and not required for mPSF complex assembly (Clerici et al., 2017). The ZF1 domain is necessary and sufficient for binding to the CPSF160-WDR33 heterodimer, while ZF2 and ZF3 together with WDR33 mediate recognition of the AAUAAA PAS hexamer motif (Clerici et al., 2018; Sun et al., 2018). ZF4 and ZF5 domains interact with hFip1 ([...] Hamilton and Tong, 2020). In previously determined cryo-EM structures of the yeast CPF and human mPSF complexes, ZF4 and ZF5 remained unresolved, indicating conformational flexibility with respect to the rigid mPSF core. Recently, a crystal structure of human CPSF30 ZF4-5 domains in complex with hFip1 has been determined (Hamilton and Tong, 2020) and complementary NMR studies of the yeast Fip1 homolog (Kumar et al., 2021) have shed light on the molecular details of the CPSF30-Fip1 interaction and revealed considerable structural dynamics of Fip1 in the context of the 3' processing machinery.

Mammalian CstF is a dimer of trimers comprising CstF77, CstF64 and CstF50 subunits (Takagaki et al., 1990; Yang et al., 2018).
It is recruited to the pre-mRNA by U- and G/U-rich sequences located downstream of the cleavage site (Takagaki and Manley, 1997) that are recognized by CstF64 (Takagaki et al., 1992; MacDonald et al., 1994). Through stabilization of CPSF on the pre-mRNA, CstF plays an important role in PAS recognition and is essential for pre-mRNA cleavage (Takagaki et al., 1990; Boreikaite et al., 2022; Schmidt et al., 2022). Dimerization of CstF is mediated by the CstF77 HAT (half-a-tetratricopeptide repeat) domain homodimer (Bai et al., 2007), and further stabilized by CstF50 (Yang et al., 2018). The CstF77 homodimer has an arch-like shape and interacts asymmetrically with CPSF, contacting the CPSF160-WDR33 mPSF scaffold via only one side of the arch (Zhang et al., 2019).

Fip1 interacts with PAP and tethers it to CPSF bound near the nascent 3' end of the cleaved pre-mRNA, which is required for its processive polyadenylation (Preker et [...]

Here, we report structural and biochemical analysis of the interactions of hFip1 with CPSF30, PAP and CstF77 within the human 3' polyadenylation machinery. While confirming previous structural data (Hamilton and Tong, 2020), we notably show that mPSF contains two hFip1 copies, yet recruits only one PAP molecule at a time. The presence of two PAP binding sites in mPSF contributes to the processivity of 3' polyadenylation. Furthermore, we show that hFip1 interacts with CstF77 through a conserved helix in its N-terminal "acidic" region and reveal that CstF77 competes with PAP for hFip1 binding, which attenuates polyadenylation efficiency. These results deepen our understanding of hFip1 as a key interaction partner for 3' end processing factors, facilitating or regulating their spatiotemporal assembly on the pre-mRNA, and establish a framework for further mechanistic studies of hFip1 interactions and CstF-mediated regulation of mRNA 3' end biogenesis.
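
The spatial relationship between the PAS and the cleavage site described in the Introduction (an AAUAAA hexamer roughly 10-30 nucleotides upstream of a CA dinucleotide) can be illustrated with a minimal Python sketch. This is not part of the study's methods. The function name, the default window, and the way the distance is counted (nucleotides between the end of the hexamer and the C of the CA dinucleotide) are assumptions made for illustration only, and real cleavage-site selection depends on additional sequence elements and protein factors.

```python
# Illustrative sketch only: scan an RNA for the canonical PAS hexamer and
# report CA dinucleotides lying 10-30 nt downstream of it.
# The distance convention and function name are assumptions, not the
# authors' method.

PAS_HEXAMER = "AAUAAA"


def find_candidate_cleavage_sites(rna, pas=PAS_HEXAMER, min_gap=10, max_gap=30):
    """Return (pas_start, ca_start) index pairs (0-based).

    A pair is reported when a CA dinucleotide begins between min_gap and
    max_gap nucleotides after the last nucleotide of the PAS hexamer.
    """
    # Accept DNA-style input by normalising case and converting T to U.
    rna = rna.upper().replace("T", "U")
    hits = []
    pas_start = rna.find(pas)
    while pas_start != -1:
        pas_end = pas_start + len(pas)  # index just past the hexamer
        for gap in range(min_gap, max_gap + 1):
            ca_start = pas_end + gap
            # Slicing past the end of the string simply yields a short
            # string, so no explicit bounds check is needed.
            if rna[ca_start:ca_start + 2] == "CA":
                hits.append((pas_start, ca_start))
        # Continue searching one position further to allow overlapping hits.
        pas_start = rna.find(pas, pas_start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical toy substrate: PAS, a 15-nt spacer, then a CA dinucleotide.
    toy_rna = "GG" + "AAUAAA" + "G" * 15 + "CA" + "GGG"
    print(find_candidate_cleavage_sites(toy_rna))
```

The sketch only encodes the positional rule stated in the text; it says nothing about which candidate site would actually be used in a cell.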
To validate our structural observations, we initially mutated ZF4 and ZF5 interaction surface residues in CPSF30 ZF4-ZF5 and tested the interactions of the mutant proteins with hFip1 CD in a pull-down assay (Figure 1-figure supplement 2A). Individual substitutions of Tyr127 CPSF30, Tyr151 CPSF30 or Phe155 CPSF30 with glutamate resulted in substantial reduction of hFip1 CD binding, while simultaneous mutation of both ZF4 and ZF5 residues resulted in loss of hFip1 binding, in agreement with our structural observations. In hFip1 CD, glutamate substitution of aromatic residues in the hydrophobic interaction patch either substantially reduced (Trp150 hFip1, Trp170 hFip1) or completely disrupted (Phe161 hFip1) the hFip1 CD-CPSF30 ZF4-ZF5 interaction (Figure 1-figure supplement 2B). We subsequently performed size exclusion chromatography coupled to multi-angle static light scattering (SEC-MALS) to analyze the stoichiometry of hFip1 CD-CPSF30 ZF4-ZF5 complexes. hFip1 CD and wild-type CPSF30 ZF4-ZF5 formed a 2:1 complex. In contrast, CPSF30 ZF4-ZF5 proteins containing Y127E CPSF30 or F155E CPSF30 mutations formed a 1:1 complex with hFip1 CD, while simultaneous mutation of both residues resulted in complete loss of binding (Figure 1F). Together, these results confirm that human CPSF30 has two independently functional hFip1 binding sites, one on ZF4 and the other on ZF5, each recruiting one copy of hFip1.

Functional redundancy of hFip1-CPSF30 interactions in human CPSF
To probe the functional significance of the dual CPSF30-hFip1 interaction interfaces in the context of human CPSF, we co-expressed wild-type or mutant CPSF30 together with FLAG epitope-tagged CPSF160, WDR33 and hFip1 in baculovirus-infected insect cells, and performed tandem affinity purifications during which purified recombinant PAP was added in trans after the first purification step. hFip1 co-purified with mPSF containing wild-type CPSF30 and co-precipitated PAP (Figure 1G). Expression of CPSF30 ZF4 or ZF5 mutants (Y127E or Y151E, respectively) resulted in reduced recovery of both hFip1 and PAP (Figure 1G), consistent with the reduced stoichiometry of the CPSF30-hFip1 interaction observed in vitro (Figure 1F). In turn, expression of a CPSF30 construct containing mutations in both the ZF4 and ZF5 binding sites resulted in the loss of hFip1 from mPSF, which was thus unable to interact with PAP (Figure 1G). Together, these results indicate that both hFip1 binding sites in CPSF30 contribute to the integrity of mPSF in vivo and both are capable of recruiting hFip1 and consequently PAP. Notably, the expression levels of mPSF mutant complexes incapable of binding hFip1 (Y127E/Y151E CPSF30) were substantially reduced, consistent with the role of hFip1 in stabilizing the CPSF30 zinc finger fold (Kumar et al., 2021).

We next assessed the requirement of the hFip1-CPSF30 interactions for RNA 3' polyadenylation using an in vitro polyadenylation assay. Incubation of a model RNA substrate with purified wild-type mPSF (Figure 2-figure supplement 1A) and PAP resulted in processive polyadenylation, which was dependent on the presence of ATP in the solution and an AAUAAA hexameric PAS in the RNA (Figure 2A).
The efficiency and processivity of 3' polyadenylation were reduced upon incubation of the substrate with mPSF complexes containing CPSF30 ZF4 or ZF5 mutants capable of binding only one copy of hFip1 (Figure 2A). No RNA polyadenylation was observed upon incubation with mPSF containing the CPSF30 ZF4/ZF5 double mutant (Figure 2A), consistent with the loss of hFip1 (Figure 1G). The loss of polyadenylation could not be rescued by the addition of recombinant hFip1 in trans. Collectively, these observations indicate that both hFip1 binding sites in CPSF30 contribute to the processivity of RNA 3' polyadenylation, suggesting that the presence of two hFip1 copies, and thus two PAP recruitment sites, in mPSF is required for high efficiency of 3' polyadenylation. However, neither hFip1 binding site is strictly necessary for RNA 3' polyadenylation, suggesting their functional redundancy.

PAP recruitment occurs via hFip1 N-terminal region
In S. cerevisiae, a poorly conserved peptide motif in the N-terminal region of Fip1 directly interacts with the poly(A) polymerase Pap1 (Meinke et al., 2008). Similarly, the N-terminal region of human hFip1, upstream of the CD, is required for PAP interaction ([...]) but the precise PAP interaction site in human hFip1 has not been identified. To this end, we tested the interaction of green fluorescent protein (GFP)-tagged PAP with purified mPSF complexes containing truncated hFip1 fragments in an in vitro pull-down experiment. PAP was detectably, albeit weakly, co-precipitated by mPSF containing a hFip1 fragment spanning both the N-terminal and CD regions (residues 1-195) as well as by mPSF containing an N-terminally truncated hFip1 (residues 36-195) (Figure 2B). However, further truncation of hFip1 resulted in the loss of PAP binding, indicating that a region spanning residues 36-80 in human hFip1 is required for PAP interaction (Figure 2B). An additional pull-down experiment using recombinant PAP and glutathione-S-transferase (GST)-fused hFip1 fragments revealed that although the hFip1 region comprising residues 36-80 was required for PAP interaction, it was not sufficient (Figure 2-figure supplement 1B). This suggests that additional parts of hFip1 contribute to PAP binding.

We subsequently tested the activity of mPSF complexes containing N- or C-terminally truncated hFip1 in the polyadenylation assay. In agreement with the interaction data, mPSF complexes containing hFip1 fragments spanning residues 1-195 or 36-190 were able to support efficient RNA 3' polyadenylation (Figure 2C), whereas mPSF complexes containing hFip1 fragments comprising residues 80-195 or 130-195 were not. Together, these results indicate that hFip1 residues 36-80 are required for the recruitment of PAP to effect mPSF-dependent 3' polyadenylation.
Interestingly, we also observed that polyadenylation levels were reduced with mPSF containing full-length hFip1 (residues 1-378, isoform 4), as compared to mPSF containing C-terminally truncated hFip1 (residues 1-195), suggesting that the C-terminal region of hFip1, which is proline-rich and predicted to be intrinsically disordered, has an inhibitory effect on mPSF-dependent 3' polyadenylation.

CPSF recruits only one copy of poly(A) polymerase
Prior studies have indicated that a complex comprising CPSF30 ZF4 and ZF5 domains and two hFip1 molecules is capable of simultaneous interaction with two PAP molecules in vitro (Hamilton and Tong, 2020). To determine whether this also occurs in the context of mPSF, we analyzed the mPSF-PAP interaction by SEC-MALS. Despite only weakly interacting in pull-down analysis, at high PAP concentrations (40 µM), mPSF and PAP formed a stable complex that could be purified by SEC. Analysis of this complex using SEC-MALS revealed an apparent molecular mass of 347 kDa, consistent with the molecular mass expected for a complex containing two hFip1 molecules and one PAP (337 kDa) (Figure 2D). Addition of excess PAP to the pre-purified mPSF-PAP sample did not lead to stable formation of a 1:2 complex. These results indicate that mPSF is capable of stable association with only one PAP molecule at a time, despite the presence of two copies of hFip1 in the complex.

The N-terminal region of hFip1 interacts with CstF77
In analogy with the yeast polyadenylation machinery, human Fip1 has previously been shown to interact with CstF77 (Preker et al., 1995; Kaufmann et al., 2004). To validate these observations and identify the interaction determinants in hFip1, we performed a pull-down experiment with GST-tagged hFip1 fragments and a maltose binding protein (MBP)-tagged fragment of CstF77 comprising the HAT domain (residues 21-549). The very N-terminal region of hFip1 spanning residues 1-35 was necessary and sufficient for the interaction with the CstF77 HAT domain (Figure 3A). Notably, this region is dispensable for the interaction of hFip1 with PAP and for RNA 3' polyadenylation (Figure 2B,C).
To shed light on the hFip1-CstF77 interaction, we subsequently reconstituted a complex comprising the hFip1 1-35 fragment with a truncated construct of the CstF77 HAT domain (residues 241-549) and determined its X-ray crystallographic structure at a resolution of 2.7 Å. The structure reveals that hFip1 binds to a conserved positively-charged patch located on the convex surface of the CstF77 HAT domain arch (Figure 3B, Figure 3-figure supplement 1A,B). Within the hFip1 1-35 fragment, only the evolutionarily conserved residues 20-27 were ordered, adopting an alpha-helical conformation (Figure 3C,D). Interaction of hFip1 1-35 with CstF77 involves salt bridge contacts of Glu22 hFip1 and Glu23 hFip1 with Arg402 CstF77, and hydrophobic contacts involving Leu26 hFip1 and Tyr27 hFip1 with Phe398 CstF77, Val428 CstF77, Ile432 CstF77 and Leu435 CstF77. Additionally, the Tyr27 hFip1 side chain interacts with Arg395 CstF77 via π-π stacking. Corroborating these structural observations, simultaneous alanine substitutions of Glu22 hFip1 and Glu23 hFip1, or Trp25 hFip1, Leu26 hFip1 and Tyr27 hFip1, respectively, disrupted the hFip1 1-35-CstF77 21-549 interaction in a pull-down experiment, whereas alanine substitution of Trp25 hFip1 alone did not have an effect (Figure 3E). In turn, mutation of the positively charged interaction interface in CstF77 (Arg395, Arg402, and Lys431 mutated to alanines) abolished the interaction with hFip1 1-35 (Figure 3E).

A previously determined cryo-EM reconstruction of the human mPSF-CstF77 complex revealed that the interaction of the CstF77 HAT domain dimer with mPSF is primarily mediated by extensive contacts with WDR33 and CPSF160 (Zhang et al., 2019).
Upon close inspection, the cryo-EM map from this dataset (EMDB entry EMD-20861) exhibits residual densities on both CstF77 protomers that could be attributed to the binding of two hFip1 molecules via their N-terminal regions (Figure 3H). This observation indicates that CstF77 is capable of binding two hFip1 copies when bound to mPSF. We subsequently tested the contribution of hFip1 1-35 to the mPSF-CstF77 interaction in a pull-down experiment using MBP-tagged CstF77 and mPSF complexes containing truncated hFip1 fragments. Although all mPSF complexes were capable of binding CstF77, reduced levels of CstF77 co-precipitation were observed with mPSF containing N-terminally truncated hFip1 that lacked the CstF77 interacting region (Figure 3-figure supplement 2A). Taken together, these results suggest that direct interactions between hFip1 and CstF77 contribute to the assembly of the CPSF-CstF complex during mRNA 3' end biogenesis.

CstF77 inhibits polyadenylation by competition for hFip1
Although CstF77 and PAP bind to non-overlapping, yet adjacent, sites in hFip1, CstF77 binding could nevertheless preclude PAP recruitment due to steric hindrance. To probe this, we carried out a pull-down experiment with GST-tagged hFip1 and mixtures of MBP-tagged CstF77 and GFP-tagged PAP at varying molar ratios. In the presence of excess CstF77, PAP binding was considerably reduced, indicating that CstF77 competes with PAP for binding to hFip1 (Figure 4A). Consistent with this, CPSF-dependent RNA 3' polyadenylation was substantially reduced in the presence of CstF77, suggesting that CstF77 inhibits 3' polyadenylation via interaction with hFip1 (Figure 4B) [...] independently. Using polyadenylation assays we show that the two hFip1 copies are functionally redundant in recruiting PAP to the mPSF, which increases the processivity of RNA 3' polyadenylation.

As recruitment of PAP to the 3' end of the cleaved pre-mRNA is a prerequisite for its processivity [...] by mPSF is precluded, even though our results imply that PAP can be recruited via either hFip1 molecule. We speculate that this might be due to molecular crowding or steric hindrance when mPSF is bound to a substrate RNA, particularly considering that the two Fip1 molecules make asymmetric interactions with mPSF. Notwithstanding, these findings suggest that mPSF contains two hFip1 interaction modules to ensure efficient PAP recruitment. Furthermore, the presence of two hFip1 copies might be required for mPSF integrity and its interactions with CstF.

The interaction between CPSF and CstF has previously been shown to involve direct contacts between the CstF77 homodimer and an extensive interface provided by the CPSF160 and WDR33 [...] and biochemical findings, we propose a model in which hFip1 acts as a coordinator of the two steps of 3' end processing.
Initially, the two hFip1 molecules present in mPSF facilitate the assembly of CPSF and CstF on the pre-mRNA via the interactions of their N-terminal motifs with CstF77 (Figure 4C). In part, these interactions also preclude PAP recruitment until the pre-mRNA has been cleaved and a free 3' end has been generated. Upon endonucleolytic cleavage of the pre-mRNA by CPSF73, a conformational rearrangement, possibly driven by the dissociation of the downstream cleavage product and concomitant displacement of CstF, reduces steric constraints around the nascent 3' end, which enables hFip1 to associate with PAP to initiate processive 3' polyadenylation of the cleaved pre-mRNA (Figure 4D). The conformational and compositional transitions required for accessing the nascent 3' [...]

In sum, these results advance our understanding of hFip1 as a multivalent interaction scaffold for 3' end processing factors and unravel a novel aspect of polyadenylation regulation by CstF. Through interspacing binding sites for processing factors with intrinsically disordered, low-complexity sequences, hFip1 can achieve the required degree of conformational freedom to accommodate the remodeling of the 3' end processing machinery and ensure correct spatiotemporal regulation of the processing factors at the nascent mRNA 3' end. The molecular basis of these transitions, however, awaits further structural and biophysical investigations.

Protein expression and purification
Cloning for expression in E. coli

Pull-down analysis of hFip1-CstF77 interaction
For pull-down analysis with purified hFip1 and CstF77 proteins (wt and mutants), 10 µg of purified His6-GST-hFip1 protein was immobilized on 15 µl Glutathione Sepharose 4 Fast Flow beads (Cytiva) and washed three times with 0.5 ml pull-down wash buffer (20 mM Tris pH 7.5, 200 mM NaCl, 0.05% Tween-20, 0.5 mM TCEP). His6-MBP-CstF77 protein was added to the immobilized protein at 4-fold molar excess and incubated with gentle agitation at 4 °C for 1 h, followed by washing three times with 0.5 ml of pull-down wash buffer. The bound protein was eluted at room temperature by adding 1X SDS-PAGE [...]

SEC-MALS analysis
Size exclusion chromatography combined with multiangle light-scattering (SEC-MALS) was carried out on an HPLC system (Agilent LC1100, Agilent Technologies) coupled to an Optilab rEX refractometer and a miniDAWN three-angle light-scattering detector (Wyatt Technology). Data analysis was performed using the ASTRA software (version 7.3.2; Wyatt Technology).
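
The stoichiometry reasoning applied to SEC-MALS data in the Results (comparing an apparent mass of 347 kDa with the 337 kDa expected for a complex containing one PAP) can be sketched as a simple comparison of the observed mass against the masses expected for different copy numbers. The Python sketch below is illustrative and is not the ASTRA analysis used in this study. The function names are hypothetical, and the component masses in the usage example are placeholders chosen only so that the one-PAP total equals the 337 kDa quoted in the text; they are not measured or calculated protein masses.

```python
# Illustrative sketch only: rank candidate stoichiometries by how closely
# their expected mass matches a SEC-MALS apparent mass.
# Function names and the component masses used below are assumptions.


def relative_deviation(observed_kda, expected_kda):
    """Signed fractional difference between observed and expected mass."""
    return (observed_kda - expected_kda) / expected_kda


def rank_stoichiometries(observed_kda, core_kda, ligand_kda, max_copies=3):
    """Return (copies, expected_mass, abs_deviation) tuples, best match first.

    core_kda   : mass of the complex without the ligand
    ligand_kda : mass of one ligand copy
    """
    candidates = []
    for copies in range(max_copies + 1):
        expected = core_kda + copies * ligand_kda
        deviation = abs(relative_deviation(observed_kda, expected))
        candidates.append((copies, expected, deviation))
    # Sort by deviation so the closest stoichiometry comes first.
    return sorted(candidates, key=lambda c: c[2])


if __name__ == "__main__":
    # Values quoted in the text: apparent mass 347 kDa, expected 337 kDa.
    print(f"deviation: {relative_deviation(347.0, 337.0):.1%}")

    # Placeholder component masses (hypothetical, for illustration only).
    for copies, expected, dev in rank_stoichiometries(347.0, 255.0, 82.0):
        print(f"{copies} ligand copies: expected {expected:.0f} kDa, "
              f"deviation {dev:.1%}")
```

With real data, the component masses would come from the protein sequences and the comparison would need to take the experimental uncertainty of the measurement into account.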

Multiple sequence alignment
The multiple sequence alignment of hFip1 orthologs was produced with MAFFT version 7 (Katoh et al., 2018) and visualized using Jalview (Waterhouse et al., 2009). Input sequences are listed in Supplementary Table 3.

3D density map analysis
Visualization and analysis of the 3D density map for the CPSF160-WDR33-CPSF30-PAS RNA-CstF77 complex (EMD-20861) was performed with UCSF Chimera (Pettersen et al., 2004), developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco. The 3D density map was segmented and color-coded based on the corresponding atomic model (PDB ID: 6URO). The CstF77-hFip1 crystal structure from this study was superimposed onto the atomic model of CstF77.

DATA AVAILABILITY
The atomic coordinates and structure factors for the crystallographic structures of the Fip1-CPSF30Fip1 and Fip1-CstF77 complexes have been deposited in the Protein Data Bank under accession codes 7ZYH and 7ZY4, respectively. All data generated or analyzed during this study are included in the manuscript and supporting files. Source data files for gel images in Figures 1, 2, 3

Pull-down analysis of mPSF-CstF77 interaction
For pull-down analysis of the mPSF:CstF77 interaction, Ni-IMAC purified mPSF complexes from Sf9 cells containing hFip1