Molecular insights into RNA recognition and gene regulation by the TRIM-NHL protein Mei-P26

This study provides molecular insights into RNA target recognition by the TRIM-NHL protein Mei-P26, which functions in Drosophila ovarian germline stem cell maintenance, oogenesis, and spermatogenesis.


RNA-binding proteins (RBPs) play key roles in the post-transcriptional regulation of gene expression. They comprise a large and functionally diverse group of proteins involved in all aspects of RNA biology, from RNA synthesis to its degradation. RBPs typically bind RNAs through dedicated RNA-binding domains (RBDs) (1). Several members of the evolutionarily conserved TRIM-NHL family employ their NHL domains to interact with RNA (2-6). The TRIM-NHL protein family shares a common architecture comprising an N-terminal tripartite motif (TRIM, consisting of a RING domain, one or two B-Box type zinc fingers and a coiled-coil domain) followed by a C-terminal NCL-1, HT2A, LIN-41 (NHL) domain (7-11). The NHL domain folds into a β-propeller that typically acts as a scaffold to mediate interactions with other biomolecules such as proteins, DNA or RNA (6, 12, 13).

[...] was determined using the BioRad protein assay reagent. 25 µg of total protein were separated by denaturing PAGE and subjected to Western blotting using mouse monoclonal anti-FLAG antibody (M2, Sigma Aldrich, 1:1000) followed by probing with an HRP-coupled anti-mouse light chain-specific secondary antibody (1:10000, Jackson Immuno Research). Detection was performed using Clarity Western ECL substrate and a ChemiDoc Touch Imaging System (BioRad). After stripping, the membrane was re-probed using mouse anti-alpha-tubulin antibody (DM1A, Sigma Aldrich, 1:1000).

Individual-nucleotide cross-linking and immunoprecipitation (iCLIP)

[...] Mei-P26:RNA complexes were cut from the membrane, proteins were digested with Proteinase K and the RNA was subjected to iCLIP2 library preparation as previously described (53). Sequencing was performed on a HiSeq 2500 (Illumina). Three independent biological replicates were performed for each protein construct; as a control, non-transfected cells were processed in parallel.

iCLIP data analysis

The iCLIP data were processed using the iCount software suite and analysis pipeline (54). The sequencing reads were demultiplexed based on barcodes for individual replicates (allowing one mismatch), PCR duplicates were removed and adapters were trimmed. The reads for each of the replicates were aligned to the Drosophila melanogaster genome (ENSEMBL release 98) and processed separately. Cross-linked nucleotides (peaks) were identified and then clustered. Gene loci that produced iCLIP peaks in the experiments [...] When crosslinking occurred to mitochondrially encoded RNAs (4 loci in total) or known contaminants such as snoRNA/scaRNA/snRNA sequences (or similar) present in the host genes (17 loci), the gene loci were excluded from further analyses. Similarly, crosslinking to low-complexity regions (mostly A-stretches that resemble poly(A) tails) or sequences derived from the transfected plasmids (originating from the Actin5C promoter or the Mei-P26 coding region) was not considered. In the remaining 214 genes, crosslinking positions were considered equivalent between the individual experiments when the distance between the crosslinking peaks (full-length versus NHL domain) was <50 nt. In these loci, the presence of Mei-P26 RNA target motifs was scored up to 30 nt upstream and 20 nt downstream of the crosslink positions, considering full matches (UUUUACN, UUUUANA, UUUUNCA, or UUUUUUU, 65 loci) or U-rich sequences with four consecutive U residues followed by 3 nt containing at least one additional U residue (63 loci).

[...] resolution.
Therefore, we predicted the beginning of the NHL domain and expressed different constructs using a baculovirus expression system. We obtained large quantities of Mei-P26 NHL (aa 908-1206), which formed a homogeneous monomer of approximately 30 kDa that was completely free of proteinaceous contaminants and nucleic acids (Fig. 1A). Mei-P26 NHL crystallized in several tested conditions and we collected numerous complete datasets at various synchrotron sources. We solved the structure of the Mei-P26 NHL domain at 1.6 Å resolution by molecular replacement using the backbone of the previously determined Brat NHL domain (28) as a reference model. The N-terminus of the NHL domain remains invisible due to averaging of its conformations throughout the crystal but is present, as there was no indication of proteolytic degradation during purification and crystallization (Supplementary Fig. 1A, B). After refinement, an atomic model could be obtained with R/Rfree values of [...] folds into a six-bladed β-propeller with a donut-like shape, a diameter of ~45 Å and a height of ~25 Å (Fig. 1A). The two molecules located in the asymmetric unit are connected by a disulfide bond between Cys1030, but the functional relevance of this dimerization under [...]

Mei-P26 NHL binds to single-stranded RNA

We compared the structure of Mei-P26 NHL to the NHL domains of Brat, Lin41 and Thin/Abba to gain further insights into their unique and commonly shared features (3, 31, 55). [...] Fig. 1E, Supplementary Fig. 2). There is, however, one notable exception: three amino acids important for the recognition of the first position of the RNA target by Brat (a uridine base interacting with Asn800, Tyr829 and Arg847) are conserved in Mei-P26 (Asn970, Tyr999 and Arg1017).
This suggests that Mei-P26 also binds a uridine base in the same region but utilizes varying, non-conserved amino acids to recognize different sequence motifs in its target RNAs.

To assess the binding specificity of the Mei-P26 NHL domain experimentally, we [...] recently developed algorithms (38) to re-analyze the data from RNAcompete experiments previously carried out by the Hughes and Morris groups (6). We sought to obtain a complete list of oligonucleotide sequences that may interact with the NHL domain of Mei-P26, as an initial step in defining Mei-P26 NHL targets. In our analyses, we utilized a previously generated library of almost all possible 7-mer oligonucleotide fragments that have been tested for binding to purified GST-tagged Mei-P26 NHL (6). The relative binding of each sequence is presented as a Z-score calculated from a dataset computationally separated into two subsets, termed Set A and Set B, and for the combined set A+B. Consensus motifs recognized by Mei-P26 NHL (Fig. 2A) were calculated from the ten top-scored 7-mer sequences. Of note, we performed the same control analysis for the Brat NHL domain and subsequently confirmed the binding of its RNAcompete-derived consensus motif (BRAT1: UUGUUAA) using MST (Supplementary Fig. 3A [...] sequences derived from RNAcompete (Fig. 2B, Supplementary Fig. 4, Supplementary Fig. 5). Initial models of Mei-P26 in complex with RNA (SEQ1: UUUUUUU, SEQ3: UUUUACA, or BRAT1: UUGUUAA) were generated by modeling the RNA to reflect the conformation of the UUGUUGU oligonucleotide bound to the Brat NHL domain (PDB ID: 5EX7). According to our analysis, SEQ3 binds most stably to the Mei-P26 NHL domain and converges into a well-defined conformation (Fig. 2B, Supplementary Fig. 4).
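
As a rough illustration of the ranking step described for the RNAcompete data, the following Python sketch converts per-7-mer binding signals into Z-scores, keeps the top-scored sequences and derives a per-position majority consensus. The function names, the plain mean/standard-deviation Z-score and the toy signals are assumptions made for this example; they are not taken from the RNAcompete pipeline or from the algorithms cited as (38).

```python
from collections import Counter
from statistics import mean, pstdev


def z_scores(signal):
    """Return {kmer: (value - mean) / population standard deviation}."""
    values = list(signal.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("all signals are identical; Z-scores are undefined")
    return {kmer: (value - mu) / sigma for kmer, value in signal.items()}


def top_scored(signal, n=10):
    """The n k-mers with the highest Z-scores, best first."""
    scores = z_scores(signal)
    return sorted(scores, key=scores.get, reverse=True)[:n]


def consensus(kmers):
    """Most frequent base at each position of equally long sequences."""
    length = len(kmers[0])
    if any(len(kmer) != length for kmer in kmers):
        raise ValueError("all sequences must have the same length")
    return "".join(
        Counter(kmer[i] for kmer in kmers).most_common(1)[0][0]
        for i in range(length)
    )


# Toy signals, invented for illustration only (not measured data).
toy_signal = {
    "UUUUACA": 9.0, "UUUUACU": 8.5, "UUUUUCA": 8.0,
    "GGCGCGC": 1.0, "ACGACGA": 0.5, "CCCGGGA": 0.2,
}
top3 = top_scored(toy_signal, n=3)
print(top3, consensus(top3))
```

With real data, `n=10` would mirror the ten top-scored 7-mers mentioned in the text. A majority vote per position is only one simple way to summarize them; a position weight matrix would retain more information.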
Simulations with the top-scored polyU oligonucleotide (SEQ1) and the inverted SEQ3 sequence (ACAUUUU) resulted in much higher RMSD values, indicating a weaker fit in comparison to the SEQ3 RNA (Supplementary Fig. 4A). The top five clusters obtained by the MD simulations with SEQ3 are much more similar to each other than any of them is to the clusters obtained with SEQ1. Moreover, we observed lower variance in the 3' region and more structural heterogeneity of the 5'-RNA docking site between the individual clusters of the Mei-P26:UUUUACA models (Supplementary Fig. 4B). Together, our simulations emphasize the importance of the ACA trinucleotide anchor adjacent to poly-uridine stretches for Mei-P26 RNA recognition (Fig. 2B, Supplementary Fig. 4). Although similar affinities were determined for both oligonucleotides (Fig. 2C, Supplementary Fig. 6 [...] affinities that would be expected for specific but transiently associated RBPs (Fig. 2C, Supplementary Fig. 6). Among the identified sequences, Mei-P26 NHL showed the highest affinities towards SEQ3 (Kd 1.7 µM) and polyU (Kd 1.8 µM). Mei-P26 NHL also exhibited measurable affinities to the SEQ2 and SEQ5 RNA oligonucleotides, but binding to SEQ4 and SEQ6 was reduced. Limited solubility of purified Mei-P26 NHL protein at higher concentrations did not allow determination of the dissociation constants for these oligonucleotides. These data demonstrate sequence-specific recognition and discrimination by [...] Mei-P26 NHL also displays an increased affinity towards extended polyU sequences (e.g. U9; [...] with only six consecutive bases (6), while Lin41 only recognizes two nucleobases and makes most of its contacts via the ribose-phosphate backbone (3). Hence, we assume that the NHL domain of Mei-P26 also recognizes a short stretch of bases (Fig. 2B, Supplementary Fig. 4). Furthermore, we created an inverted SEQ3 oligonucleotide (ACAUUUU; Fig.
2D), which exhibited decreased affinity and indicated the necessary 5′ to 3′ directionality of the motif. Analysis performed with a UUUUCAC sequence also displayed significantly lower affinity, further indicating the importance of the ACA anchor (Fig. 2D). We also managed to convert a poor substrate (UUUACAA) into one that is bound by the addition of a single U nucleotide (UUUUACAA), highlighting the necessity of at least four consecutive uridines. Conversely, insertion of a guanosine nucleotide at asymmetric and symmetric positions into the polyU sequence (SEQ1) reduced the affinity to Mei-P26 (Fig. 2D). Extending SEQ1 by two additional uridines led to increased affinity, illustrating the aforementioned avidity effect for polyU sequences (57) (Fig. 2D). In summary, our results show sequence-specific RNA recognition by the Mei-P26 NHL domain and confirm computational predictions based on the RNAcompete data.

Identification of key residues involved in RNA recognition

[...] Fig. 7A,B), all variants showed purity and stability parameters comparable to the wild-type Mei-P26 NHL protein (Fig. 3B, Supplementary Fig. 7A,B). We tested all stable Mei-P26 NHL variants in RNA interaction assays and observed severely compromised [...] indicated that Arg1175 participates exclusively in SEQ3 binding, but appears dispensable for polyU binding. The substitution of Arg1175 indeed strongly reduces the ability of Mei-P26 NHL to recognize the SEQ3 sequence, but has only a minor influence on its interaction with U9. In contrast, residue Arg1017, which seems to stabilize U1 in both simulations, strongly contributes to the binding of both SEQ3 and U9. These data indicate experimentally that Mei-P26 might employ different modes of RNA recognition for the association with different RNA motifs. Furthermore, we show that the simultaneous R1150A, K1172A and R1175A substitutions fully abolish binding to any of the RNA sequences tested. Our findings are therefore consistent with the MD simulations, and we conclude that the R1150 and K1172 residues are responsible for anchoring the ACA trinucleotide, while R1175 stabilizes the uridine tract. In summary, our results identify particular amino acid residues critical for RNA recognition.
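
To give a feel for what dissociation constants in the low micromolar range mean in practice, the sketch below evaluates a simple 1:1 binding isotherm at the Kd values reported for SEQ3 (1.7 µM) and polyU (1.8 µM). The single-site model, the assumption that the labeled RNA is present at concentrations far below Kd, the function name and the example protein concentrations are illustrative choices; this is not the fitting procedure used for the MST data.

```python
def fraction_bound(protein_um, kd_um):
    """Fraction of RNA bound at a given free protein concentration (µM).

    Single-site model: theta = [P] / (Kd + [P]). This holds only when the
    RNA concentration is much lower than Kd (assumption of this sketch).
    """
    if protein_um < 0 or kd_um <= 0:
        raise ValueError("protein must be >= 0 and Kd must be > 0")
    return protein_um / (kd_um + protein_um)


# Kd values quoted in the text (µM); protein concentrations are arbitrary.
KD_UM = {"SEQ3": 1.7, "polyU": 1.8}

for name, kd in KD_UM.items():
    for protein in (0.5, 1.7, 10.0):
        theta = fraction_bound(protein, kd)
        print(f"{name}: {protein:4.1f} µM protein -> {theta:.2f} bound")
```

Under this model half of the RNA is bound when the protein concentration equals Kd, so two ligands with Kd values of 1.7 and 1.8 µM give nearly indistinguishable curves, whereas a variant that abolishes binding would stay near zero across the accessible concentration range.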
We also tested binding of Mei-P26 NHL to the consensus Brat recognition sequence (UUGUUAA, BRAT1), which we obtained from the RNAcompete data (Supplementary Fig. 2A) and which we used in our MD simulations (Supplementary Fig. 4). While this RNA was only weakly bound by Mei-P26, a single substitution from G to U (BRATmut: UUUUUAA; [...] Fig. 8A, 8B). In an attempt to mimic Brat, we replaced the alanine with [...] Supplementary Fig. 8C). [...] immunoprecipitate strongly reduced amounts of RNA (Supplementary Fig. 9B).

Identification of Mei-P26 target mRNAs by iCLIP
Analyses of the iCLIP data from the wild-type proteins identify a local enrichment of cross-link positions in 751 protein-coding genes for the full-length protein and 623 for the [...] only a moderate overlap is observed (22.1%, 249 loci, Fig. 4A, Supplementary Data 2). In 71.9% of the shared target genes, crosslinking is observed in comparable positions in the gene body (at a distance of 50 nt or less, Fig. 4D). After removal of contaminating sequences (e.g. [...] P26 NHL RKR mutant did not associate with the RNAs (Fig. 4F, Supplementary Fig. 10).
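
The motif scoring and peak comparison described in the iCLIP data analysis can be sketched in a few lines of Python. The sketch assumes an RNA sequence in transcript orientation and a 0-based crosslink index, and it requires a motif to lie entirely within the scored window; the function names are hypothetical and are not part of iCount. The last function shows that the reported 22.1% overlap is reproduced when the 249 shared loci are divided by the union of the two target sets (751 + 623 - 249), which is one reading consistent with the numbers rather than a documented formula.

```python
import re

# Full-match motifs from the text; N stands for any nucleotide.
FULL_MATCH = re.compile(r"UUUUAC[ACGU]|UUUUA[ACGU]A|UUUU[ACGU]CA|UUUUUUU")
# U-rich: four consecutive U followed by 3 nt with at least one more U.
U_RICH = re.compile(r"UUUU(?:U[ACGU]{2}|[ACGU]U[ACGU]|[ACGU]{2}U)")


def window(seq, crosslink, upstream=30, downstream=20):
    """Sequence from 30 nt upstream to 20 nt downstream of the crosslink."""
    start = max(0, crosslink - upstream)
    return seq[start:crosslink + downstream + 1]


def classify(seq, crosslink):
    """Return 'full', 'u_rich' or 'none' for the window around a crosslink."""
    rna = seq.upper().replace("T", "U")  # accept genomic (DNA) input
    region = window(rna, crosslink)
    if FULL_MATCH.search(region):
        return "full"
    if U_RICH.search(region):
        return "u_rich"
    return "none"


def equivalent(pos_full_length, pos_nhl, max_distance=50):
    """Peaks are equivalent when less than 50 nt apart (as in the methods)."""
    return abs(pos_full_length - pos_nhl) < max_distance


def overlap_fraction(n_a, n_b, n_shared):
    """Shared loci divided by the union of both sets (Jaccard index)."""
    return n_shared / (n_a + n_b - n_shared)


# Hypothetical example sequence: a UUUUACA motif flanked by A-stretches.
example = "A" * 40 + "UUUUACA" + "A" * 40
print(classify(example, 45), classify(example, 5))
print(round(100 * overlap_fraction(751, 623, 249), 1))
```

Checking full matches before U-rich matches matters because UUUUUUU satisfies both definitions; with the order used here each window is counted once, in the stricter category.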

The NHL domain is important for Mei-P26 gene-regulatory activity
To better understand the impact of Mei-P26 on gene expression, we conducted a series of reporter assays in cultured Drosophila Schneider 2 cells. We first used the phage-derived lambda-boxB system to artificially recruit full-length Mei-P26 to an RNA. For this, the [...] within its 3' UTR several copies of the lambdaN binding site (boxB). The same proteins without the lambda peptide served as controls. In addition, a co-transfected plasmid encoding a Renilla luciferase mRNA that lacks the boxB elements was used for normalization. In this experimental setup, Brat and the positive control GW182, which is involved in miRNA-mediated gene silencing, convey robust repression of the firefly luciferase reporter mRNA (31). In contrast, Mei-P26 exhibits only weak gene-regulatory activity (Fig. 5A).

Next, we analyzed expression of a reporter RNA that contains a fragment of the nanos 3' UTR, a genetically identified target of Mei-P26 (18). Co-expression of full-length Mei-P26 resulted in silencing of the reporter relative to a control mRNA that bears a 3' UTR from an unrelated RNA (msl-2). Repression depends on the ability of Mei-P26 to bind RNA through its NHL domain, since the RKR substitution abolishes regulation (Fig. 5B). A previously identified U-rich RNA sequence element in the 3' UTR of nanos mRNA (29) is critical for Mei-P26-mediated repression, as its deletion completely abrogates regulation of the reporter (Fig. 5B). Using a similar experimental setup, we tested a series of selected mRNA targets that we identified in our iCLIP analyses (Fig. 5C). We included all RNA candidates for which we confirmed Mei-P26 interaction and targets that contain (e.g. Hrb27C) or lack (e.g. spz) the recognition motif (Fig. 4F, Supplementary Fig. 10, Supplementary Data 2). The effect of Mei-P26 on the expression of the reporters was diverse.
For instance, we could not detect a significant change in the expression of reporters that bear the 3' UTRs of bic, chic and RpS23. In contrast, LanA-, Mlc-c-, lost- and Hrb27C-derived reporters exhibited significant repression. Unexpectedly, RpS20-, sqd-, sta-, Swip-1-, spz-, eIF4A-, Col4A1- and [...] abrogate RNA binding activity severely blunted regulation of all reporter RNAs, underlining the functional importance of the domain for activity (Fig. 5C). Despite being expressed at a comparable level (Supplementary Fig. 11), a construct encompassing only the NHL domain was not sufficient for regulation (Fig. 5D), demonstrating that additional sequences outside the NHL domain are required. Previously, it has been debated whether the N-terminal RING domain and its ubiquitin ligase activity contribute to the gene-regulatory activity of Mei-P26 (21). However, deletion of the N-terminal RING domain abolished neither activation of spz nor repression of the nos or Hrb27C reporters, demonstrating that ubiquitin ligase activity is dispensable in this experimental setup (Fig. 5D).

[...] indicating at least two distinct interaction modes between these two similar NHL domains. [...] nucleotides in length with rather moderate affinities (65). To increase affinity and specificity, multiple binding domains are often combined, either in tandem in the same polypeptide, or in trans through formation of protein complexes (66). Our measured affinities are in the micromolar range and are lower for Brat NHL than previously described for longer oligonucleotide sequences (6), but comparable to those reported for CeLin41 NHL for shorter hairpins (3). We provide experimental evidence (Supplementary Fig. 9B) that in Mei-P26, protein regions outside the NHL domain also contribute to RNA target recognition/binding. In vivo, it is most likely that additional protein partners contribute to stable complex formation