Abstract
DDX3 is an RNA chaperone of the DEAD-box family that regulates translation. Its yeast ortholog Ded1 controls the translation of nearly all mRNAs, whereas DDX3 is thought to regulate only a subset of mRNAs. However, the set of mRNAs regulated by DDX3 is unknown, as is the relationship between DDX3 binding and activity. Here, we use ribosome profiling, RNA-seq, and PAR-CLIP to define the set of mRNAs that are regulated by DDX3 in human cells. We find that while DDX3 binds most expressed mRNAs, depletion of DDX3 affects the translation level of only a small subset of the transcriptome. We further find that DDX3 binds a site on helix 16 of the human ribosome, placing it immediately adjacent to the mRNA entry channel and translation factor eIF4B. Translation changes caused by depleting DDX3 differ from those caused by chemical inhibition that mimics a dominant negative allele. Taken together, our data define the subset of the transcriptome that is responsive to DDX3 inhibition, with relevance for basic biology and disease states where DDX3 expression is altered.
Introduction
Translation initiation is affected by mRNA structure. The DEAD-box RNA chaperone DDX3 and its yeast ortholog Ded1p facilitate translation initiation on mRNAs with RNA structures in their 5’ untranslated regions (UTRs) (Guenther et al. 2018; Lai et al. 2008; Oh et al. 2016; Soto-Rifo et al. 2012), a function that is essential in all eukaryotes (Sharma and Jankowsky 2014). Dysfunction in DDX3 is linked to numerous diseases and cancers, including medulloblastoma (Epling et al. 2015; Floor et al. 2016b; Jones et al. 2012; Kool et al. 2014; Oh et al. 2016; Pugh et al. 2012; Robinson et al. 2012; Valentin-Vega et al. 2016), many other cancer types (Sharma and Jankowsky 2014), and de novo developmental delay (Deciphering Developmental Disorders 2015; Lennox et al. 2018; Snijders Blok et al. 2015; Wang et al. 2018). Previous work studied how translation is altered by DDX3 variants found in medulloblastoma (Oh et al. 2016; Valentin-Vega et al. 2016), which are exclusively missense variants that preferentially target conserved residues. In contrast, hematological cancers like natural killer/T-cell lymphoma (Dufva et al. 2018; Jiang et al. 2015) and others (Wang et al. 2011; Schmitz et al. 2012) also have frequent variants in DDX3X, but they are mostly truncating or frameshift variants resulting in decreased expression. Changes in gene expression occurring as a result of decreased DDX3 levels remain incompletely understood.
Inactivation of Ded1 in yeast leads to polysome collapse and global downregulation of translation (Chuang et al. 1997; de la Cruz et al. 1997). More recent work showed that Ded1 is required for translation of most transcripts in yeast using genome-wide approaches (Guenther et al. 2018; Sen et al. 2015). In contrast, DDX3 depletion seems to only affect translation of a subset of expressed transcripts (Ku et al. 2018; Lai et al. 2008, 2010; Lee et al. 2008; Soto-Rifo et al. 2012). Despite the importance of DDX3 to normal function and its alteration in diverse disease states, the set of genes that depend on DDX3 for translation is not clearly defined. Moreover, it has been challenging to relate DDX3 binding to functional effects on bound mRNAs, and it was unclear if DDX3 is functioning outside of translation initiation given that binding was detected in coding sequences and 3’ UTRs (Oh et al. 2016; Valentin-Vega et al. 2016).
Here, we depleted DDX3 protein levels and measured alterations to translation and RNA abundance using ribosome profiling and RNA-seq. We also characterized DDX3 binding by PAR-CLIP, exploiting the presence of T>C mutations as a diagnostic hallmark of protein-RNA interactions. We observed robust interactions between DDX3 and transcript 5’ UTRs, as well as a specific and conserved site on the 40S ribosomal subunit. We found that transcripts with structured 5’ UTRs are preferentially affected by DDX3. We used an in vitro reporter system to conclude that decreases in ribosome occupancy upon DDX3 depletion are driven by 5’ UTRs. Taken together, our results support a model for DDX3 function where interactions with the small ribosomal subunit facilitate translation on messages with structured 5’ UTRs, which, when inactivated, pathologically deregulates protein synthesis.
Methods
NGS data pre-processing
Ribo-seq .fastq files were stripped of adapter sequences using cutadapt. UMI sequences were removed and reads were collapsed to .fasta format. Reads shorter than 20 nt were removed. Reads were first aligned against rRNA (accession U13369.1) and against a collection of snoRNAs, tRNAs and miRNAs (retrieved using the UCSC table browser) using bowtie2 in the ‘local’ alignment mode.
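The length filter and read-collapsing steps described above can be sketched as follows. This is a minimal illustration in Python, not the pipeline used in the study: the function names and the FASTA header format are hypothetical, and the sketch assumes that adapters and UMIs have already been removed so that each input is a bare read sequence.

```python
from collections import Counter

MIN_READ_LENGTH = 20  # reads shorter than 20 nt are discarded (from the text)


def collapse_reads(sequences, min_length=MIN_READ_LENGTH):
    """Collapse identical read sequences and drop reads below min_length.

    Returns a list of (sequence, count) tuples, most abundant first.
    """
    counts = Counter(seq for seq in sequences if len(seq) >= min_length)
    # Sort by descending count, then alphabetically, for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def to_fasta(collapsed):
    """Render collapsed reads as FASTA text (hypothetical header: rank and count)."""
    lines = []
    for rank, (seq, count) in enumerate(collapsed, start=1):
        lines.append(f">read{rank}_x{count}")
        lines.append(seq)
    return "\n".join(lines)
```

Storing the copy number in the header keeps the collapsed file small while preserving the information needed to weight reads downstream.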
Read Alignment
Reads were mapped to the hg38 version of the genome (without scaffolds) using STAR 2.6.0a (Dobin et al. 2013) supplied with the GENCODE 25 .gtf file. The following parameters were used:
--alignEndsType EndToEnd (Ribo-seq and PAR-CLIP)
--outFilterMismatchNmax 2 --outFilterMismatchNoverLmax 0.04 (Ribo-seq)
--outFilterMismatchNmax 3 (PAR-CLIP and RNA-seq)
--outFilterMultimapNmax 20 --outSAMmultNmax 1 --outMultimapperOrder Random (All)
De-novo splice junction discovery was disabled for all datasets.
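The per-assay parameter sets listed above can be assembled into a STAR argument list as in the following sketch. It is only an illustration: the helper name `star_command` and the path arguments are hypothetical, and because the text does not state which option was used to disable de novo splice junction discovery, that setting is deliberately left out.

```python
# Flags shared by all datasets (taken from the parameter list above).
COMMON_FLAGS = [
    "--outFilterMultimapNmax", "20",
    "--outSAMmultNmax", "1",
    "--outMultimapperOrder", "Random",
]

# Assay-specific flags (taken from the parameter list above).
ASSAY_FLAGS = {
    "ribo-seq": [
        "--alignEndsType", "EndToEnd",
        "--outFilterMismatchNmax", "2",
        "--outFilterMismatchNoverLmax", "0.04",
    ],
    "par-clip": [
        "--alignEndsType", "EndToEnd",
        "--outFilterMismatchNmax", "3",
    ],
    "rna-seq": [
        "--outFilterMismatchNmax", "3",
    ],
}


def star_command(assay, genome_dir, reads_path, gtf_path):
    """Build a STAR argument list for one dataset.

    genome_dir, reads_path and gtf_path are placeholders chosen by the caller.
    """
    key = assay.lower()
    if key not in ASSAY_FLAGS:
        raise ValueError(f"unknown assay: {assay}")
    base = [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", reads_path,
        "--sjdbGTFfile", gtf_path,
    ]
    return base + ASSAY_FLAGS[key] + COMMON_FLAGS
```

The returned list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems with file paths.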
PAR-CLIP peak calling
Peak calling for PAR-CLIP reads was performed with PARalyzer v1.5 (Corcoran et al. 2011) in the “EXTEND_BY_READ” mode using the following parameters:
BANDWIDTH=3
CONVERSION=T>C
MINIMUM_READ_COUNT_PER_GROUP=5
MINIMUM_READ_COUNT_PER_CLUSTER=5
MINIMUM_READ_COUNT_FOR_KDE=5
MINIMUM_CLUSTER_SIZE=8
MINIMUM_CONVERSION_LOCATIONS_FOR_CLUSTER=1
MINIMUM_CONVERSION_COUNT_FOR_CLUSTER=1
MINIMUM_READ_COUNT_FOR_CLUSTER_INCLUSION=5
MINIMUM_READ_LENGTH=13
MAXIMUM_NUMBER_OF_NON_CONVERSION_MISMATCHES=0
Peaks with more than 10 reads were retained for subsequent analysis.
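As a hedged illustration of the read-count filter, the sketch below holds the PARalyzer settings listed above in a dictionary, renders them as KEY=VALUE lines, and keeps peaks with more than 10 reads. The function names are hypothetical, and the `ReadCount` field name is an assumption about the column layout of the cluster output.

```python
# Parameter values copied from the list above.
PARALYZER_PARAMS = {
    "BANDWIDTH": 3,
    "CONVERSION": "T>C",
    "MINIMUM_READ_COUNT_PER_GROUP": 5,
    "MINIMUM_READ_COUNT_PER_CLUSTER": 5,
    "MINIMUM_READ_COUNT_FOR_KDE": 5,
    "MINIMUM_CLUSTER_SIZE": 8,
    "MINIMUM_CONVERSION_LOCATIONS_FOR_CLUSTER": 1,
    "MINIMUM_CONVERSION_COUNT_FOR_CLUSTER": 1,
    "MINIMUM_READ_COUNT_FOR_CLUSTER_INCLUSION": 5,
    "MINIMUM_READ_LENGTH": 13,
    "MAXIMUM_NUMBER_OF_NON_CONVERSION_MISMATCHES": 0,
}


def render_params(params):
    """Render parameters as KEY=VALUE lines, one per line."""
    return "\n".join(f"{key}={value}" for key, value in params.items())


def filter_peaks(peaks, min_reads_exclusive=10, count_key="ReadCount"):
    """Keep peaks with MORE than min_reads_exclusive reads.

    peaks: list of dicts, e.g. rows read with csv.DictReader.
    count_key: assumed name of the read-count column.
    """
    return [peak for peak in peaks if int(peak[count_key]) > min_reads_exclusive]
```

Note that the threshold is exclusive, matching "more than 10 reads": a peak with exactly 10 reads is dropped.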
Additionally, wavClusteR (Comoglio et al. 2015) was used to call PAR-CLIP peaks using the “mrn” method and automatic thresholding calculation.
Differential expression analysis
Count matrices for Ribo-seq and RNA-seq were built using reads mapping uniquely to CDS regions of protein-coding genes, using the Bioconductor packages GenomicFeatures, GenomicFiles and GenomicAlignments. DESeq2 (Love et al. 2014) was used to calculate log2FoldChange values between siRNA-treated cells and controls; only genes with BaseMean >20 for both RNA-seq and Ribo-seq were retained. Differentially expressed genes (“RNA_up” and “RNA_down”) were defined using the RNA differential expression, with an adjusted p-value cutoff of 0.01. Riborex (Li et al. 2017) was used to quantify translation regulation, using the DESeq2 engine and an adjusted p-value cutoff of 0.05 to define differentially translated genes (“TE_up” and “TE_down”). The “RNA_up” class was further divided into two groups: a mixture of two Gaussians was fit to a vector comprising the difference between the log2FC of RNA and Ribo, using the mixtools package; a small difference is indicative of concordant RNA and Ribo change, while a positive value indicates a change in RNA larger than the change in the Ribo-seq (thus resembling the “TE_down” set). The means of the two Gaussian distributions were set to 0 and 0.4; genes belonging to the second distribution (according to the posterior probability of the fit) were removed. GO enrichment analysis was performed with the topGO package, using the Fisher test with default parameters. Gene categories with between 3 and 3000 annotated genes were retained.
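The two-Gaussian deconvolution of the “RNA_up” class can be illustrated with a small expectation-maximization routine in which the component means are held fixed at 0 and 0.4, as described above. This is a plain-Python stand-in for the mixtools fit, not the code used in the study: the starting standard deviations, the iteration count, and the 0.5 posterior cutoff mentioned below are all assumptions.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_fixed_mean_mixture(values, means=(0.0, 0.4), n_iter=200):
    """EM for a two-Gaussian mixture whose means are held fixed.

    Only the mixing weights and standard deviations are updated.
    Returns (weights, sds, posteriors); posteriors[i] is the probability
    that values[i] belongs to the second component (mean 0.4 by default).
    """
    n = len(values)
    weights = [0.5, 0.5]
    sds = [0.2, 0.2]  # arbitrary starting values (assumption)
    posteriors = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior probability of the second component.
        posteriors = []
        for x in values:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = p0 + p1
            posteriors.append(p1 / total if total > 0 else 0.5)
        # M-step: update weights and standard deviations; means stay fixed.
        n1 = sum(posteriors)
        n0 = n - n1
        weights = [n0 / n, n1 / n]
        var0 = sum((1 - r) * (x - means[0]) ** 2
                   for x, r in zip(values, posteriors)) / max(n0, 1e-9)
        var1 = sum(r * (x - means[1]) ** 2
                   for x, r in zip(values, posteriors)) / max(n1, 1e-9)
        # A small floor keeps the standard deviations away from zero.
        sds = [max(math.sqrt(var0), 1e-3), max(math.sqrt(var1), 1e-3)]
    return weights, sds, posteriors
```

Under these assumptions, a gene whose (RNA log2FC minus Ribo log2FC) value receives a posterior above 0.5 would be assigned to the second distribution and removed from the “RNA_up” set.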
Meta-transcript profiles
PARalyzer peaks (and peaks from the POSTAR2 repository (Zhu et al.)) were mapped onto transcript coordinates using one coding transcript per gene; this transcript was chosen to have the longest 5’ UTR and the most common annotated start codon for that gene. Transcript positions were converted into bins using 33 bins for each 5’ UTR, 100 bins for each CDS and 67 bins for each 3’ UTR. Peak scores (ModeScore for PARalyzer peaks) were normalized for each transcript (to sum to 1), and values were summed for each bin to build aggregate profiles. When plotting profiles for different RBPs, the aggregate profiles were further normalized to a sum of 1. To build the average meta-transcript profile in Figures 5 and S4, conversion specificity values were averaged per transcript bin and transcript class.
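The binning scheme (33, 100 and 67 bins for 5’ UTR, CDS and 3’ UTR) and the per-transcript normalization can be sketched as below. The names are hypothetical, and the sketch assumes that each peak is represented by a single 0-based position within its region; how peaks were assigned to positions in the study is not stated here.

```python
REGION_BINS = {"utr5": 33, "cds": 100, "utr3": 67}  # bin counts from the text
REGION_ORDER = ("utr5", "cds", "utr3")


def position_to_bin(region, pos, region_length):
    """Map a 0-based position within a region to a global bin index (0..199)."""
    if not 0 <= pos < region_length:
        raise ValueError("position outside region")
    n_bins = REGION_BINS[region]
    local_bin = min(pos * n_bins // region_length, n_bins - 1)
    # Bins of preceding regions are placed before this region's bins.
    preceding = REGION_ORDER[:REGION_ORDER.index(region)]
    offset = sum(REGION_BINS[r] for r in preceding)
    return offset + local_bin


def aggregate_profile(transcripts):
    """Sum per-transcript normalized peak scores into 200 bins.

    transcripts: one list per transcript of (region, pos, region_length, score).
    Scores are scaled to sum to 1 within each transcript before summing.
    """
    profile = [0.0] * sum(REGION_BINS.values())
    for peaks in transcripts:
        total = sum(score for _, _, _, score in peaks)
        if total <= 0:
            continue  # transcript without usable peak scores
        for region, pos, length, score in peaks:
            profile[position_to_bin(region, pos, length)] += score / total
    return profile
```

Normalizing within each transcript before summing prevents a few highly bound transcripts from dominating the aggregate profile.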
Additional transcript features analysis
To compare read mapping locations within transcripts, a window of 25 nt around the start codon was subtracted from annotated 5’ UTRs and CDS. 5’ UTR and CDS regions were merged for each gene using the “reduce” command from the GenomicRanges package. Counts on 5’ UTR and CDS were first averaged between replicates. The ratio of 5’ UTR to CDS counts was then calculated for each gene in the siDDX3 and control conditions. The log2 of the siDDX3/control ratio of those values represents the skew of counts towards the 5’ UTR in the siDDX3 condition:
skew = log2[(5’ UTR counts / CDS counts) in siDDX3 / (5’ UTR counts / CDS counts) in control]
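The skew calculation described in this paragraph can be written as a short function. This is a sketch under stated assumptions: the function names are hypothetical, and returning no value when an averaged count is zero is an illustrative choice, since the text does not say how zero counts were handled.

```python
import math


def mean(values):
    """Arithmetic mean of replicate counts."""
    return sum(values) / len(values)


def utr5_skew(utr5_kd, cds_kd, utr5_ctrl, cds_ctrl):
    """log2 of the (5' UTR / CDS) ratio in siDDX3 over the same ratio in control.

    Each argument is a list of replicate counts for one gene.
    Returns None when any averaged count is zero (ratio undefined).
    """
    averaged = [mean(v) for v in (utr5_kd, cds_kd, utr5_ctrl, cds_ctrl)]
    if any(a == 0 for a in averaged):
        return None
    u_kd, c_kd, u_ctrl, c_ctrl = averaged
    return math.log2((u_kd / c_kd) / (u_ctrl / c_ctrl))
```

A positive value means that footprints shift towards the 5’ UTR relative to the CDS after knockdown; a value of 1 corresponds to a twofold shift.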
RNA in silico folding was performed on 5’ UTR sequences using RNAlfold (Lorenz et al. 2016) with default parameters. Average ΔG values per nucleotide were calculated by averaging the ΔG values of each structure overlapping that nucleotide. Single-nucleotide values were averaged into bins using the same approach used to create meta-transcript profiles above. %GC content and T>C transition specificity (defined as ConversionEventCount / (ConversionEventCount+NonConversionEventCount)) for each PAR-CLIP peak were derived using the clusters.csv output file from PARalyzer. Peak scores for the wavClusteR peaks (using the RelLogOdds column) and crosslinking efficiency were extracted from the wavClusteR output.
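Two of the per-peak and per-nucleotide quantities defined above are simple enough to sketch directly. The function names are hypothetical, and the representation of each predicted structure as a (start, end, ΔG) tuple with 0-based, end-exclusive coordinates is an assumption made for the example rather than the RNAlfold output format.

```python
def conversion_specificity(conversion_events, non_conversion_events):
    """ConversionEventCount / (ConversionEventCount + NonConversionEventCount)."""
    total = conversion_events + non_conversion_events
    return conversion_events / total if total else None


def per_nucleotide_dg(structures, utr_length):
    """Average the free energy of all predicted structures covering each nucleotide.

    structures: list of (start, end, dg), 0-based and end-exclusive (assumed).
    Nucleotides not covered by any structure get None.
    """
    sums = [0.0] * utr_length
    counts = [0] * utr_length
    for start, end, dg in structures:
        for i in range(max(start, 0), min(end, utr_length)):
            sums[i] += dg
            counts[i] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

The resulting per-nucleotide values can then be averaged into bins in the same way as the meta-transcript profiles.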
Gviz was used to plot tracks for RNA-seq, Ribo-seq and PAR-CLIP over different genomic regions. Intron size was set to a maximum of 150 nt.
Source code for all analysis programs can be found here: https://github.com/lcalviell/DDX3_manuscript_analysis.
Ribosome profiling
Flp-In T-REx HEK293 cells transfected with control siPool and with DDX3X targeting siPools (siTools Biotech) were washed with PBS containing 100 μg/ml cycloheximide, flash frozen on liquid nitrogen, lysed in lysis buffer (20 mM TRIS-HCl pH 7.4, 150 mM NaCl, 5 mM MgCl2, 1% (v/v) Triton X-100, 25 U/ml TurboDNase (Ambion)), harvested, and centrifuged at 20,000 g for 4 min at 4°C; supernatants were stored at −80°C. Thawed lysates were treated with RNase I (Ambion) at 2.5 U/μl for 45 min at room temperature with slow agitation. Further RNase activity was stopped by addition of SUPERase:In (Ambion). Next, Illustra MicroSpin Columns S-400 HR (GE Healthcare) were used to enrich for ribosome complexes. RNA was extracted from column flow-throughs with TRIzol (Ambion) reagent. Precipitated nucleic acids were further purified and concentrated with a Zymo-Spin IIC column (Zymo Research). The RNA obtained was depleted of rRNAs with the Ribo-Zero Gold Kit (Human/Mouse/Rat) (Illumina), separated in a 17% urea gel and stained with SYBR Gold (Invitrogen). Gel slices containing nucleic acids 27 to 30 nucleotides long were excised and incubated in a thermomixer with 0.3 M NaCl at 4°C overnight with constant agitation to elute RNA. After precipitation, nucleic acids were treated with T4 polynucleotide kinase (Thermo Scientific). Purified RNA was ligated to 3’ and 5’ adapters, reverse transcribed and PCR amplified. The amplified cDNA was sequenced on a HiSeq2000 (Illumina).
PAR-CLIP Experiments
Flp-In T-REx HEK293 cells expressing FLAG/HA-tagged DDX3X were labelled with 100 μM 4SU for 16 h. After labelling, PAR-CLIP was performed generally as described (Hafner et al. 2010). Briefly, cells were UV-crosslinked and stored. Cell pellets were lysed in 3 times the cell pellet volume of NP-40 lysis buffer (50 mM HEPES-KOH at pH 7.4, 150 mM KCl, 2 mM EDTA, 1 mM NaF, 0.5% (v/v) NP-40, 0.5 mM DTT, complete EDTA-free protease inhibitor cocktail (Roche)), incubated for 10 min on ice and centrifuged at 13,000 rpm for 15 min at 4°C. Supernatants were filtered through a 5 μm syringe filter. Next, lysates were treated with RNase T1 (Fermentas) at a final concentration of 1 U/μl for 10 min at room temperature. Immunoprecipitation of the DDX3X/RNA complexes was performed with Flag magnetic beads (Sigma) and was followed by a second RNase T1 (Fermentas) digestion at a final concentration of 10 U/μl for 10 min at room temperature. After IP, beads were incubated with calf intestinal phosphatase (NEB) and RNA fragments were radioactively end labelled using T4 polynucleotide kinase (Fermentas). Crosslinked protein–RNA complexes were resolved on a 4–12% NuPAGE gel (Invitrogen) and transferred to a nitrocellulose membrane (Whatman). The protein–RNA complex migrating at the expected molecular weight was excised. Next, RNA was isolated from the membrane by proteinase K (Roche) treatment and phenol–chloroform extraction. Purified RNA was ligated to 3’ and 5’ adapters, reverse transcribed and PCR amplified. The amplified cDNA was sequenced on a HiSeq2000 (Illumina).
In vitro transcription, capping, and 2’-O methylation of reporter RNAs
Annotated 5’ UTRs for selected transcripts were cloned upstream of Renilla Luciferase (RLuc) under the control of a T7 promoter, with 60 adenosine nucleotides downstream of the stop codon to mimic polyadenylation. Untranslated regions were cloned using synthetic DNA (Integrated DNA Technologies) or by isolation using 5’ RACE (RLM-RACE kit, Invitrogen). Template was PCR amplified using Phusion polymerase from the plasmids using the following primers, and gel purified, as described (Floor and Doudna 2016).
pA60 txn rev: TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TTT CTG CAG
pA60 txn fwd: CGG CCA GTG AAT TCG AGC TCT AAT ACG ACT CAC TAT AGG
100 uL in vitro transcription reactions were set up at room temperature with 1-5 micrograms of purified template, 7.5mM ACGU ribonucleotides, 30mM Tris-Cl pH 8.1, 125mM MgCl2, 0.01% Triton X-100, 2mM spermidine, 110mM DTT, T7 polymerase and 0.2 U/uL of Superase-In RNase inhibitor (Thermo-Fisher Scientific). Transcription reactions were incubated in a PCR block at 37 degrees C for 1 hour. 1 uL of 1 mg/mL pyrophosphatase (Roche) was added to each reaction, and the reactions were subsequently incubated in a PCR block at 37 degrees C for 3 hours. 1 unit of RQ1 RNase-free DNase (Promega) was added to each reaction, followed by further incubation for 30 minutes. RNA was precipitated by the addition of 200 uL 0.3M NaOAc pH 5.3, 15 ug GlycoBlue co-precipitant (Thermo-Fisher Scientific) and 750 uL 100% EtOH. Precipitated RNA was further purified over RNA Clean & Concentrator-25 columns (Zymo Research). A glyoxal gel was run to assess the integrity of the RNA before subsequent capping and 2’-O methylation.
20 ug of total RNA was used in a 40 uL capping reaction with 0.5mM GTP, 0.2mM s-adenosylmethionine (SAM), 20 units of Vaccinia capping enzyme (New England Biolabs), 100 units of 2’-O-Me-transferase (New England Biolabs) and 25 units RNasin Plus RNase inhibitor (Promega). The reactions were incubated at 37 degrees C for 1 hour, followed by purification over the RNA Clean & Concentrator-25 columns (Zymo Research) and elution in DEPC H2O. Glyoxal gel was run to assess the integrity of the RNA before proceeding to in vitro translation reactions.
Transfection of siRNA for in vitro translation
HEK293T cells in 150 mm plates were transfected with 20 uL of siRNA (against DDX3 or a non-targeting control) using Lipofectamine 2000 (Thermo Fisher Scientific), following the manufacturer’s instructions. Cells were harvested for preparation of cellular extracts after 48 hours.
Generation of DDX3 mutant cell lines
DDX3 WT and S228C mutant constructs were synthesized and cloned downstream of a CMV promoter (Twist Biosciences). 40 ug of plasmids were transfected into HEK293T cells using Lipofectamine 2000 (Thermo Fisher Scientific), following manufacturer’s instructions. Cells were harvested for preparation of cellular extracts after 48 hours.
Preparation of cellular extracts for in vitro translation
Three to five 150mm plates of HEK293T cells were trypsinized and pelleted at 1000g, 4 degrees C. One cell-pellet volume of lysis buffer (10mM HEPES, pH 7.5, 10mM KOAc, 0.5mM MgOAc2, 5mM DTT, and 1 tablet miniComplete EDTA free protease inhibitor (Roche) per 10 mL) was added to the cell pellet and was incubated on ice for 45 minutes. The pellet was homogenized by trituration through a 26G needle attached to a 1 mL syringe 13-15 times. Efficiency of disruption was checked by trypan blue staining (>95% disruption target). The lysate was cleared by centrifugation at 14000g for 1 minute at 4 degrees C, 2-5 ul was reserved for western blot analysis, and the remainder was aliquoted and flash frozen in liquid nitrogen.
Antibodies
Primary antibodies used in this study include anti-DDX3 (Bethyl A300-474A; Figure 1), rabbit polyclonal anti-DDX3 (custom made by Genemed Synthesis using peptide ENALGLDQQFAGLDLNSSDNQS; Figure 6; Lee et al. 2008), anti-actin HRP (Santa Cruz Biotechnology, sc-47778), anti-FLAG HRP (Sigma, A8592).
In vitro translation
5 uL in vitro translation reactions were set up with 2.5 uL of lysate and 20 ng total RNA (0.84mM ATP, 0.21mM GTP, 21mM Creatine Phosphate, 0.009units/mL Creatine phosphokinase, 10mM HEPES pH 7.5, 2mM DTT, 2mM MgOAc, 100mM KOAc, 0.008mM amino acids, 0.25mM spermidine, 5 units RNasin Plus RNase inhibitor (Promega)) as described (Lee et al. 2015). Reaction tubes were incubated at 30 degrees C for 45 minutes, and expression of the reporter was measured using the Renilla Luciferase Assay System (Promega) on a GloMax Explorer plate reader (Promega). Drug treatments were performed by incubating the reaction with the indicated compound for 15 minutes at 30 degrees C prior to the addition of RNA.
Results
We performed ribosome profiling to determine the set of transcripts that are regulated by DDX3. DDX3X is an essential gene (Chen et al. 2016; Lee et al. 2008), so we transiently knocked down its expression using siRNA and collected ribosome protected footprints (Figure 1A,B). Measuring changes in both RNA abundance and ribosome occupancy enabled us to distinguish between different modes of DDX3-mediated regulation. High correlations between replicate experiments indicate reproducibility of the results (Figure S1A). Many genes with increased RNA expression also showed a mild reduction in their translation level, small enough to not meet our threshold criteria. We used a Gaussian mixture model to deconvolve these groups (Methods; Figure S1B). In contrast to its yeast ortholog Ded1p, depletion of DDX3 affects ribosome occupancy of only two percent of messages (Figure 1C). Almost 90% of ribosome occupancy changes upon DDX3 depletion are decreases, consistent with a role of DDX3 in promoting translation of select mRNAs (Figure 1C,D). We note that transcripts whose RNA level changes after DDX3 depletion tend to have high basal expression levels and small fold changes (Figure S1C). Interestingly, histone mRNAs modestly increase in translation upon DDX3 depletion (Figure 1D). Surveying the list of genes affected by DDX3 depletion revealed canonical genes that are subject to translational control, such as the oncogene ODC1 (Figure 1E, Table S1). Diverse biological pathways are affected by DDX3 depletion, including enrichment of histone genes among translationally upregulated mRNAs (“TE_up”) and of genes related to neuronal branching in the translationally downregulated set (“TE_down”) (Figure S1C).
DDX3 is thought to regulate translation through transcript 5’ UTRs, and we found genes that are regulated by their 5’ UTRs such as ODC1 in the translationally downregulated set (Figure 1E; Auvinen et al. 1992; Steeg et al. 1991). We therefore measured the enrichment of ribosomes in transcript 5’ UTRs, under the hypothesis that depletion of DDX3 might lead to defective scanning and ribosome accumulation (Guenther et al. 2018), or selective ribosome depletion on coding sequences. Indeed, we find more ribosomes in transcript 5’ UTRs relative to coding sequences upon DDX3 depletion (Figure 2A), especially in mRNAs that show translational downregulation. An example gene with increased 5’ UTR ribosome occupancy is shown in Figures 2B and S2A (HMBS), which is regulated by upstream ORF (uORF) translation. We considered features within the 5’ UTR that could potentially contribute to a transcript’s sensitivity to reduced DDX3 levels, and observed that the lengths of 5’ UTRs that are sensitive or insensitive to DDX3 depletion are similar (Figure 2C). However, DDX3-dependent 5’ UTRs have higher GC content (Figure 2C), consistent with the relationship between 5’ UTR structure and ribosome scanning (Kozak 1986). Other regions of these DDX3-dependent mRNAs also show a high-GC signature (Figure S2B). We examined predicted RNA secondary structure in 5’ UTRs and found that mRNAs that decrease in translation upon DDX3 depletion have regions with lower predicted RNA folding free energy than transcripts that are not dependent on DDX3 (Figure 2D). The position of a structure in RNA is thought to influence the requirement for DDX3 (Soto-Rifo et al. 2012), so we examined predicted structure across 5’ UTRs. Interestingly, DDX3-sensitive transcripts have higher predicted structure throughout the entire 5’ UTR and not just at the 5’ end (Figure S2C).
To better define the set of DDX3-sensitive transcripts, we measured DDX3 binding sites with high specificity using PAR-CLIP. Previous work using a complementary method (iCLIP) to measure DDX3 binding sites identified 5’ UTR and ribosomal RNA binding. Curiously, even though DDX3 is thought to regulate translation initiation, binding was also identified in coding sequences and 3’ UTRs (Oh et al. 2016; Valentin-Vega et al. 2016). Here, we leveraged the additional specificity afforded by T>C transitions induced by protein adducts on crosslinked uridine residues in PAR-CLIP to reexamine these findings. We treated cells with 4-thiouridine (4sU) and used UV-A crosslinking at 365 nanometers to crosslink RNA and protein complexes (Hafner et al. 2010).
High-throughput sequencing of RNA fragments crosslinked to DDX3 identified a binding site for DDX3 on the 18S ribosomal RNA (Figure 3; Eliseev et al., 2018) with high replicability (Figure S3A). Interestingly, while there were many rRNA fragments sequenced following PAR-CLIP (Figure S3A), there were only two sites with high-confidence T>C transitions (Figure 3B). The major interaction site maps to helix 16 of the 18S rRNA, similar to where Ded1p crosslinks to rRNA in yeast (Guenther et al. 2018). Helix 16 is on the small ribosomal subunit facing incoming mRNA, which might provide DDX3 access to resolve mRNA secondary structures to facilitate inspection by the scanning 43S complex (Figure 3C). Interestingly, the crosslink site on h16 is just opposite the eIF4B binding site, another factor crucial in ribosome recruitment and scanning (Sen et al. 2016; Walker et al. 2013). This is consistent with observations that eIF4B and Ded1 cooperate in translation initiation on mRNAs (Sen et al. 2016), and consistent with a previously reported interaction between DDX3 and eIF3 (Lee et al. 2008).
In addition to ribosomal RNA binding, we found that DDX3 binds primarily to coding transcripts (Figure 4A). We find high confidence peaks for DDX3 binding in 67 percent of expressed mRNAs (Figure 4B, S4A, Table S2), considerably more than are translationally regulated (Figure 1C). To identify where DDX3 binds mRNAs, we averaged reads across all expressed transcripts in a metagene analysis. We found that DDX3 primarily binds to transcript 5’ UTRs, with a small number of reads mapping in the coding sequence and 3’ UTR (Figure 4C). We used available CLIP data to compare the binding pattern of DDX3 to other known mRNA binders (Zhu et al. 2019). We selected three RNA-binding proteins to compare to: eIF3b is a member of the multi-subunit initiation factor eIF3 (Aitken et al. 2016), FMR1 interacts with elongating ribosomes (Chen et al. 2014), and MOV10 is involved in 3’ UTR-mediated mRNA decay (Gregersen et al. 2014; Sievers et al. 2012). The binding pattern of DDX3 most closely resembles the initiation factor eIF3b (Figure 4C). However, we detected some DDX3 binding within coding sequences and even 3’ UTRs. We used the frequency of T>C transitions at each site as a measure of the specificity of protein-RNA interaction (Figure S4B; Mukherjee et al. 2018). Interestingly, high specificity crosslinks with frequent T>C transitions reside most often in 5’ UTRs, especially near the start codon (Figure S4C). We also find DDX3 binding to transcript 5’ UTRs whose translation decreases upon DDX3 depletion, such as PRKRA and ODC1 (Figure 4D and S4D). Confirmation of this binding pattern comes from an independent assay of protein-RNA interaction, as measured by eCLIP (Figure S4E; Van Nostrand et al. 2016). Taken together, we conclude that DDX3 primarily functions as a translation initiation regulator by binding to transcript 5’ UTRs.
Since we found that DDX3 binds to transcript 5’ UTRs of messages that decrease in translation upon DDX3 depletion such as PRKRA (Figure 4D), we sought to determine if there was a relationship between DDX3 binding and translation changes. DDX3 binds a large majority of cellular transcripts (Figure S4A) but only regulates a small fraction of them (Figure 1), suggesting there are differences between the transcripts that are translationally controlled by DDX3 versus those that are not. We hypothesized that there is a relationship between the structure of a transcript’s 5’ UTR and its dependence on DDX3, as has been observed in the yeast ortholog Ded1p (Guenther et al. 2018; Gupta et al. 2018) and for individual mRNAs (Ku et al. 2018; Lai et al. 2008, 2010; Soto-Rifo et al. 2012). We find that there is increased T>C conversion specificity in the 5’ UTR of transcripts whose translation decreases upon DDX3 depletion (Figure 5A). An example of coordinate binding and regulation by DDX3 is the ODC1 gene (Figure S4D). We analyzed the average GC content in PAR-CLIP peaks across mRNAs stratified by the nature of translation changes upon DDX3 depletion, and find that the set of translationally downregulated genes have elevated GC content in peaks in their 5’ UTRs (Figure 5B). Despite a clear reported preference of DDX3 binding to structured RNAs, we reasoned that such a sequence enrichment might also result from inherent biases of PAR-CLIP in determining binding specificity. We observed a consistent, albeit weak, correlation between peak scores and %GC content for published PAR-CLIP peaks (Figure S5A) for 50 RNA-binding proteins (Zhu et al. 2018), confirming the presence of possible biases inherent to the experimental protocol and/or analysis (Friedersdorf and Keene 2014). However, a similar binding pattern, enriched in 5’ UTRs of downregulated transcripts, was confirmed using a different PAR-CLIP peak calling strategy (Figure S5B,C). 
Taken together, we conclude that while T>C conversion specificity is potentially confounded by GC content, the set of mRNAs that DDX3 regulates nevertheless have increased binding relative to all expressed transcripts.
Our data suggest that DDX3 directly affects translation of a subset of mRNAs. However, ribosome profiling measures ribosome occupancy and does not distinguish between elongating or stalled ribosomes. Given that DDX3 binds primarily to 5’ UTRs, and that DDX3 binding is enriched in transcripts it regulates (Figure 5, S5), we reasoned the ribosome occupancy changes observed here are likely due to altered translation initiation at the main start codon. To test this, we cloned DDX3-sensitive 5’ UTRs from this work and previous work (Oh et al. 2016) upstream of a Renilla luciferase reporter, which is not sensitive to DDX3 depletion (Figure 6A, Table S3). As the annotated DVL2 5’ UTRs are nested, we cloned the prevalent isoforms in HEK 293T cells using 5’ RACE. We then made translation extracts from HEK 293T cells transfected with a nontargeting control siRNA or a DDX3 siRNA (Figure 6B). Next, we in vitro transcribed, capped, and 2’-O methylated the reporter RNAs and performed in vitro translation in wild-type or DDX3-depleted lysate. We find that 5’ UTRs from DDX3-sensitive mRNAs (Figure 1D, 4D) also confer DDX3 dependence to the luciferase reporter (Figure 6C; PRKRA, DVL2); translation of the Rluc control, containing a generic 5’ UTR sequence, was not affected. Therefore, based on these reporter experiments and DDX3 binding pattern, we interpret ribosome occupancy changes upon DDX3 depletion as a result of mis-regulated translation initiation dynamics.
However, translation of RPLP1 was unchanged upon DDX3 depletion (Figure 6C). RPLP1 was identified as an mRNA with uORF occupancy changes upon mutant DDX3 expression in prior work (Oh et al. 2016). This suggests there may be different effects on translation between depletion of DDX3 and expression of a dominant negative mutant. Interestingly, patients carrying inactivating point mutations in DDX3 display more severe clinical symptoms than patients with truncating mutations (Lennox et al. 2018). Point mutations in DDX3 associated with medulloblastoma are dominant negative and act by preventing enzyme closure of DDX3 towards the high-RNA-affinity ATP-bound state (Epling et al. 2015; Floor et al. 2016b; Jones et al. 2012; Kool et al. 2014; Pugh et al. 2012; Robinson et al. 2012). We therefore tested chemical inhibition of DDX3, which functionally mimics a dominant negative mutation by blocking ATP binding. We were unable to inhibit translation in a DDX3-dependent manner using RK-33 (Bol et al. 2015), and instead found that it acted as a general translation inhibitor (Figure S6). We therefore used a recently established chemical genetic approach for inhibiting DEAD-box proteins by introducing a cysteine near the active site (Barkovich et al. 2018; Floor et al. 2016a). We made translation extracts with wild-type DDX3 or DDX3 S228C, which is sensitive to the electrophilic inhibitor AMP-acrylate (Figure 6D,E). Interestingly, we find that chemical inhibition of DDX3 decreases translation of all tested reporters but not the generic 5’ UTR sequence in the Rluc control (Figure 6E). Taken together, we find that DDX3 sensitivity for translation is preserved in translation extracts and that depletion of DDX3 appears to have different outcomes on translation than inhibition or dominant negative variants.
Discussion
DDX3X is an essential human gene that is altered in numerous disease states. Here, we use a set of transcriptomics approaches and biochemistry to show that DDX3 regulates a subset of the human transcriptome, likely through resolving RNA structures. We further show that inactivating point mutations in DDX3 yield different outcomes than depletion. Reporter experiments show that DDX3-sensitive 5’ UTRs are sufficient to confer DDX3 sensitivity onto unrelated coding sequences. We conclude that DDX3 affects translation initiation through interacting with transcript 5’ UTRs. Our data firmly establish that the major role of DDX3 is in translation initiation, and reveal translation differences between mutated and haploinsufficient DDX3 expression.
We identified binding between DDX3 and helix 16 (h16) on the human 40S ribosome. This is similar to binding sites identified previously (Guenther et al. 2018; Oh et al. 2016; Valentin-Vega et al. 2016), but strengthened here using T>C transitions defined by PAR-CLIP. Interestingly, histone mRNAs increased translation upon DDX3 knockdown (Figure 1D), and their translation is dependent on mRNA binding to the rRNA h16 helix (Martin et al. 2016). Our data suggest the possibility that there is competition for h16 between DDX3 and histone mRNAs, and that their translation increases upon DDX3 knockdown due to increased accessibility to h16. However, histone mRNAs contain highly repetitive sequences and lack a poly-A tail, perhaps requiring a more tailored approach to precisely investigate the mechanisms of their regulation. The set of mRNAs that require h16 for their translation will be an interesting direction to pursue in the future.
Although DDX3 depletion primarily affected translation, some genes exhibited mild changes in steady-state RNA abundance. RNA-level changes could be mediated by indirect effects of a DDX3-dependent translation target or reflect additional mechanisms by which DDX3 regulates gene expression. For instance, the up-regulation of mRNAs encoding secretory proteins (CHGA, CHGB, SCG3) or of highly abundant RNAs (Figure 1D, S1C) might hint at possible roles of DDX3 in ER-mediated RNA processes or as a more general regulator of cytoplasmic RNP complexes. Indeed, DDX3 is an important factor in stress-granule complexes (Shih et al. 2012). Alternatively, RNA-level changes could reflect differences in co-translational RNA decay pathways. Potential interactions between DDX3 regulation of both ribosome occupancy and RNA levels will be further explored in future studies.
DDX3 is an abundant protein, with approximately 1.4 million copies per HeLa cell (Kulak et al. 2014), or about half the abundance of ribosomes (Duncan and Hershey 1983). We have interpreted data in this work by hypothesizing that DDX3 is functioning in cis by binding to the 40S ribosome and facilitating translation initiation on the associated mRNA. It is also possible that DDX3, alone or in combination with other abundant DEAD-box proteins like eIF4A, functions in trans by activating an mRNA prior to 43S complex loading and then binds to the 43S complex. Future work defining the binding site of DDX3 on the ribosome could enable separation of cis and trans functions to test these two models, although we note that the functional consequences of DDX3 depletion we have observed here are independent of its functioning in cis or trans.
We find that 249 mRNAs change in translation level upon DDX3 depletion following joint replicate analysis (Figure 1C), although DDX3 binds to over 9,000 mRNAs (Figure 4B, S4A). Therefore, not all mRNAs bound by DDX3 are strongly affected upon DDX3 depletion. This is markedly different from the observed requirement for Ded1 in translation of most yeast mRNAs (Guenther et al. 2018; Sen et al. 2015; Chuang et al. 1997; de la Cruz et al. 1997). mRNAs that require DDX3 for translation have a molecular signature involving increased 5’ UTR secondary structure (Figure 5). Genes that produce mRNAs that depend on DDX3 for translation include potent regulators of cellular state, such as disheveled 1 and 2 (DVL1 and DVL2), many transcription factors (e.g. E2F1, E2F4, TCF3, HES6, and ELK1), and transforming growth factor beta 1 (TGFB1; Table S1). However, the set of regulated mRNAs will change between cell types and states depending on the set of expressed transcripts that contain this molecular signature. It is therefore challenging to establish directness for phenotypic changes that result from DDX3 knockdown, since DDX3 could be directly mediating the phenotype or indirectly mediating the phenotype through translational deregulation of a phenotypic effector.
DDX3 is altered in numerous human diseases, including cancers and developmental disorders (Sharma and Jankowsky 2014). Some diseases are characterized by missense variants (Jones et al. 2012; Kool et al. 2014; Pugh et al. 2012; Robinson et al. 2012), while others involve predominantly nonsense or frameshift variants (Dufva et al. 2018; Jiang et al. 2015), and still others present with a mixture of variant types (Deciphering Developmental Disorders 2015; Lennox et al. 2018; Wang et al. 2018). Our work here suggests that haploinsufficient variants in DDX3 that deplete protein levels may result in different translation changes than missense variants that yield dominant negative proteins. We attempted to directly compare translational changes upon DDX3 depletion identified in this work with those previously reported upon expression of mutant DDX3, but stopped because of confounding variability in biological samples, library preparation, and sequencing protocols. An intriguing direction for the future is to determine in an isogenic background how different mutation types in DDX3 affect gene expression, the underlying molecular mechanisms, and potential therapeutic interventions.
Data access
Sequencing data can be retrieved using GEO accession number GSE125114. Source code for all analysis programs can be found here: https://github.com/lcalviell/DDX3_manuscript_analysis.
Author contributions
Conceptualization, L.C., E.W., W.F., M.L., and S.N.F.; Investigation, L.C., K.R., S.V., and E.W.; Writing – Original Draft, S.N.F.; Writing – Review & Editing, S.N.F., S.V., L.C., E.W., and M.L.; Resources, B.T., M.T., J.K., and W.F.; Funding Acquisition, M.L. and S.N.F.; Supervision, W.F., M.L., and S.N.F.
Disclosure declaration
The authors declare no conflicts of interest.
Acknowledgments
We thank members of the Floor lab for feedback on the manuscript and Megan Moore and Kevan Shokat for advice and the kind gift of AMP-acrylate. Computation was supported by the UCSF Wynton computing infrastructure. This work was supported by the UCSF Program for Breakthrough Biomedical Research, funded in part by the Sandler Foundation (to SNF); the California Tobacco-Related Disease Research Grants Program 27KT-0003 (to SNF); the National Institutes of Health DP2GM132932 (to SNF); and DFG Priority Project SPP1935 (to KR and ML).