A ubiquitous GC content signature underlies multimodal mRNA regulation by DDX3X

The road from transcription to protein synthesis is paved with many obstacles, allowing for several modes of post-transcriptional regulation of gene expression. A fundamental player in mRNA biology is DDX3X, an RNA binding protein that canonically regulates mRNA translation. By monitoring dynamics of mRNA abundance and translation following DDX3X depletion, we observe stabilization of translationally suppressed mRNAs. We use interpretable statistical learning models to uncover GC content in the coding sequence as the major feature underlying RNA stabilization. This result corroborates GC content-related mRNA regulation detectable in other studies, including hundreds of ENCODE datasets and recent work focusing on mRNA dynamics in the cell cycle. We provide further evidence for mRNA stabilization by detailed analysis of RNA-seq profiles in hundreds of samples, including a Ddx3x conditional knockout mouse model exhibiting cell cycle and neurogenesis defects. Our study identifies a ubiquitous feature underlying mRNA regulation and highlights the importance of quantifying multiple steps of the gene expression cascade, where RNA abundance and protein production are often uncoupled.


Introduction
The cytoplasmic fate of RNA molecules is impacted by their subcellular localization, RNA binding partners, and engagement with the ribosomal machinery. These aspects are strongly interconnected 1 , which poses a great challenge, as it increases the number of variables and experimental approaches needed to answer many questions in mRNA biology. To this end, many protocols couple biochemical isolation, or metabolic labeling, of RNA with high-throughput sequencing technologies, thus providing a sensitive snapshot of the transcriptome at specific stages of the mRNA life cycle. For example, high-throughput sequencing protocols coupled to ribosome isolation, as in Ribo-seq 2 , metabolic labeling strategies, as in SLAM-seq 3 , immunoprecipitation of RNA binding proteins (RBPs), as in CLIP-seq 4 , and many others have shed light on many regulatory mechanisms pertaining to different aspects of post-transcriptional gene regulation.
DDX3X is a multifunctional RNA helicase that is highly expressed in many tissues and able to unwind structured RNA to influence cytoplasmic post-transcriptional gene regulation 5 .
Together with its ability to bind initiating ribosomes, DDX3X has been often described as a translation regulator, specifically promoting translation of RNA with structured 5'UTRs 6,7 .
However, as mentioned above, cytoplasmic processes like translation or mRNA decay are intertwined, and connections between the two processes encompass different molecular mechanisms, such as mRNA surveillance mechanisms like nonsense-mediated decay (NMD) 8 , ribosome-collision dependent mRNA cleavage 9 , and others. In order to understand when and how such processes are coupled, it is important to study the dynamics of such mechanisms.
For instance, it has been proposed that miRNA can first trigger translation suppression and then mRNA deadenylation and decapping leading to RNA degradation 10 .
Mutations in DDX3X are associated with a variety of human diseases including cancers and developmental delay 11 . Variant types are disease selective in DDX3X, with cancers ranging from primarily loss-of-function alleles in NK-TCL and other blood cancers to nearly exclusively missense variants in medulloblastoma 11 . In DDX3X syndrome, missense variants are phenotypically more severe than loss-of-function. Previously, we used functional genomics approaches to identify mechanistic differences between depletion of DDX3X and expression of missense variants 7 . We found that DDX3X missense variants predominantly affect ribosome occupancy while DDX3X depletion impacts both ribosome occupancy and RNA levels. However, it is unclear whether the changes in RNA levels constituted a cellular response to translation suppression, often described as "buffering" 12 . mRNA regulation has been linked to neurogenesis during development, where multiple RNA binding factors, including DDX3X, ensure correct protein synthesis as cells transition between different fates and states 13 . To that end, it is important to think about the dynamics of gene expression, as complex dynamics of cell proliferation and differentiation ensure correct developmental patterning.
To make such complex interplay among the many factors that shape gene expression accessible, large-scale consortia have provided a great resource for investigations into gene regulation. While historically devoted to promoting investigation into transcriptional regulation, recent efforts have started to provide precious information on post-transcriptional mechanisms, with hundreds of RBPs profiled in terms of both binding and function, by means of CLIP-seq and knockdown followed by RNA-seq 14 . As many molecular processes in biology are interconnected, large-scale datasets and data amenable to re-analysis are at the very heart of many research efforts 15 .
Here, we identify how inactivation of DDX3X evolves over time to lead to acute and long-term changes to post-transcriptional gene regulation. We apply different analytical approaches to newly generated experimental data and to many previously published studies related to mRNA regulation, to show that GC content is associated with mRNA stability changes following DDX3X depletion. Our analyses indicate that this effect is widespread and is associated with cell cycle changes in mRNA regulation, including RNA stability. This further reinforces roles for DDX3X in RNA stability in addition to translation. Together, our work represents a significant advancement in the understanding of a fundamental regulator, which sits at the very heart of the gene expression cascade.

Time-resolved gene expression regulation by DDX3X.
To characterize the dynamics of DDX3X-dependent changes in the gene expression cascade, we employed a previously validated auxin-degron system to efficiently deplete DDX3X protein in the HCT116 colorectal cancer cell line 16 , where we found near-complete rescue of gene expression changes by DDX3X expression, making this tool suitable to monitor DDX3X-dependent changes. We profiled RNA levels and translation using RNA-seq and Ribo-seq along a time-course of DDX3X depletion, at 4, 8, 16, 24 and 48 hours after auxin or DMSO control treatment (Figure 1A). Efficiency of DDX3X depletion, together with quality control and general statistics of the generated libraries, can be found in Supplementary Figure 1 and Supplementary Table 1. As expected, the number of differentially expressed genes increased along the time-course, with most changes supporting the role of DDX3X as a positive regulator of translation (Figure 1B). Changes in translation were negatively correlated with changes in mRNA levels, which together contributed to many changes in Translation Efficiency (TE), calculated using Ribo-seq changes given RNA-seq changes (Methods). At a closer look, we observed that "TE_down" mRNAs undergo translation suppression at the early time points after DDX3X depletion, with their mRNA levels increasing at the later time points (Figure 1C). The opposite behavior is observed for "TE_up" mRNAs, which exhibit higher ribosome occupancy first, and lower mRNA levels later. Such behavior was more evident when showing time-point specific changes and binning mRNAs in a 2D grid on the Ribo-seq/RNA-seq coordinate plane (Figure 1D, Methods), which highlighted a common regulatory mode, with early translation regulation followed by changes in mRNA levels.
This analysis shows the time-resolved dynamics of mRNA regulation by DDX3X, with hundreds of mRNAs changing in their steady-state levels albeit showing the opposite directionality in translation rates.

Translation suppression by DDX3X is coupled with mRNA stabilization.
Changes to transcript levels can result from changes in transcription rates or post-transcriptional regulation. To identify the relative contribution of different processes to RNA levels, we used our time-course dataset to calculate changes in transcription, processing and stability using INSPEcT 17 . INSPEcT uses the proportion of intronic versus exonic reads to identify nascent vs. mature transcripts, and uses a system of ordinary differential equations (ODEs) to infer rates of RNA synthesis, processing and decay. Compared to non-regulated mRNAs, regulated mRNAs showed modest changes in transcription rates, suggesting transcription changes are not the major contributor to RNA level changes following DDX3X depletion. In contrast, we found more pronounced changes in mRNA stability, as evidenced by TE_down transcripts (Figure 2A). As our initial RNA-seq protocol was not designed to capture pre-mRNA molecules, we validated our estimated mRNA stability changes by employing the 4sU metabolic labeling SLAM-seq protocol 3 in our degron system after 8 hours of DDX3X depletion, so as to detect changes in mRNA stability at early time points. Briefly, cells were incubated with 4sU to comprehensively label transcribed RNAs, and their abundance was followed after 8h of DDX3X degron activation, using DMSO as control. 4sU treatment induces T>C conversions in the sequenced cDNA molecules, which can be used to monitor mRNA stability changes after a uridine chase, as shown in Figure 2B. As expected, we observed a drastic drop in T>C harboring reads after the chase, which reflects mRNA decay rates (Supplementary Figure 2). As shown in Figure 2B, after a labeling time of 24 hours, the percentage of reads harboring T>C mutations was different for the regulated categories (Methods) after only 8 hours of degron induction, confirming the stabilization of translationally suppressed mRNAs upon DDX3X depletion. While the modest depth and resolution of our SLAM-seq dataset (Supplementary Figure 2) did not allow for more detailed insights into mRNA changes, it represented an important validation of mRNA stability regulation by DDX3X. In addition, we profiled RNA abundance via qPCR, combining our DDX3X degron system with ActD treatment, to measure RNA stability changes. We selected a few target genes: JUND was identified in our data as a stabilized RNA, while EIF2A was identified as degraded. RACK1, LGALS1, and PFN1 were used as controls for normalization in RT-PCR with TaqMan probes.
JUND RNA was stabilized after 24 hours of DDX3X knockdown and Actinomycin D (ActD) treatment (Supplementary Figure 3A); EIF2A RNA was more degraded after 24 hours of DDX3X knockdown and ActD treatment (Supplementary Figure 3B). These results show an overall good agreement between the qPCR and the sequencing-based assays, despite the difficulty arising from choosing control genes and the modest fold changes observed in the sequencing data.
By profiling ribosome occupancy, steady state transcript levels, and mRNA decay, this analysis shows that DDX3X depletion triggers multiple modes of post-transcriptional regulation, involving translation suppression and a subsequent wave of mRNA stabilization.

GC-rich coding sequences underlie mRNA regulation by DDX3X.
With hundreds of mRNAs post-transcriptionally regulated after DDX3X depletion, we aimed to identify specific features belonging to up- or downregulated targets. We therefore built regression models to quantitatively predict levels of TE changes (Methods, Supplementary Table 2). We used different biophysical properties of genes and mRNAs (e.g. length and GC content) and several gene and transcript features (e.g. introns, 3'UTR, etc.; Methods) as features for a Random Forest regression model. Given the extensive literature on codon-mediated mRNA stability regulation, we added codon frequencies and previously validated codon optimality calculations 18 . Also, we added measured GC content at the 1st, 2nd or 3rd codon position, as it was recently shown to potentially play a role in mRNA stability regulation 19,20 .
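The GC-related features mentioned above can be illustrated with a short Python sketch. It is only a toy under stated assumptions: the function names `gc_fraction` and `gc_by_codon_position` are hypothetical, the input is assumed to be an in-frame coding sequence given as a plain string, and this is not the feature extraction code used in the study.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a nucleotide sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_by_codon_position(cds):
    """GC fraction at the 1st, 2nd and 3rd codon positions of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon, if any
    return tuple(gc_fraction(cds[offset:usable:3]) for offset in range(3))


# Toy coding sequence (hypothetical): ATG GCC GCG TAA
toy_cds = "ATGGCCGCGTAA"
print(gc_fraction(toy_cds))           # overall GC content of the CDS
print(gc_by_codon_position(toy_cds))  # GC content per codon position
```

Computing the three codon positions separately keeps position-specific effects distinguishable from the overall GC content of the coding sequence when both are offered to a regression model.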
In addition, to pinpoint features predictive of mRNA stability changes rather than translation changes exclusively, we divided transcripts according to their position in the Ribo-seq/RNA-seq coordinate system, to capture mRNAs where changes between assays agreed or not (Figure 3A, Methods). Interestingly, the categories differed in their DDX3X binding pattern (Supplementary Figure 4): re-analysis of our previously published PAR-CLIP data showed that stabilized targets (x, -xy groups) have a lower T>C conversion signal in their 5'UTRs, and a higher signal in CDS peaks, with the opposite being true for true translation targets (y group). This analysis suggests that stabilized mRNAs might be regulated differently than "canonical" translationally suppressed targets.
As shown in Figure 3B, the Random Forest model predicted TE changes with high precision, especially in cases where mRNA stability and translation were anti-correlated (-xy group). In addition, this model calculated the predictive power of each input feature (Figure 3C, Methods), which highlighted GC content in the coding sequence (which we will refer to as GCcds) as the most important feature. Feature selection is particularly important when facing high levels of multicollinearity (Supplementary Figure 5). To validate the results from the Random Forest regression, we used Lasso regression (Methods), another well-known method for feature selection. Results from the Lasso regression were similar, and also identified GC content in the coding sequence as the most relevant feature in predicting TE changes (Supplementary Figure 6). GC content in the CDS remained the top predictor when using additional features, such as GC content in different sections of the CDS, or amino acid frequencies (Supplementary Figure 7).
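To make the idea of Lasso-based feature selection concrete, the following Python sketch implements a minimal cyclic coordinate descent solver on toy data. It is an illustration under assumptions, not the pipeline used in the study: the function names, the toy feature matrix, and the penalty value `alpha=0.5` are all hypothetical, columns are assumed to be centred, and no intercept is fitted.

```python
def soft_threshold(value, penalty):
    """Shrink `value` toward zero by `penalty`; return 0.0 inside the dead zone."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/2n) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    X is a list of rows; columns are assumed centred and no intercept is fitted.
    """
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    col_scale = [sum(row[j] ** 2 for row in X) / n_samples for j in range(n_features)]
    for _ in range(n_iter):
        for j in range(n_features):
            if col_scale[j] == 0.0:
                continue  # a constant-zero column carries no information
            rho = 0.0
            for row, target in zip(X, y):
                # prediction from all features except j
                partial = sum(row[k] * weights[k] for k in range(n_features) if k != j)
                rho += row[j] * (target - partial)
            weights[j] = soft_threshold(rho / n_samples, alpha) / col_scale[j]
    return weights


# Toy data (hypothetical): column 0 stands in for GCcds, column 1 for a weak feature.
X = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
y = [-2.1, -1.9, 1.9, 2.1]  # y = 2 * x0 + 0.1 * x1
weights = lasso_coordinate_descent(X, y, alpha=0.5)
print(weights)  # the weak feature is shrunk to exactly zero
```

The L1 penalty sets the coefficient of the weakly informative column to exactly zero while retaining a shrunken coefficient for the informative one, which is the property that makes the Lasso usable as a feature selector alongside Random Forest importance scores.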
In the light of these results, we tested whether GCcds was associated with the DDX3X-dependent transcriptome dynamics reported above. As shown in Figure 3D, mRNAs partitioned on the Ribo-seq/RNA-seq coordinate system based on their GCcds value.
Moreover, stability values from both INSPEcT and SLAM-seq partitioned according to GCcds values (Figure 3E-F). A similar, albeit weaker, separation was observed for predicted transcription and processing rates (Supplementary Figure 8). By using multiple analytical approaches, we here show that GCcds, not just GC content in general, or in other sections of the transcriptome, is a predominant feature of stabilized, yet untranslated, mRNAs following DDX3X depletion.
GC content in the coding sequence is a ubiquitous signal in mRNA regulation.
Given the extensive connections between different aspects of mRNA regulation by thousands of regulators, we tested the breadth of the influence of features such as GCcds in other studies of RNA regulators. We re-analyzed >2000 RNA-seq samples (Methods) from the recent ENCODE RBPome 14 study encompassing >200 RBP knockdowns, and performed differential analysis followed by predictive modeling using the same methods and features as described in the previous section, this time aiming at predicting changes in mRNA levels (Figure 4A). We first grouped datasets according to knockdown efficiency, which varied according to knockdown method and cell line (Supplementary Figure 9, Methods). We selected the sample with the highest knockdown efficiency for each RBP and called feature importance using our analytical pipeline. Predictive power of our Random Forest regression strategy varied across different datasets (Figure 4B). Once again, the strongest predictor of mRNA changes was GCcds, whose predictive power dominated over other variables (Figure 4C, Supplementary Figure 10). As expected, changes upon DDX3X knockdown in the ENCODE data also exhibited a clear dependency on GCcds (Figure 4D), albeit to a lower degree compared to our degron dataset, likely due to differences in DDX3X depletion strategies and, importantly, to our translation profiling dataset, which allowed us to distinguish between specific classes (i.e. "TE_down") of regulated mRNAs (Discussion).
Given the widespread relevance of GCcds as a predictor of post-transcriptionally regulated targets, we reasoned that a major cellular process might mediate the observed mRNA changes. We re-analyzed data from a recent study 21 focused on mRNA clearance during cell cycle re-entry, where the authors used a FUCCI (fluorescent, ubiquitination-based cell-cycle indicators) cell system coupling RNA labeling, scRNA-seq and single-molecule imaging techniques to find extensive decay differences among different transcripts, potentially related to poly-A tail mediated regulation. Despite a lower throughput when compared to sequencing-based experiments, kinetic parameters estimated from their data (exemplified in the decay curve in Figure 4E) showed significant differences when partitioned by GCcds values (Figure 4E). mRNAs rich in GCcds showed lower half-life values, and fast decay kinetics at cell cycle re-entry, with the opposite trend exhibited by mRNAs poor in GC content in their coding sequence. Motivated by this finding, we decided to investigate differences in cell cycle dynamics in our degron system, by using 5-ethynyl-2'-deoxyuridine (EdU) incorporation followed by FACS analysis (Methods, Supplementary Figure 11). As shown in Figure 4F and Supplementary Figure 12, DDX3X depletion resulted in cells staying more in G1 and less in S phase when compared to controls, throughout the time course.
By re-analysis of thousands of RNA-seq samples, these results show the prevalence of GCcds in post-transcriptional regulation and RBP functions, with a potential role for cell-cycle dependent mRNA dynamics in shaping such a regulatory phenomenon.
A shift in 5'-3' RNA-coverage as a hallmark of mRNA stabilization.
In addition to gene-level aggregate measures of abundance, we investigated whether changes in decay could be identified by taking advantage of the high resolution of RNA-seq experiments across gene bodies, which has previously been employed to inform about mRNA decay 19 . We leveraged our time-resolved degron dataset to investigate changes in 5'-3' coverage, a known hallmark of RNA degradation often employed to verify overall integrity of cellular mRNAs or to estimate transcript-level decay. We calculated 2 different metrics, using the strategy illustrated in Figure 5.
Initially, we pooled all samples to identify the major isoform for each gene (Methods), and the first position at 15% of the maximum coverage. We then calculated such position for each time point. Importantly, coverage values were normalized for each transcript, thus controlling for expression level changes. Also, we did not observe similar changes at the 3' end of transcripts (Supplementary Figure 13). We then used coverage starting points as input for linear regression. The regression coefficient was extracted and compared across the top 250 stabilized, degraded, and control mRNAs, alongside 1500 control transcripts. As shown in Figure 5, coverage values on stabilized mRNAs started at an earlier position in the transcripts, with moderate albeit significant differences between categories, indicating a lower 5'-3' decay along the DDX3X degron time course. The opposite trend was observed for degraded transcripts. Similarly, we calculated average coverage values in a window of 300nt around the coverage start and applied a similar strategy: 5' coverage values increased along the time course, confirming the accumulation of translationally suppressed mRNA species otherwise destined for degradation. Results were similar when using different cutoffs for the definition of coverage starting point (Supplementary Figure 14).
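The coverage starting point and its trend over time, as described above, can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the helper names are hypothetical, the per-transcript coverage is assumed to be a list of depths along the major isoform, the threshold is interpreted as "first position reaching at least 15% of the maximum", and the toy profiles and time points are invented for the example.

```python
def coverage_start(coverage, fraction=0.15):
    """Index of the first position whose depth reaches `fraction` of the maximum."""
    peak = max(coverage)
    if peak <= 0:
        return None  # no coverage at all on this transcript
    threshold = fraction * peak
    for position, depth in enumerate(coverage):
        if depth >= threshold:
            return position
    return None


def regression_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Toy profiles (hypothetical) for one transcript at three time points, in hours.
hours = [8, 16, 24]
profiles = [
    [0, 0, 1, 5, 10, 10],
    [0, 1, 2, 8, 10, 10],
    [0, 2, 6, 9, 10, 10],
]
starts = [coverage_start(profile) for profile in profiles]
slope = regression_slope(hours, starts)
print(starts, slope)  # a negative slope means coverage begins closer to the 5' end over time
```

Because the threshold is relative to each profile's own maximum, the starting point is insensitive to overall expression changes, which mirrors the per-transcript normalization described in the text.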
To test whether the suppression of 5'-3' decay of untranslated transcripts by DDX3X occurs in vivo, we re-analyzed a recent RNA-seq/Ribo-seq dataset in a conditional Ddx3x knockout (cKO) mouse model 13 (Figure 6), where cell cycle and neurogenesis defects are evident when Ddx3x is depleted in neuronal progenitors. After applying our analytical pipeline, we observed that the accumulation of untranslated transcripts is even more evident in this in vivo model, as is its relationship with GCcds values (Figure 6A). Analogous to the strategy presented in Figure 5, 5' coverage values, as well as coverage starting points (Supplementary Figure 15), differed significantly between wild type and Ddx3x cKO animals (Figure 6B) in regulated transcripts, while no difference was detected at the 3' end (Supplementary Figure 16).
Leveraging again the power of hundreds of RNA-seq experiments, we examined 5' coverage profiles in the ENCODE dataset, partitioning experiments by their dependency on GCcds values. Differences between stabilized and control mRNAs are greater as the GCcds signature is more predominant (Figure 6C). Aggregating different experiments according to their GCcds dependency for example transcripts (Figure 6D) confirms this phenomenon, where both coverage starting position and coverage values changed across different datasets, indicative of mRNA decay regulation.
Taken together, we provide evidence for in vivo DDX3X-mediated stabilization of untranslated transcripts and its dependence on GCcds values. In support of the different analyses reported in this study (Figure 7), we also provide a high-resolution RNA-seq coverage analysis strategy to investigate GCcds-related mRNA decay regulation, with support from hundreds of post-transcriptionally perturbed transcriptomes.

Discussion
The multifaceted role of DDX3X, described as involved in different molecular processes, often hinders the ability to understand its functions, especially considering the interconnected nature of molecular processes in the cell. Multiple mRNA features might underlie different modes of regulation, as shown by our previous identification and experimental validation of 5'UTR dependencies underlying DDX3X translation regulation 7 . This outlines an unmet need for studies linking multiple aspects of the gene expression cascade.
In addition to profiling RNA levels and translation, we further dissected dynamics of cytoplasmic regulation by DDX3X, by employing a time course of efficient DDX3X depletion (Figure 1A). Akin to previous studies observing translation suppression preceding mRNA changes during miRNA-mediated regulation 10 , we observed an accumulation of translationally suppressed RNAs. This highlights the importance of profiling not only mRNA abundance but also translation levels, which, in the absence of quantitative estimates of regulated protein levels, can greatly help researchers understand the functions of many cryptic regulators often involved in multiple processes, like DDX3X and other RBPs 22 . Despite relatively fast kinetics of DDX3X degradation from our degron system, more work needs to be performed to pinpoint exactly what changes occur right after DDX3X depletion, and to more precisely quantify the lag between translation suppression and mRNA stabilization.
By employing multiple techniques for feature selection, we identified a major feature underlying mRNA regulation by DDX3X, as well as by many other post-transcriptional regulators. An important area of investigation for the future is to employ more unbiased approaches in mRNA biology, akin to recent Natural Language Processing-inspired methods in transcription regulation 23 , to accurately estimate the relevant features directly from data rather than specifying them through potentially biased approaches. In our hands, the relevance of GCcds is clearly picked up by both the Random Forest and the Lasso (Supplementary Figure 4). Importantly, we included similar features, such as overall GC content 24 , GC content in UTRs, introns etc., alongside codon frequencies 20 and previously estimated values of codon optimality.
Our study suggests that data-driven approaches to functional transcriptomics are highly needed, where data from multiple experiments are routinely re-analyzed to test hypotheses and provide new insights into the complex world of mRNA biology. However, while profiling translation allowed us to focus on specific mRNA classes and their features, no large-scale translation profiling study exists yet, with few, precious small atlases recently appearing in the literature 25 . The current ENCODE RBP series is of great value to many mRNA biology researchers worldwide and it has been an invaluable resource for many recent studies 26,27 , yet an extension of these approaches that includes other aspects of post-transcriptional regulation, such as translation and stability, is greatly needed.
In the original ENCODE RBP study 14 , gene expression estimates were GC-corrected for each sample, as GC content has been often reported as a confounder, especially when comparing across sequencing technologies and labs. Given the presence of GC-related biases in sequencing-based assays, we think that great caution must be taken when observing expression changes driven by GC content features, especially when interpreted as direct effects from single molecular factors. Our degron time course analysis, despite containing dozens of features pertaining to GC content measures, detected GC content specifically in the coding sequence as a feature underlying regulation, and this region-specific effect is not consistent with a general confounding role for GCcds. Moreover, our analysis focused on differences upon a perturbation under a single sequencing platform and laboratory settings, which are likely to have similar GC-related confounders, should there be any. Important confirmation of the relevance of GCcds and its relationship to mRNA dynamics also came from: employing SLAM-seq to estimate differences in stability (Figure 2), qPCR validations (Supplementary Figure 3), re-analysis of in vivo Ddx3x cKO RNA-seq/Ribo-seq (Figure 6), re-analysis of hundreds of RBP perturbations in human cell lines (Figure 4), and analysis of kinetics extracted from transcriptome dynamics in cell-cycle specific states (Figure 4).
Together with well-established differential analysis statistical methods, which allowed us to robustly identify different classes of regulated mRNAs, we exploited the high resolution offered by RNA-seq to analyze differences in 5' end coverage for thousands of individual transcripts (Figure 5), as an additional metric reflecting active regulation of mRNA decay mechanisms. We posit that popular analysis strategies for -omics techniques, despite their popularity over more than a decade, often obscure information with regard to mRNA processing and other molecular mechanisms, which can be uncovered by dedicated computational methods. Importantly, such dynamics are invisible (or, worse, can significantly distort quantification estimates) when performing gene-level analyses.
The mechanism, or mechanisms, by which GC content in coding regions shapes mRNA dynamics is still to be determined. We speculate that complex RNA structures in the coding sequence can form in the absence of active translation elongation, and such structures may mediate degradation, helped by RNP complexes in the cytoplasm. However, recent literature focused on the role of different codons in mediating such an effect 18 . In our hands, codon-mediated effects seem to be negligible when considering the overall GCcds values, but more work needs to be done to identify cases where one or the other, or a mix of the two, can mediate mRNA decay on different transcripts. The involvement of mRNA dynamics during the cell cycle (Figure 4) suggests a model where, during cell cycle-dependent translation suppression, mRNAs are able to form structures in the coding sequence that promote decay, and, when such processes are misregulated (e.g., by depleting multifunctional RNA helicases such as DDX3X), this process is less efficient. The extent to which cell cycle changes might depend on direct DDX3X binding and regulation remains to be elucidated. Further work needs to be done to refine the exact function, together with the subcellular localization, of regulated mRNAs. For instance, mRNA retention in the nucleus might be an additional underappreciated mode of gene expression control 28 , and is in line with our observation about the untranslated status of regulated transcripts. However, we identified GC content in the coding sequence as the hallmark feature for stabilized transcripts, a feature which is defined by translation in the cytoplasm.
While RBP binding data remain an important starting point from which we can build testable hypotheses, simple binding-to-function paradigms might also create bias when trying to explain complex phenotypes arising from RBP misfunction. Moreover, we observed that binding patterns might differ between different regulated classes (Supplementary Figure 4). In our previous study we investigated the changes in translation and RNA abundance using a DDX3X helicase mutant; one of the observations we made pertained to the lack of RNA changes in our data, suggesting a potential function for the helicase activity in orchestrating such changes.
Previous work implicated DDX3X in mediating cell cycle dynamics by a variety of mechanisms 29 , including a direct regulation of cyclin E1 translation 30 , which however was not among the most regulated mRNAs in our dataset (Supplementary Table 2). More work needs to be done to accurately quantify mRNA dynamics and RBP functions in the cell cycle, where translation regulation mechanisms 31,32 ensure controlled rates of protein synthesis. The connection between cell cycle, sequence content and mRNA regulation is reinforced by the in vivo data, adding to the importance of studying post-transcriptional regulation along the neurogenesis axis 33,34 , where the equilibrium between proliferation, apoptosis and differentiation 35 shapes the complexity of the developing brain.

Methods
Ribo-seq and RNA-seq experimental protocol
HCT116 cells with inducible degradation of DDX3X (as previously described 16 ) were plated in 15 cm plates at 20% confluency (~3.5 × 10^6 cells/plate). 48 hours post plating, when the cells were at ~70% confluency, the media was changed and fresh media with 500 µM IAA (Indole-3-acetic acid, the most common naturally occurring auxin hormone) (Research Products International, cat: I54000-5.0) or DMSO was added to cells. Cells were harvested at 0, 4, 8, 16, 24, and 48 hours post IAA addition. Cell number did not appreciably increase over the 48 hours of the experiment. To quantify DDX3X protein, we used an anti-DDX3X antibody described in previous work 7 normalized to an anti-GAPDH antibody (Rockland Immunochemicals, cat: 600-401-A33S).
Illustra Microspin Columns S-400 HR (GE healthcare) were used to enrich for monosomes, and RNA was extracted from the flow-through using Direct-zol kit (Zymo Research). Gel slices of nucleic acids between 24-32 nts long were excised from a 15% urea-PAGE gel. Eluted RNA was treated with T4 PNK and preadenylated linker was ligated to the 3' end using T4 RNA Ligase 2 truncated KQ (NEB, M0373L).
Linker-ligated footprints were reverse transcribed using Superscript III (Invitrogen) and gel-purified RT products circularized using CircLigase II (Lucigen, CL4115K). rRNA depletion was performed using biotinylated oligos as described 36 and libraries constructed using a different reverse indexing primer for each sample.
For the RNA-seq, RNA was extracted from 25 µl intact lysate (non-digested) using the Direct-zol kit (Zymo Research) and stranded total RNA libraries were prepared using the TruSeq Stranded Total RNA Human/Mouse/Rat kit (Illumina), following manufacturer's instructions.
Libraries were quantified and checked for quality using a Qubit fluorimeter and Bioanalyzer (Agilent) and sequenced on a HiSeq 4000 sequencing system.

SLAM-seq experimental protocol
SLAM-seq was performed at 60-70% confluency for DDX3X-mAID tagged HCT116 cells. Media was changed and fresh media with 100 µM 4-thiouridine (4sU) was added to cells and changed every 3 hours for 24 hours. 8 hours prior to collection, growth medium was aspirated and replaced. For the uridine chase, cells were washed twice with 1X PBS and incubated with media containing 10 mM uridine and DMSO or 100 µM IAA for 0 or 8 hours to induce degradation of DDX3X. At the respective time points, cells were harvested, followed by total RNA extraction using TRIzol (Ambion) following the manufacturer's instructions (SLAMseq Kinetics Kit - Catabolic Kinetics Module, Lexogen). Total RNA was alkylated by iodoacetamide for 15 min and RNA was purified by ethanol precipitation. 200 ng alkylated RNA were used as input for generating 3'-end mRNA sequencing libraries using a commercially available kit (QuantSeq 3ʹ mRNA-Seq Library Prep Kit FWD for Illumina, Lexogen).

SLAM-seq data analysis
Reads were mapped to the genome and transcriptome using the same parameters as for RNA-seq, except for --outFilterMismatchNmax 10. Reads containing T>C mutations were extracted from the BAM file using the GenomicAlignments and GenomicFiles Bioconductor 39 packages.
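The T>C extraction step can be illustrated with a minimal Python sketch. It is not the GenomicAlignments/GenomicFiles workflow used in the study: it assumes each read is already paired with a reference segment of equal length (gap-free alignment, forward strand), and the function names and the `min_conversions` threshold are hypothetical.

```python
def count_tc_conversions(ref_seq, read_seq):
    """Count positions where the reference has T and the read has C."""
    if len(ref_seq) != len(read_seq):
        raise ValueError("gap-free alignment of equal length is assumed")
    pairs = zip(ref_seq.upper(), read_seq.upper())
    return sum(1 for ref_base, read_base in pairs
               if ref_base == "T" and read_base == "C")


def select_labeled_reads(alignments, min_conversions=1):
    """Keep (reference, read) pairs carrying at least `min_conversions` T>C
    mismatches. The threshold is an illustrative assumption."""
    return [(ref, read) for ref, read in alignments
            if count_tc_conversions(ref, read) >= min_conversions]


alignments = [
    ("ATTGCT", "ATTGCT"),  # perfect match
    ("ATTGCT", "ACTGCC"),  # two T>C conversions
    ("ATTGCT", "ATAGCT"),  # a T>A mismatch, not a 4sU signature
]
labeled = select_labeled_reads(alignments)
```

Reads from genes on the reverse strand would show the same conversions as A>G mismatches, which this sketch does not handle.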

RNA-seq data analysis
Reads were mapped to the genome and transcriptome using STAR with the same parameters as for Ribo-seq. Synthesis, processing, and degradation rates were obtained using INSPEcT 17 v1.17 with default settings. Genes with significant changes in their dynamics at a p-value cutoff of 0.05 were used for subsequent analysis.

Differential analysis
Unique counts on different genomic regions were obtained using RiboseQC 40. 5' end coverage values were inspected using Bioconductor 39 packages such as GenomicFeatures 41 and rtracklayer 42. DESeq2 43 was used to quantify RNA-seq, Ribo-seq, and TE regulation, as described previously 7: changes in translation efficiency were calculated using DESeq2 with assay type (RNA-seq or Ribo-seq) as an additional covariate. Translationally regulated genes were defined using an FDR cutoff of 0.05 from a likelihood ratio test against a reduced model without the assay type covariate, i.e., assuming no difference between RNA-seq and Ribo-seq counts.
A similar strategy was used to define significant changes in DDX3X-mediated stability from SLAM-seq: count tables with T>C reads were built and analyzed using labeling (4sU/DMSO) and degron status (8 h vs. DMSO) as the two variables of interest; regulation in stability was defined using a reduced model without the degron status covariate, i.e., assuming no difference between DMSO and degron activation.
Translationally regulated genes (as defined by Ribo-seq/RNA-seq) and stability regulated genes (as defined by SLAM-seq) were defined using a p-value cutoff of 0.05.
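The quantity behind the TE analysis can be sketched as a per-gene point estimate: the log2 fold change in Ribo-seq minus the log2 fold change in RNA-seq. The snippet below is only an illustration on already normalized counts. It does not reproduce the DESeq2 model or its likelihood ratio test, and the function name and pseudocount are assumptions.

```python
import math


def log2_te_change(ribo_ctrl, rna_ctrl, ribo_trt, rna_trt, pseudocount=1.0):
    """Log2 change in translation efficiency (Ribo-seq / RNA-seq) between a
    treated and a control condition, from normalized counts.

    The pseudocount (an illustrative choice) avoids division by zero.
    """
    te_ctrl = (ribo_ctrl + pseudocount) / (rna_ctrl + pseudocount)
    te_trt = (ribo_trt + pseudocount) / (rna_trt + pseudocount)
    return math.log2(te_trt / te_ctrl)


# Ribosome occupancy halves while RNA abundance is unchanged:
# translation efficiency drops two-fold.
delta_te = log2_te_change(ribo_ctrl=99, rna_ctrl=99, ribo_trt=49, rna_trt=99)
```

A negative value indicates translational suppression relative to the change in RNA abundance; significance would still have to come from a count-based model such as the one described above.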
For Figures 1D and 3D, the coordinate system was divided into 70 bins on each axis. GCcds values (for Figure 3D), or Ribo-seq and RNA-seq fold changes between each time point and the previous one (for Figure 1D), were averaged across genes in the same bin. Only mRNAs with significant changes in translation efficiency at 48h post degron induction were considered.
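The binning used for these figures can be sketched as follows. This is a minimal illustration that assumes equal-width bins spanning the observed range of each axis (the binning rule is not specified here); `bin_index` and `binned_means` are hypothetical names.

```python
def bin_index(value, lo, hi, n_bins):
    """Map a value in [lo, hi] to an equal-width bin index in [0, n_bins - 1]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # the maximum falls in the last bin


def binned_means(x, y, z, n_bins=70):
    """Average z over all genes that share the same (x bin, y bin) cell."""
    x_lo, x_hi = min(x), max(x)
    y_lo, y_hi = min(y), max(y)
    sums, counts = {}, {}
    for xi, yi, zi in zip(x, y, z):
        cell = (bin_index(xi, x_lo, x_hi, n_bins),
                bin_index(yi, y_lo, y_hi, n_bins))
        sums[cell] = sums.get(cell, 0.0) + zi
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Toy example with 2 bins per axis: x and y stand for the two plot axes,
# z for the per-gene value being averaged (for example GCcds).
cells = binned_means([0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.4, 0.6, 0.5], n_bins=2)
```

Only occupied cells are returned, so empty regions of the coordinate system stay blank when plotted.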

Random Forest and Lasso regression
The Random Forest regression was run using the randomForest 44 package with default parameters. Lasso regression was performed on scaled variables using the glmnet 45 package. While the entire feature table is available in Supplementary Table 2, a short description of the input features follows:
TPM values using RNA-seq (in log scale).
Baseline TE levels, defined as the ratio of Ribo-seq to RNA-seq reads.
Baseline RNA mature levels, defined as the length-normalized ratio of RNA-seq reads in introns versus exons.
GC content, length (in log scale) and Ribo-seq/RNA-seq density in: 5ʹ UTRs, a window of 25nt around start and stop codons, CDS regions, non-coding internal exons, introns, and 3ʹ UTRs.
Codon frequencies.
Measures of gene-specific codon optimality, previously calculated in a recent study 18.
GC content at the first, second, or third codon position.
Feature importance (measured by mean decrease in accuracy for the random forest model and by the lasso coefficients) and correlation between predicted and measured test data were calculated on a 5-fold cross-validation scheme.
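Among the input features described above, the GC content measures are simple to state precisely. The sketch below computes overall GC content of a coding sequence and GC content at each codon position. It assumes an in-frame CDS given as a plain string; the function names are hypothetical and not taken from the original analysis code.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in ("G", "C")) / len(seq)


def gc_by_codon_position(cds):
    """GC fraction at the first, second, and third codon positions."""
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    # cds[offset::3] collects every third base starting at that codon position
    return tuple(gc_fraction(cds[offset::3]) for offset in range(3))


cds = "ATGGCCTAA"  # codons: ATG GCC TAA
gc_cds = gc_fraction(cds)
gc1, gc2, gc3 = gc_by_codon_position(cds)
```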

Analysis of cell cycle-dependent mRNA dynamics
Estimated mRNA decay kinetics at cell cycle re-entry were deposited as supplementary files of the original study 21. Given the low number of quantified genes (total n=220), genes were partitioned into 3 groups according to their GCcds values.
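One way to form such groups is sketched below. It assumes three equal-sized groups (tertiles) of genes ranked by GCcds; equal-width intervals of GCcds would be another reading of the description. The function name and gene identifiers are hypothetical.

```python
def split_into_tertiles(gc_by_gene):
    """Rank genes by GCcds and split them into three equal-sized groups."""
    ranked = sorted(gc_by_gene, key=gc_by_gene.get)
    n = len(ranked)
    cut1, cut2 = n // 3, (2 * n) // 3
    return {
        "low": ranked[:cut1],
        "mid": ranked[cut1:cut2],
        "high": ranked[cut2:],
    }


# Hypothetical GCcds values for six genes
gc_values = {"g1": 0.40, "g2": 0.65, "g3": 0.50,
             "g4": 0.45, "g5": 0.70, "g6": 0.55}
groups = split_into_tertiles(gc_values)
```

Decay estimates can then be compared between the "low", "mid", and "high" groups.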

Cell cycle staging
To measure DNA replication and cell cycle stage, EdU (5-ethynyl-2´-deoxyuridine) was added to cells at 10 nM for 1.5 hrs before harvesting. One confluent well of a 6-well plate of HCT116 cells was harvested and processed as per the manufacturer's instructions for the Click-iT™ Plus EdU Alexa Fluor™ 647 Flow Cytometry Assay Kit (Thermo Fisher cat: C10634). Per the manufacturer's instructions, FxCycle Violet DNA content stain (Thermo Fisher cat: F10347) was added after the Click-iT reaction at 1:1,000 dilution before quantifying on a BD LSR Dual Fortessa flow cytometer. Alexa Fluor 647 was measured in the 670-30 Red C-A Channel and FxCycle Violet Stain was measured in the 450-50 Violet F-A Channel. Analysis was performed using FACS DIVA and FlowJo V10 (FlowJo, LLC) software.

5' end coverage analysis
Computation on single-nucleotide coverage values was performed using rtracklayer 42. For each differential analysis, we extracted the 250 most stabilized and the 250 most degraded genes, ranked by p-value from the RNA-seq differential analysis. 1500 control RNAs were randomly sampled from non-regulated genes, using p-values > 0.2 and TPM values > 3. Coverage values were min/max normalized to the 0-1 range, and the first position with a value > 0.15 was identified as the coverage starting position. In addition, a general coverage starting point was selected by pooling all samples, and a window of 250nt around this position was used to calculate average coverage values around the coverage start. Log2 fold changes with respect to the control condition were then calculated.
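The normalization and start-detection step can be sketched for a single transcript as follows. This is a minimal illustration on a plain list of per-nucleotide coverage values; the function names are hypothetical and positions are 0-based.

```python
def minmax_normalize(coverage):
    """Rescale coverage values to the 0-1 range."""
    lo, hi = min(coverage), max(coverage)
    if hi == lo:
        return [0.0 for _ in coverage]  # flat profile: nothing to rescale
    return [(value - lo) / (hi - lo) for value in coverage]


def coverage_start(coverage, threshold=0.15):
    """Return the first (0-based) position whose normalized coverage exceeds
    the threshold, or None if no position does."""
    for position, value in enumerate(minmax_normalize(coverage)):
        if value > threshold:
            return position
    return None


profile = [0, 0, 1, 2, 10, 20]  # toy per-nucleotide coverage from the 5' end
start = coverage_start(profile)
```

In the toy profile the normalized values are 0, 0, 0.05, 0.1, 0.5, and 1, so the start is called at the fifth nucleotide (index 4).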
For degron data, starting position and log2FC coverage values were extracted and used as input for linear regression. For coverage values, the intercept was omitted, as the first value was 0. Beta coefficients were then extracted and compared between stabilized, degraded, and control mRNAs.
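A regression without intercept has a closed-form slope, sketched below. The predictor is assumed here to be time after degron induction, which the description above does not state explicitly; the function name and the toy values are hypothetical.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = beta * x, with the intercept fixed at 0."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    denominator = sum(xi * xi for xi in x)
    if denominator == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / denominator


# Assumed layout: time points (hours) and log2FC coverage relative to control,
# where the first value is 0 by construction.
hours = [0, 4, 8]
log2fc = [0.0, 2.0, 4.0]
beta = slope_through_origin(hours, log2fc)
```

One beta per mRNA would then be compared across the stabilized, degraded, and control groups.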
For mouse Ddx3x cKO and ENCODE data, differences between starting position (knockdown vs wt) and log2FC (knockdown vs wt) in coverage values were used to compare stabilized, degraded and control mRNAs, bypassing the regression step (2 values were calculated, as only wt or knockdown conditions were present).

TaqMan RT-PCR
DDX3X-mAID tagged HCT116 cells were plated in 6-well plates at 30-40% confluency. 24 hours post plating, 500 μM IAA or DMSO was added to cells with or without 200 nM Actinomycin D (ActD) for the respective conditions. Total RNA was extracted from cells at 60-70% confluency using the Direct-zol kit (Zymo Research) at 0 and 24 hours post-ActD and IAA or DMSO treatment.
TaqMan probes for JUND, EIF2A, RACK1, LGALS1, and PFN1 were predesigned and purchased from ThermoFisher Scientific. Probes for the Ribo-seq degraded (EIF2A) or stabilized (JUND) genes were conjugated with FAM dye, while probes for the control genes RACK1, LGALS1, and PFN1 were conjugated with VIC dye. TaqMan real-time quantitative PCR amplification reactions were run on an Applied Biosystems QuantStudio 6 Real-Time PCR System. Real-time PCR was conducted using TaqMan Fast Virus 1-Step Master Mix from Applied Biosystems in 384-well plates, following the manufacturer's protocol. Each well contained either the Ribo-seq degraded gene (EIF2A) or the stabilized gene (JUND), along with one control gene (RACK1, LGALS1, or PFN1). All reactions were conducted in triplicate. Thermal cycling conditions adhered to the manufacturer's recommended standard protocol. The target input amount was quantified using the cycle threshold (CT) value, which corresponds to the point at which the PCR amplification plot crosses the threshold. Expression of the Ribo-seq degraded and stabilized genes was normalized to each control gene separately.

Gene    Species  Chromosome Location                          Assay ID       Dye
RACK1   HUMAN    Chr.5: 181236928-181243906 on Build GRCh38   Hs00272002_m1  VIC-MGB
LGALS1  HUMAN    Chr.22: 37675606-37679802 on Build GRCh38    Hs00355202_m1  VIC-MGB

computational analyses, prepared the figures, and wrote the manuscript, with support from F.D. and all Authors.

Figure 1: Dynamics of mRNA regulation by DDX3X

Figure 3: GC content in the coding sequence predicts regulation by DDX3X

Figure 4: A ubiquitous feature in mRNA regulation

Figure 6. GCcds-mediated mRNA stabilization is detectable in vivo and across the ENCODE RBP database.

Figure 7. A model for multimodal mRNA regulation by DDX3X