ABSTRACT
DUX4 is an embryonic transcription factor whose misexpression in skeletal muscle causes facioscapulohumeral muscular dystrophy (FSHD). DUX4 induces the transcription of thousands of RNAs and dysregulates multiple pathways that could contribute to FSHD pathophysiology. However, the lack of temporal data, and of knowledge about which RNAs are actively translated following DUX4 expression, has hindered our understanding of the cascade of events that leads to muscle cell death. Here, we interrogate the DUX4 transcriptome and translatome over time and find dysregulation of most key pathways as early as 4 hours after DUX4 induction, demonstrating the potent effect of DUX4 in disrupting muscle biology. We also observe extensive transcript downregulation as well as induction, and a high concordance between mRNA abundance and translation status. Significantly, DUX4 triggers widespread production of truncated protein products derived from aberrant RNAs that are degraded in normal muscle cells. One such protein, truncated serine/arginine-rich splicing factor 3 (SRSF3-TR), is present in FSHD muscle cells and disrupts splicing autoregulation when ectopically expressed in myoblasts. Taken together, the temporal dynamics of DUX4 induction show how the pathologic presence of an embryonic transcription factor in muscle cells alters gene expression to ultimately perturb RNA homeostasis.
INTRODUCTION
Facioscapulohumeral muscular dystrophy (FSHD) is a prevalent progressive myopathy caused by misexpression of an early embryonic transcription factor, DUX4, in skeletal muscle (Hamel & Tawil, 2018; Tawil, van der Maarel, & Tapscott, 2014). Sustained DUX4 expression is toxic to somatic cells and induces apoptotic death, leading to skeletal muscle atrophy in individuals with FSHD (Bosnakovski et al., 2008; Kowaljow et al., 2007; Rickard, Petek, & Miller, 2015; Wallace et al., 2011). In the decade since the elucidation of a unifying genetic model for FSHD (Lemmers et al., 2010), much work investigating the molecular consequences of DUX4 expression has led to the discovery of myriad altered genes and pathways that are associated with FSHD pathophysiology (Campbell, Belleville, Resnick, Shadle, & Tapscott, 2018; Lim, Nguyen, & Yokota, 2020). Efforts to determine which genes or pathways downstream of DUX4 misexpression cause FSHD are just beginning. Such understanding is critical for the development of effective therapeutics for FSHD.
Previous genome-wide studies of DUX4-mediated gene expression have been performed as endpoint assays in cell populations that have seen long periods of DUX4 misexpression and are likely already undergoing apoptosis (Bosnakovski et al., 2019; Geng et al., 2012; Jagannathan, Ogata, Gafken, Tapscott, & Bradley, 2019; Jagannathan et al., 2016; Lek et al., 2020; Resnick et al., 2019; Shadle et al., 2019; Shadle et al., 2017; Sharma, Harafuji, Belayew, & Chen, 2013; Whiddon, Langford, Wong, Zhong, & Tapscott, 2017). Although mechanisms underlying DUX4-induced cytotoxicity have been uncovered this way, it has also been shown that important pathologic processes occur in distinct temporal order. For example, inhibition of the RNA quality control pathway nonsense-mediated decay (NMD) by DUX4 was shown to occur early in the course of DUX4 expression while proteotoxic stress was a later event (Feng et al., 2015; Jagannathan et al., 2019). Therefore, genome-wide time course studies are needed to identify early changes in gene expression or cellular pathways that lead to skeletal muscle cell death.
While DUX4 induces widespread transcriptomic changes, proteomics shows that many genes display discordant transcript and protein levels (Jagannathan et al., 2019). Given the sparse nature of proteomics data, sequencing-based measurement of active translation is necessary to determine if the observed discordance between RNA and protein levels in DUX4-expressing cells is the result of altered translation regulation or protein stability. Such a measure would also reveal whether the aberrant RNAs stabilized by DUX4-mediated NMD inhibition (Feng et al., 2015) produce truncated proteins.
To identify early transcript- and translation-level changes induced by DUX4, we performed paired RNA-sequencing (RNA-seq) and ribosome profiling (Ribo-seq) at 0, 4, 8, and 14 h following the expression of DUX4 in MB135-iDUX4 human skeletal muscle myoblasts. MB135-iDUX4 cells are a well-characterized model of FSHD that robustly recapitulates the consequences of endogenous DUX4 expression in FSHD muscle (Jagannathan et al., 2016; Yao et al., 2014). While RNA-seq measures transcript abundance, Ribo-seq measures ribosome-protected RNA fragments, allowing quantification of ribosome density along an mRNA that serves as a proxy for active translation (Ingolia, 2014). Ribo-seq also enables precise delineation of translation start and end sites to characterize the protein products made from aberrant RNAs.
We found that ∼1600 genes show a significant change at the transcript level after 4 h of DUX4 induction, including many that are repressed; most pathways known to be misregulated by DUX4 are altered at this early time point. We also found a high concordance of changes in mRNA abundance and translation status, suggesting that post-transcriptional modulation by DUX4 (Jagannathan et al., 2019) occurs primarily at the level of protein stability. Notably, the hundreds of aberrant RNAs stabilized by DUX4-mediated inhibition of NMD (Feng et al., 2015) are actively translated to produce truncated proteins, including truncated RNA-binding proteins (RBPs) and splicing factors. We show that one such truncated splicing factor, truncated serine/arginine-rich splicing factor 3 (SRSF3-TR), is expressed in FSHD muscle cell cultures and perturbs splicing homeostasis when ectopically expressed in myoblasts. Together, our results illustrate the importance of misregulated RNA quality control in DUX4-induced pathology.
RESULTS
Identification of time points for paired RNA-seq and Ribo-seq in DUX4-expressing cells
Misexpression of DUX4 in skeletal muscle cells is cytotoxic (Geng et al., 2012; Jagannathan et al., 2016; Rickard et al., 2015). To identify time points at which to measure transcript- and translation-level changes induced by DUX4 before the onset of overt cytotoxicity, we utilized a well-characterized doxycycline-inducible DUX4 human myoblast line, MB135-iDUX4 (Jagannathan et al., 2016), harboring a DUX4-responsive fluorescent reporter (Figure 1A). We live imaged these cells every 15 min for 28 h following doxycycline treatment to induce DUX4 (Figure 1B, Video 1). In the absence of doxycycline, a very low level of ‘leaky’ DUX4 expression was observed, which is consistent with studies in other doxycycline-driven DUX4 induction systems (Bosnakovski, Chan, et al., 2017; Dandapat et al., 2014). In the presence of doxycycline, expression of the DUX4-responsive fluorescent reporter was rapid and nearly synchronous, with fluorescence detectable after 2 h. Most cells fluoresced after 4 h, with fluorescence intensity increasing until 6 h at which point it plateaued. Cytotoxicity was first observed 9 h following DUX4 induction (Figure 1B). Myoblasts continued to round up and detach, resulting in a culture where most cells were dead or dying by 18 h. Given these data, we limited our study to time points ≤ 14 h to identify early and direct gene expression changes induced by DUX4.
To study the temporal trajectory of known DUX4-induced gene expression changes in important cellular pathways, we carried out quantitative reverse transcription PCR (RT-qPCR) using RNA extracted from the parental MB135-iDUX4 myoblast line. At 4 h we observed robust induction of the DUX4 target gene ZSCAN4 (Figure 1C, top left) and repression of the myogenic program (Figure 1C, top right), which is exquisitely sensitive to DUX4 misexpression (Bosnakovski et al., 2018; Bosnakovski et al., 2008). In contrast, other consequences of DUX4 expression, such as upregulation of pericentric human satellite II repeats (Shadle et al., 2019), activation of the unfolded protein response (Jagannathan et al., 2019), perturbed Wnt signaling (C. R. Banerji et al., 2015), and downregulation of oxidative stress response genes (Bosnakovski et al., 2008) are more prominent at later time points (Figure 1C, remaining panels). Based on our measures of DUX4 activity, perturbation of downstream pathways, and myoblast cell death, 4, 8, and 14 h were chosen as informative early, mid, and late time points to measure transcript- and translation-level changes induced by misexpression of DUX4, with the 0 h time point serving as a control. Therefore, we performed standard RNA-seq paired with Ribo-seq in parental MB135-iDUX4 myoblasts induced to express DUX4 for 0, 4, 8, or 14 h (Figure 1D).
DUX4 triggers early-onset disruption of key pathways involved in FSHD
We examined DUX4-induced transcriptome changes revealed by our RNA-seq dataset (Supplementary Table 1). As expected, the housekeeping gene RPL27 had constant, robust RNA expression throughout the time course (Figure 2A, top). Transcripts of a DUX4 target gene, ZSCAN4, were absent in uninduced cells but highly expressed at 4 h and increased with time, while transcripts from a myogenic gene, MYOD1, displayed the opposite trend (Figure 2A, middle and bottom). Genome-wide, DUX4 altered the expression of thousands of transcripts, with known DUX4 targets (Yao et al., 2014) showing increasing upregulation throughout the time course (Figure 2B). Notably, there were 1674 genes whose expression significantly changed after only 4 h of DUX4 induction with similar numbers activated and repressed. Genes underlying pathways known to be involved in FSHD pathology are already significantly altered at the 4 h time point (Figure 2 – figure supplement 1A), suggesting that DUX4 sets up cascades of misregulation very early after its expression in muscle cells.
To identify novel pathways altered by DUX4, we used k-means clustering to group genes significantly altered (defined as absolute log2 fold change > 1 and adjusted p-value < 0.01) at any point during the time course into five clusters (Figure 2 – figure supplement 1B) and carried out gene ontology (GO) analysis on each cluster (Figure 2C-E, Supplementary Table 2). We observed that Cluster 1, which is comprised of 181 genes rapidly induced upon DUX4 expression, returns GO categories that could underlie DUX4’s normal role in establishing an early embryonic program – negative regulation of cell differentiation, positive regulation of cell proliferation, and DNA-templated transcription. The 692 genes that are rapidly silenced (Cluster 5) are enriched for GO categories related to myogenesis, positive regulation of cell differentiation, and cytoskeleton organization. Together, this is suggestive of a general switch away from a differentiated muscle program and towards a proliferative phenotype. Cluster 2 (583 genes) that is less robustly induced and Cluster 4 (2462 genes) that is less robustly repressed are enriched for broad GO categories illustrative of the many fundamental cellular processes that are altered upon DUX4 misexpression. Cluster 3 (2243 genes), which is induced only at the late 14 h time point is enriched for GO categories mRNA splicing, ribonucleoprotein transport, and ubiquitin-dependent process, which have previously been reported as some of the major signatures of DUX4-induced gene expression (Geng et al., 2012; Jagannathan et al., 2016; Rickard et al., 2015). Together, these time course RNA-seq data show that each pathway misregulated by DUX4 has a unique temporal signature following DUX4 induction and suggest that a combination of perturbed cellular systems underlies DUX4 toxicity.
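The selection and clustering step described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each gene carries a log2 fold change and adjusted p-value per time point and that clustering is run on the per-gene fold-change trajectories; the data layout, function names, and the plain Lloyd's k-means shown here are illustrative and are not taken from the study's code.

```python
import random


def significant_genes(results, lfc_cutoff=1.0, padj_cutoff=0.01):
    """Return genes with |log2FC| > lfc_cutoff and padj < padj_cutoff at any time point.

    `results` maps gene -> {timepoint: (log2fc, padj)}; this layout is hypothetical.
    """
    keep = []
    for gene, by_time in results.items():
        if any(abs(lfc) > lfc_cutoff and padj < padj_cutoff
               for lfc, padj in by_time.values()):
            keep.append(gene)
    return keep


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per input point."""
    rng = random.Random(seed)
    # Initialise centers from k distinct input points.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: squared_distance(p, centers[c]))
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In this sketch, a gene's trajectory would be the vector of its log2 fold changes at 4, 8, and 14 h, and `kmeans(trajectories, k=5)` would assign each significant gene to one of five clusters. Cluster numbering from k-means is arbitrary, so labels would need to be reordered to match the Cluster 1 to 5 naming used in the text.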
Most DUX4-induced coding transcripts are robustly translated
A previous study that used stable isotope labeling by amino acids in cell culture (SILAC) mass spectrometry to measure the DUX4 proteome revealed extensive post-transcriptional gene regulation including discordant changes at the RNA versus protein level, particularly for genes involved in RNA quality control (Jagannathan et al., 2019). While powerful, this study could not provide a complete index of all expressed proteins due to inherent limitations in mass spectrometry technology. Ribo-seq, by measuring which mRNAs are actively translated, is capable of identifying the proteins being produced with greater depth than proteomics. Additionally, DUX4 induces eIF2α phosphorylation at later time points in MB135 myoblasts (Jagannathan et al., 2019; Shadle et al., 2017), which is thought to generally inhibit cap-dependent translation (Proud, 2005), but the effect of this on protein expression in DUX4-expressing cells is unknown. Therefore, we asked whether transcript level changes driven by DUX4 misexpression were echoed at the level of translation by comparing the RNA-seq and Ribo-seq datasets. To verify the quality of our Ribo-seq data, we confirmed that reads exhibited the characteristic 3-nucleotide periodicity indicative of ribosome-protected RNA fragments (Figure 3 – figure supplement 1). Representative Ribo-seq read coverage plots of the housekeeping gene RPL27 showed constant, robust translation throughout the time course (Figure 3A, top). In contrast, a DUX4 target gene, ZSCAN4, showed no coverage in uninduced cells, low ribosome density beginning at 4 h, and active translation at 8 and 14 h (Figure 3A, middle). MYOD1 was robustly translated at 0 h but rapidly downregulated (Figure 3A, bottom). The changes in translation status at these specific genes mirrored the differences seen in their mRNA levels (Figure 2A).
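As an illustration of what a 3-nucleotide periodicity check measures, the sketch below tallies the reading frame of each footprint. It assumes, hypothetically, that every read has already been reduced to a P-site position relative to the first nucleotide of the annotated start codon; the function name and input format are illustrative and do not describe the pipeline used in this study.

```python
from collections import Counter


def frame_fractions(p_site_offsets):
    """Fraction of footprints whose P-site falls in codon frame 0, 1, and 2.

    `p_site_offsets` holds P-site positions relative to the start codon
    (0 = first nucleotide of the start codon); this input is hypothetical.
    """
    counts = Counter(offset % 3 for offset in p_site_offsets)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts.get(frame, 0) / total for frame in range(3)]
```

Genuine ribosome-protected fragments would be expected to concentrate in a single frame, whereas fragments unrelated to translation would be spread roughly evenly across all three.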
On a genome scale, DUX4 altered the translation status for thousands of transcripts, with later time points showing larger differences and known DUX4 targets being translated at increasing levels throughout the time course (Figure 3B, Supplementary Table 3). Strikingly, most genes were concordantly up or downregulated at the level of transcript and translation status (Figure 3C, Materials and Methods) at all three time points with only a small number of genes showing some discordance. GO analysis of the discordantly regulated genes returned significant results only for the gene set (n = 137) that showed a mild translation downregulation at the 14 h time point with pathways such as protein targeting to ER and viral transcription being enriched (Supplementary Table 3). This likely represents a small subset of genes that are affected by the inhibition of cap-dependent translation due to eIF2α phosphorylation. In contrast, a pairwise correlation analysis of the RNA-seq, Ribo-seq and our previous SILAC proteomics (Jagannathan et al., 2019) data demonstrated that both RNA and translation status are discordant with protein levels (Figure 3D). Overall, these data show a high concordance of transcript level and ribosome occupancy in DUX4-expressing cells, demonstrating that post-transcriptional modulation by DUX4 likely occurs at the level of protein stability rather than protein synthesis.
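One simple way to make the notion of concordance concrete is sketched below. The exact criteria used in this study are given in Materials and Methods; the sketch instead assumes a hypothetical rule in which a gene is called up, down, or unchanged in each assay by a log2 fold-change cutoff, and is concordant when the RNA-seq and Ribo-seq calls agree. The cutoff value and function names are illustrative.

```python
import math


def direction(lfc, cutoff=1.0):
    """Return +1 (up), -1 (down), or 0 (unchanged) for a log2 fold change."""
    if lfc > cutoff:
        return 1
    if lfc < -cutoff:
        return -1
    return 0


def classify(rna_lfc, ribo_lfc, cutoff=1.0):
    """Label a gene by whether its RNA-seq and Ribo-seq changes agree."""
    rna_dir = direction(rna_lfc, cutoff)
    ribo_dir = direction(ribo_lfc, cutoff)
    if rna_dir == 0 and ribo_dir == 0:
        return "unchanged"
    if rna_dir == ribo_dir:
        return "concordant"
    return "discordant"


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, `classify` would be applied per gene at each time point, and `pearson` could be applied to pairs of fold-change vectors (RNA versus Ribo-seq, RNA versus protein, Ribo-seq versus protein) to summarize the pairwise comparison.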
We have previously shown that DUX4 induction leads to detectably lower protein abundance of key NMD factors such as UPF1, SMG6, and XRN1, and that UPF1 is proteolytically degraded (Feng et al., 2015; Jagannathan et al., 2019). We sought to determine if other NMD factors follow a mechanism similar to UPF1 or if their downregulation could be due to lowered translation of these proteins. Comparing RNA abundance from our RNA-seq data, translation status from our Ribo-seq data, and protein level from our previous proteomics data (Jagannathan et al., 2019) for all NMD factors showed that transcript level and translation is highly concordant, while protein level is not (Figure 3E). This confirms that multiple key NMD factors are indeed downregulated at the protein level without a corresponding change in their RNA level or translation status, pointing to protein stability as the mode of regulation by DUX4.
DUX4 causes widespread truncated protein production
Having established that most transcripts induced by DUX4 are robustly translated, we wanted to determine if truncated proteins are produced from RNAs containing premature translation termination codons (PTCs) stabilized as a result of DUX4-mediated NMD inhibition (Feng et al., 2015). Western blot analysis showed that levels of the key NMD factor UPF1 were reduced by 65% after only 4 h of DUX4 induction (Figure 4A). UPF1 levels fell further to 13% and 1% of baseline at 8 and 14 h, respectively. This confirmed our previous observation that NMD inhibition is an early event in the course of DUX4 expression (Feng et al., 2015), and suggested that proteins resulting from the translation of NMD-targeted, stabilized aberrant RNAs should appear in our dataset.
To ask if and when aberrant RNAs are translated, we used ORFquant (Calviello, Hirsekorn, & Ohler, 2020), a new pipeline that identifies isoform-specific translation events from Ribo-seq data. We then used DEXSeq (Anders, Reyes, & Huber, 2012) to conduct exon-level differential analysis on the set of ORFquant-derived open reading frames, using Ribo-seq data. This analysis identifies changes in relative exon usage to measure differences in the expression of individual exons that are not simply the consequence of changes in overall transcript level. After 4 h of DUX4 induction 397 genes showed differential expression of specific exons, of which only 24 are predicted NMD targets (Figure 4B), whereas later time points showed a greater fraction of exons that are unique to NMD targets as differentially expressed (Supplementary Table 4). We grouped exons based on their NMD target status and calculated their fold change in ribosome footprints at 4, 8, and 14 h of DUX4 expression compared to the 0 h time point (Figure 4C). We observed a progressive and significant increase in the translation status of NMD-targeted exons at 8 and 14 h, confirming the translation of stabilized aberrant RNAs in DUX4-expressing cells.
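The grouping of exons by NMD target status and the comparison of their footprint fold changes can be sketched as follows. This is a simplified illustration and not the DEXSeq analysis itself: it assumes, hypothetically, that each exon has a normalized footprint count per time point and a boolean NMD flag, and it uses a pseudocount and a per-group median chosen only for the example.

```python
import math
from statistics import median


def log2_fold_change(count_t, count_0, pseudocount=1.0):
    """log2 ratio of footprint counts at a time point versus 0 h.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((count_t + pseudocount) / (count_0 + pseudocount))


def median_lfc_by_group(exons, timepoint):
    """Median log2 fold change for NMD-target and non-target exons.

    `exons` is a list of dicts with keys 'nmd_target' (bool) and
    'counts' ({timepoint_in_hours: normalized count}); layout is hypothetical.
    """
    groups = {True: [], False: []}
    for exon in exons:
        lfc = log2_fold_change(exon["counts"][timepoint], exon["counts"][0])
        groups[exon["nmd_target"]].append(lfc)
    return {
        ("NMD target" if is_target else "non-target"): median(values)
        for is_target, values in groups.items()
        if values
    }
```

Calling `median_lfc_by_group(exons, 14)` would then return one summary value per group for the 14 h time point; a progressive rise in the NMD-target value across 4, 8, and 14 h would correspond to the trend described above.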
Translation of an NMD target typically generates a prematurely truncated protein. To ask how the specific truncated proteins being produced in DUX4-expressing myoblasts might functionally impact cell homeostasis, we conducted GO analysis of the 74 truncated proteins being actively translated at 14 h of DUX4 induction (Figure 4D, Supplementary Table 4). Strikingly, the truncated proteins are enriched for genes encoding RBPs involved in mRNA metabolism and specifically, splicing (Supplementary Table 4). Examples include genes such as IVNS1ABP, SRSF3, SRSF6, and SRSF7 (Figure 4E). Thus, not only are NMD targets stabilized by DUX4 expression but they also produce truncated versions of many RBPs, which could have significant downstream consequences to mRNA processing in DUX4-expressing cells.
Truncated SRSF3 is present in FSHD myotubes and perturbs RNA homeostasis
To explore the role truncated proteins play in DUX4-induced cellular phenotypes, we chose SRSF3 for further characterization. SRSF3 is an SR family protein that possesses an N-terminal RNA-binding RNA recognition motif (RRM) and a C-terminal arginine/serine (RS)-rich domain responsible for protein-protein and protein-RNA interactions. SRSF3 is a multifunctional splicing factor involved in transcriptional, co-transcriptional, and post-transcriptional regulation and has been implicated in a variety of human pathologies including heart disease, Alzheimer’s, and cancer (More & Kumar, 2020; Zhou et al., 2020). Examination of our paired RNA-seq and Ribo-seq data revealed robust expression of SRSF3 NMD-targeted exon 4 after 8 h of DUX4 induction, followed by robust translation of this exon that ends at the site of the PTC (Figure 5A).
To confirm translation of the aberrant SRSF3 RNA stabilized by DUX4, we carried out polysome profiling using sucrose density gradient separation. As expected, the polysome profile after 14 h of DUX4 expression showed a higher fraction of 80S monosomes compared to polysomes (Figure 5B, top left), consistent with our observation of eIF2α phosphorylation and general downregulation of translation at this and later time points of DUX4 expression (Jagannathan et al., 2019; Shadle et al., 2017). We extracted RNA from various fractions and found that RPL27 mRNA was ribosome-bound in both control and DUX4-expressing cells, as evidenced by its association with monosomes and polysomes and as expected for a housekeeping gene (Figure 5B, top right). In contrast, DUX4 mRNA and mRNA of the DUX4 target gene ZSCAN4 are highly translated only in DUX4-expressing myoblasts, as indicated by their enrichment in heavy polysomes (Figure 5B, middle). The normal SRSF3 isoform (SRSF3-Excl) is enriched in heavy polysomes in both control and DUX4-expressing myoblasts but is less abundant in the latter (Figure 5B, bottom left). In contrast, the NMD-targeted isoform of SRSF3 (SRSF3-Incl), also enriched in heavy polysomes in both conditions, was significantly more abundant in DUX4-expressing cells (Figure 5B, bottom right). These data confirm that aberrant SRSF3 mRNA is being actively translated into truncated protein in DUX4-expressing myoblasts.
To determine if we could detect truncated SRSF3 protein, we generated a custom antibody recognizing a 10 amino acid C-terminal neo-peptide unique to SRSF3-TR. This custom SRSF3-TR antibody was able to recognize FLAG-tagged SRSF3-TR exogenously expressed in 293T cells and endogenous SRSF3-TR immunoprecipitated from DUX4-expressing MB135-iDUX4 myoblasts (Figure 5C). We also used a commercial SRSF3 antibody that recognizes an N-terminal epitope common to both the full-length and truncated SRSF3. This antibody detected both exogenously expressed full-length and truncated FLAG-SRSF3, and endogenous full-length SRSF3, but was insufficient to visualize endogenous SRSF3-TR (Figure 5C), possibly due to lower affinity for this protein isoform in an immunoprecipitation assay. To determine if SRSF3-TR was present in FSHD myotubes expressing endogenous levels of DUX4, we carried out immunofluorescence for SRSF3-TR or DUX4 in differentiated FSHD and control muscle cells. While there was no detectable SRSF3-TR staining in control cells, in DUX4-expressing FSHD cultures SRSF3-TR appeared in cytoplasmic puncta (Figure 5D). Together, these data show that endogenous SRSF3-TR is present at detectable levels in DUX4-expressing MB135-iDUX4 and FSHD cells.
To explore the functional consequences of SRSF3-TR expression, we exogenously expressed FLAG-tagged full-length or truncated SRSF3 in healthy muscle cells. As previously described (Jumaa & Nielsen, 1997), full-length SRSF3 decreased the level of the normal SRSF3 mRNA isoform and increased the aberrant isoform (Figure 5E). Strikingly, SRSF3-TR led to a 5-fold upregulation of the endogenous SRSF3 aberrant mRNA isoform (Figure 5F). These data suggest that DUX4-induced translation of SRSF3-TR may feed back to create more aberrant RNA and therefore more truncated protein, highlighting the importance of misregulated post-transcriptional processes in DUX4-induced pathology.
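For readers less familiar with how a fold change such as the one above is typically derived from RT-qPCR data, the following is a minimal sketch of the widely used comparative Ct calculation. Whether this particular quantification was used here is an assumption, as the text above does not specify it; the function name and the choice of reference gene are illustrative.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Each argument is a threshold cycle (Ct). The target is normalized to a
    reference gene within each condition, then treated is compared to control.
    Assumes roughly 100% amplification efficiency for both amplicons.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)
```

Under this calculation, a 5-fold increase corresponds to the normalized target Ct dropping by about 2.3 cycles relative to control, since 2 raised to the power 2.32 is approximately 5.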
DISCUSSION
We paired RNA-seq and Ribo-seq across a time course of DUX4 expression in MB135-iDUX4 human skeletal muscle myoblasts, providing an integrative temporal view of the transcriptome and translatome in a well-accepted cellular model of FSHD. Our choice to examine early time points, before overt DUX4-induced cytotoxicity, provides a view into early DUX4-driven regulation that builds on prior work at later time points (Bosnakovski et al., 2019; Chew et al., 2019; DeSimone, Leszyk, Wagner, & Emerson, 2019; Geng et al., 2012; Jagannathan et al., 2019; Jagannathan et al., 2016; Lek et al., 2020; Resnick et al., 2019; Shadle et al., 2019; Shadle et al., 2017; Sharma et al., 2013; Whiddon et al., 2017). Taken together, our results demonstrate that critical pathways underlying FSHD pathophysiology are perturbed earlier than previously understood and provide datasets that should serve as an informative resource for the FSHD community as it works to develop effective FSHD therapeutics.
Our time course RNA-seq data revealed a large number of both up and downregulated genes after only 4 h of DUX4 expression. While the former result is expected given DUX4’s role as a transcriptional activator (Geng et al., 2012), this degree of early gene silencing is remarkable and would require rapid cessation of transcription. There is no evidence that DUX4 acts as a direct repressor of transcription (Bosnakovski et al., 2018); therefore, our results suggest that there exist one or more unstudied DUX4-induced transcriptional repressors responsible for early gene downregulation. Indeed, of the 883 genes significantly upregulated by DUX4 at 4 h, 28 are DNA-binding transcription factors known to effect gene repression and at least 5 more are DNA-binding proteins with an as-yet-undetermined role in gene regulation. This novel hypothesis provides a mechanism by which DUX4 inhibits gene expression that is distinct from previously suggested models invoking competition of DUX4 with the homeodomain transcription factors PAX3 and PAX7 (C. R. S. Banerji et al., 2017; Bosnakovski, Daughters, Xu, Slack, & Kyba, 2009; Bosnakovski, Toso, et al., 2017; Bosnakovski et al., 2008) or transcriptional interference from DUX4-induced non-coding transcripts (Bosnakovski et al., 2018). Future work could determine which, if any, of the DNA-binding repressors induced by DUX4 cause early DUX4-mediated gene downregulation.
The profound consequences of DUX4 misexpression on cell identity were also revealed in our time course RNA-seq results. GO analyses showed that DUX4-expressing myoblasts begin to turn on genes that promote cell proliferation and inhibit cell differentiation, and silence genes that promote myogenesis, early after DUX4 induction. Thus, skeletal muscle – a post-mitotic, differentiated cell type – is pushed towards a proliferative, naïve embryonic state. This conflicted cell identity may contribute to apoptosis, as has been proposed before (Ashoti, Alemany, Sage, & Geijsen, 2021; Geng et al., 2012). Interestingly, a few hours after DUX4 begins forcing expression of an embryonic gene expression program, and just as overt apoptosis appears, cells respond by altering core regulatory processes such as RNA splicing and localization, and protein ubiquitination. Whether these changes underlie cellular “coping” mechanisms that delay death or underpin cytotoxicity remains to be determined.
We recently described the DUX4-induced proteome using MB135-iDUX4 cells (Jagannathan et al., 2019). Although this catalog of expressed proteins remains incomplete due to partial coverage and restricted dynamic range inherent to mass spectrometry, our proteomics data demonstrated that hundreds of genes are post-transcriptionally modulated by DUX4. However, these experiments could not determine if this regulation was mediated at the level of protein synthesis or degradation. The time course Ribo-seq presented here complements and extends this earlier work, and suggests that most DUX4 post-transcriptional regulation, including of multiple key NMD factors, occurs via proteolysis. We did uncover a small number of genes regulated at the level of protein synthesis at the 14 h time point of DUX4 expression, likely representing the subset of transcripts impacted by eIF2α phosphorylation induced by DUX4, as previously reported (Jagannathan et al., 2019; Shadle et al., 2017).
Loss of NMD leads to the stabilization of aberrant RNAs (Kurosaki & Maquat, 2016). However, few studies have looked at whether aberrant RNAs are translated and what proteins they might produce. Here, we demonstrate that DUX4-induced NMD inhibition causes truncated protein production in muscle cells. The confirmed existence of truncated proteins in DUX4-expressing cells has implications for how we understand cytotoxicity. Protein truncation could result in a dominant negative function that inhibits the activity of the remaining, cell critical full-length protein. Truncated proteins might misfold and facilitate the formation of protein aggregates, such as those observed in FSHD myotubes (Homma, Beermann, Boyce, & Miller, 2015; Homma, Beermann, Yu, Boyce, & Miller, 2016; Shadle et al., 2017). Additionally, some DUX4-induced truncated proteins contain unique C-terminal extensions, or neo-peptides, that could serve as novel antigens and might induce inflammation in FSHD muscle. Importantly, we were able to detect truncated proteins in FSHD muscle cell cultures, thereby validating our findings from the inducible DUX4 system in a more physiological setting and raising the possibility that these molecules could be used clinically as functionally relevant and FSHD-specific biomarkers. Future work is required to establish whether truncated proteins, either individually or in combination, contribute to cell death.
Many of the identified DUX4-induced truncated proteins are RBPs and splicing factors. It is well-established that DUX4 alters RNA splicing (Geng et al., 2012; Jagannathan et al., 2019; Jagannathan et al., 2016; Rickard et al., 2015) and therefore interesting to speculate that truncated RBPs and splicing proteins might be responsible for inducing global RNA processing defects. Such misprocessing would generate aberrant RNAs that could act to further overwhelm the already inhibited NMD pathway. As an example of this phenomenon, we observed that exogenous expression of the truncated splicing factor SRSF3-TR causes a significant increase in the level of the corresponding aberrant SRSF3 RNA. Splicing factors are known to regulate and coordinate their expression via various forms of autoregulation (Konigs et al., 2020; Leclair et al., 2020; Muller-McNicoll, Rossbach, Hui, & Medenbach, 2019). Overexpression of full-length SRSF3 leads to a reduction in the level of normal SRSF3 transcript while at the same time activating the production of aberrant SRSF3 RNA that gets degraded by NMD in a balancing act mechanism that prevents SRSF3 accumulation (Jumaa & Nielsen, 1997). Our results suggest that SRSF3-TR facilitates its own production and disrupts normal autoregulatory processes, leading to a new model in which DUX4-expressing cells hijack the splicing protein autoregulatory network and flip it – instead of acting to limit the production of full-length protein the creation of truncated protein is amplified.
In summary, our study provides a critical missing piece towards a comprehensive characterization of the major steps in DUX4-induced gene expression. We demonstrate that DUX4 induces cascading changes at all levels of gene regulation that are invisible at steady state, offering a glimpse into how developmentally regulated transcription factors expressed at the wrong time in the wrong context can wreak havoc leading to human disease.
MATERIALS AND METHODS
RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Sujatha Jagannathan (sujatha.jagannathan@cuanschutz.edu).
Materials availability
The cell lines and antibody generated in this study are available upon request. Plasmids generated in this study have been deposited to Addgene (plasmid #171951, #171952, #172345, and #172346).
Data and code availability
The RNA-seq and Ribo-seq data generated during this study are available at GEO (accession number GSE178761). The code generated during this study is available at GitHub (https://github.com/sjaganna/2021-campbell_dyle_calviello_et_al).
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Cell lines and culture conditions
293T cells were obtained from ATCC (CRL-3216). MB135, MB135-iDUX4, MB135-iDUX4/ZSCAN4-mCherry, and MB200 immortalized human myoblasts were a gift from Dr. Stephen Tapscott and originated from the Fields Center for FSHD and Neuromuscular Research at the University of Rochester Medical Center. MB135-iDUX4 cells have been described previously (Jagannathan et al., 2016). MB135-iFLAG-SRSF3-FL and MB135-iFLAG-SRSF3-TR immortalized human myoblasts were generated in this study. All parental cell lines were authenticated by karyotype analysis and determined to be free of mycoplasma by PCR screening. 293T cells were maintained in Dulbecco’s Modified Eagle Medium (DMEM) (Thermo Fisher Scientific) supplemented with 10% EqualFETAL (Atlas Biologicals). Myoblasts were maintained in Ham’s F-10 Nutrient Mix (Thermo Fisher Scientific) supplemented with 20% Fetal Bovine Serum (Thermo Fisher Scientific), 10 ng/mL recombinant human basic fibroblast growth factor (Promega), and 1 μM dexamethasone (Sigma-Aldrich). MB135-iDUX4/ZSCAN4-mCherry and MB135-iDUX4 myoblasts were additionally maintained in 2 μg/mL puromycin dihydrochloride (VWR). MB135-iFLAG-SRSF3-FL and -TR myoblasts were additionally maintained in 10 μg/mL blasticidin S HCl (Thermo Fisher Scientific). Induction of DUX4 and SRSF3 transgenes was achieved by culturing cells in 1 μg/mL doxycycline hyclate (Sigma-Aldrich). Differentiation of myoblasts into myotubes was achieved by switching the fully confluent myoblast monolayer into DMEM containing 1% horse serum (Thermo Fisher Scientific) and Insulin-Transferrin-Selenium (Thermo Fisher Scientific). All cells were incubated at 37 °C with 5% CO2.
METHOD DETAILS
Cloning
pTwist-FLAG-SRSF3_Full.Length_Codon.Optimized and pTwist-FLAG-SRSF3_Truncated_Codon.Optimized plasmids were synthesized by Twist Bioscience. To construct pCW57.1-FLAG-SRSF3_Full.Length_Codon.Optimized-Blast and pCW57.1-FLAG-SRSF3_Truncated_Codon.Optimized-Blast plasmids, the SRSF3 open reading frames were subcloned into pCW57-MCS1-P2A-MCS2 (Blast) (a gift from Adam Karpf, Addgene plasmid #80921) (Barger, Branick, Chee, & Karpf, 2019) by restriction enzyme digest using EcoRI and BamHI (New England Biolabs).
Antibody generation
Purified SRSF3-TR peptide (Cys-PRRRVTIMSLLTTL) was used as an immunogen and polyclonal rabbit anti-SRSF3-TR antibody production was done in collaboration with Pacific Immunology (Ramona, CA). The antisera from all animals were screened for reactivity by ELISA against the immunogen and with western blots and immunofluorescence against transfected SRSF3-TR.
Transgenic cell line generation
Lentiviral particles expressing doxycycline-inducible FLAG-SRSF3-FL or -TR transgenes were generated by co-transfecting 293T cells with the appropriate lentivector, pMD2.G (a gift from Didier Trono, Addgene plasmid #12259), and psPAX2 (a gift from Didier Trono, Addgene plasmid #12260) using Lipofectamine 2000 Transfection Reagent (Thermo Fisher Scientific). To generate polyclonal SRSF3 transgenic cell lines, MB135 myoblasts were transduced with lentivirus in the presence of 8 μg/mL polybrene (Sigma-Aldrich) and selected using 10 μg/mL blasticidin S HCl.
Plasmid transfections
293T cells were transfected with pTwist-FLAG-SRSF3_Full.Length_Codon.Optimized and pTwist-FLAG-SRSF3_Truncated_Codon.Optimized plasmids using Lipofectamine 2000 Transfection Reagent following the manufacturer’s instructions.
Live cell imaging
MB135-iDUX4/ZSCAN4-mCherry myoblasts were induced with doxycycline hyclate to turn on DUX4 expression and subjected to time lapse imaging using the IncuCyte S3 incubator microscope system (Sartorius). Images were collected every 15 min from the time of doxycycline addition (t = 0 h) to 28 h.
RNA extraction and RT-qPCR
Total RNA was extracted from whole cells using TRIzol Reagent (Thermo Fisher Scientific) following the manufacturer’s instructions. Isolated RNA was treated with DNase I (Thermo Fisher Scientific) and reverse transcribed to cDNA using SuperScript III reverse transcriptase (Thermo Fisher Scientific) and random hexamers (Thermo Fisher Scientific) according to the manufacturer’s protocol. Quantitative PCR was carried out on a CFX384 Touch Real-Time PCR Detection System (Bio-Rad) using primers specific to each gene of interest and iTaq Universal SYBR Green Supermix (Bio-Rad). The expression levels of target genes were normalized to that of the reference gene RPL27 using the delta-delta-Ct method (Livak & Schmittgen, 2001). The primers used in this study are listed in the Key Resources Table.
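The delta-delta-Ct normalization can be illustrated with a minimal Python sketch. It assumes that amplification efficiency is close to 100% (product doubles each cycle); the function name and Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method."""
    # Normalize the target gene to the reference gene (e.g. RPL27) in each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    # Assumes the amount of product doubles every cycle.
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
fold_change = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=27.0, ct_ref_control=18.0,
)
print(fold_change)  # 32.0
```

In this example the target crosses threshold five cycles earlier (after normalization) in the treated sample, which corresponds to a 32-fold increase.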
RNA-seq library preparation and sequencing
Total RNA was extracted from whole cells using TRIzol Reagent following the manufacturer’s instructions. Isolated RNA was subjected to ribosomal RNA depletion using the Ribo-Zero rRNA Removal Kit (Illumina). RNA-seq libraries were prepared using the NEXTflex Rapid Directional qRNA-Seq Kit (Bioo Scientific) following the manufacturer’s instructions and sequenced using 75 bp single-end sequencing on the Illumina NextSeq 500 platform by the BioFrontiers Institute Next-Gen Sequencing Core Facility.
Ribosome footprinting
Ribo-seq was performed as described previously (Calviello et al., 2020) using six 70% confluent 10 cm dishes of MB135-iDUX4 cells per condition. Briefly, cells were washed with ice-cold phosphate-buffered saline (PBS) supplemented with 100 μg/mL cycloheximide (Sigma-Aldrich), flash frozen on liquid nitrogen, and lysed in Lysis Buffer (PBS containing 1% (v/v) Triton X-100 and 25 U/mL TurboDNase (Ambion)). Cells were harvested by scraping and further lysed by trituration ten times through a 26-gauge needle. The lysate was clarified by centrifugation at 20,000 × g for 10 min at 4 °C. The supernatants were flash frozen in liquid nitrogen and stored at -80 °C. Thawed lysates were treated with RNase I (Ambion) at 2.5 U/μL for 45 min at room temperature with gentle mixing. Further RNase activity was stopped by addition of SUPERaseIn RNase Inhibitor (Thermo Fisher Scientific). Next, ribosome complexes were enriched using MicroSpin S-400 HR Columns (GE Healthcare) and RNA was extracted using the Direct-zol RNA Miniprep Kit (Zymo Research). The Ribo-Zero rRNA Removal Kit was used to deplete rRNAs, and the ribosome-protected fragments were recovered by running them on a 17% urea gel, staining with SYBR Gold (Invitrogen), and extracting nucleic acids 27 to 30 nucleotides long from gel slices by constant agitation in 0.3 M NaCl at 4 °C overnight. The recovered nucleic acids were precipitated with isopropanol using GlycoBlue Coprecipitant (Ambion) as carrier and treated with T4 polynucleotide kinase (Thermo Fisher Scientific). Libraries were prepared using the NEXTflex Small RNA-Seq Kit v3 (Bioo Scientific) following the manufacturer’s instructions and sequenced using 75 bp single-end reads on an Illumina NextSeq 500 by the BioFrontiers Institute Next-Gen Sequencing Core Facility.
RNA-seq and Ribo-seq data analysis
Adapter sequences were trimmed from the fastq files using cutadapt. UMI sequences were removed, and reads were collapsed to fasta format. Reads were first aligned against rRNA (accession number U13369.1) and a collection of snoRNAs, tRNAs, and miRNAs (retrieved using the UCSC Table Browser) using bowtie2 (Langmead & Salzberg, 2012). Remaining reads were mapped to the hg38 version of the genome (without scaffolds) using STAR 2.6.0a (Dobin et al., 2013) supplied with the GENCODE 25 .gtf file. A maximum of two mismatches and mapping to a maximum of 50 positions were allowed. De novo splice junction discovery was disabled for all datasets. Only the best alignment for each read was retained. Quality control and read counting of the Ribo-seq data were performed with Ribo-seQC (Calviello, Sydow, Harnett, & Ohler, 2019).
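The UMI handling and read collapsing step can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in this study: the UMI lengths (4 nt at each end), the function name, and the choice to deduplicate on the full read (insert plus UMIs) before stripping the UMIs are all hypothetical.

```python
def collapse_and_trim(reads, umi_5p=4, umi_3p=4):
    """Collapse identical adapter-trimmed reads, then strip UMIs.

    reads: iterable of read sequences that still carry their UMIs.
    umi_5p, umi_3p: assumed UMI lengths at each end (hypothetical defaults).
    Returns a list of (fasta_header, insert_sequence) tuples.
    """
    seen = set()
    records = []
    for seq in reads:
        # Reads identical over insert + UMIs are treated as PCR duplicates.
        if seq in seen:
            continue
        seen.add(seq)
        insert = seq[umi_5p:len(seq) - umi_3p]
        if insert:  # skip reads too short to contain an insert
            records.append((f">read_{len(records) + 1}", insert))
    return records


# Toy reads: the second is an exact duplicate, the third shares the insert
# but has a different 5' UMI, and the fourth contains only UMIs.
toy_reads = ["AAAAGGGTTTCCCC", "AAAAGGGTTTCCCC", "ACGTGGGTTTCCCC", "AAAACCCC"]
for header, insert in collapse_and_trim(toy_reads):
    print(header)
    print(insert)
```

Here the exact duplicate is dropped, while the read with the same insert but a different UMI is kept as an independent molecule.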
Differential gene expression analysis of the RNA-seq data was conducted using DESeq2 (Love, Huber, & Anders, 2014). Briefly, featureCounts from the subread R package (Liao, Smyth, & Shi, 2014) was used to assign aligned reads (in BAM format) to genomic features supplied with the GENCODE 25 .gtf file. The featureCounts output was then supplied to DESeq2 and differential expression analysis was conducted with the 0 h time point serving as the reference sample. Genes with very low read counts were filtered out by requiring a total of at least 10 reads across the 12 samples (3 replicates each of the 0, 4, 8, and 14 h samples). Log2 fold change shrinkage was performed using the apeglm method (Zhu, Ibrahim, & Love, 2019).
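The low-count filter described above amounts to a row-sum threshold on the count matrix. A minimal Python sketch follows; the gene names and counts are made up for illustration, and the actual analysis was run in R.

```python
def filter_low_counts(counts, min_total=10):
    """Keep genes whose summed count across all samples is >= min_total."""
    return {gene: row for gene, row in counts.items() if sum(row) >= min_total}


# Hypothetical counts for 12 samples (3 replicates x 4 time points).
toy_counts = {
    "geneA": [0, 1, 0, 0, 2, 1, 0, 0, 3, 1, 0, 1],   # total 9, removed
    "geneB": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],   # total 10, kept
    "geneC": [50, 48, 53, 120, 118, 131, 300, 280, 310, 900, 870, 940],
}
kept = filter_low_counts(toy_counts)
print(sorted(kept))  # ['geneB', 'geneC']
```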
Differential analysis of the RNA-seq and Ribo-seq data was performed using DESeq2, as previously described (Calviello et al., 2021; Chothani et al., 2019), using an interaction model between the tested condition and the assay type (RNA-seq versus Ribo-seq counts). Only reads mapping uniquely to coding sequence regions were used. In addition, ORFquant (Calviello et al., 2020) was used to derive de novo isoform-specific translation events by pooling the Ribo-seQC output from all Ribo-seq samples, using uniquely mapping reads. DEXSeq (Anders et al., 2012) was used to perform differential exon usage analysis along the DUX4 time course data, using Ribo-seq counts on exonic bins and junctions belonging to different ORFquant-derived translated regions. NMD candidates were defined by ORFquant as open reading frames ending with a stop codon upstream of an exon-exon junction.
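The NMD candidate rule (a stop codon upstream of an exon-exon junction) can be expressed as a short check in transcript coordinates. The Python sketch below is illustrative only and is not ORFquant's implementation; the function name and coordinate convention (0-based) are assumptions, and no minimum distance between the stop codon and the junction is applied because the definition above does not state one.

```python
from itertools import accumulate


def is_nmd_candidate(exon_lengths, stop_codon_start):
    """Return True if an exon-exon junction lies downstream of the stop codon.

    exon_lengths: exon lengths in transcript order (5' to 3').
    stop_codon_start: 0-based transcript coordinate of the stop codon's first base.
    """
    stop_codon_end = stop_codon_start + 3  # first base after the stop codon
    # Junction coordinates: index of the first base of every exon but the first.
    junctions = list(accumulate(exon_lengths))[:-1]
    return any(junction >= stop_codon_end for junction in junctions)


# Hypothetical three-exon transcript with junctions at positions 100 and 180.
print(is_nmd_candidate([100, 80, 120], stop_codon_start=150))  # True
print(is_nmd_candidate([100, 80, 120], stop_codon_start=250))  # False
```

A stop codon in the last exon, or in a single-exon transcript, has no downstream junction and is therefore not flagged.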
GO category analysis
Gene Ontology (GO) analysis was conducted using the web tool at http://geneontology.org, powered by pantherdb.org. Briefly, a statistical overrepresentation test was conducted using the complete GO biological process annotation dataset; p-values were calculated using Fisher’s exact test, and the false discovery rate was calculated by the Benjamini-Hochberg procedure.
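The two statistical steps named above can be sketched in plain Python. This is a generic illustration of a one-sided Fisher's exact (hypergeometric) overrepresentation test and of Benjamini-Hochberg adjustment, not the PANTHER implementation; the function names and example numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact (hypergeometric) p-value for overrepresentation.

    k: query genes in the category; n: query genes;
    K: background genes in the category; N: background genes.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 8 of 50 query genes fall in a category that
# contains 200 of 20,000 background genes.
p = enrichment_p_value(k=8, n=50, K=200, N=20000)
print(p < 0.001)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In practice one p-value is computed per GO term and the whole vector is adjusted together, so that the reported false discovery rate accounts for every term tested.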
Polysome profiling
Polysome profiling was performed as previously described (Merrick & Hensold, 2001; Miura, Andrews, Holcik, & Jasmin, 2008) with the following modifications. Four 70% confluent 15 cm dishes of MB135-iDUX4 cells per condition were treated with 100 μg/mL cycloheximide for 10 min, transferred to wet ice, washed with ice-cold PBS containing 100 μg/mL cycloheximide, and then lysed in 400 μL Lysis Buffer (20 mM HEPES pH 7.4, 15 mM MgCl2, 200 mM NaCl, 1% Triton X-100, 100 μg/mL cycloheximide, 2 mM DTT, and 100 U/mL SUPERaseIn RNase Inhibitor) per 15 cm dish. The cells and buffer were scraped off the dish and centrifuged at 13,000 rpm for 10 min at 4 °C. Lysates were fractionated on a 10% to 60% sucrose gradient using the SW 41 Ti Swinging-Bucket Rotor (Beckman Coulter) at 36,000 rpm for 3 h and 10 min. Twenty-four fractions were collected using a Gradient Station ip (BioComp) and an FC 203B Fraction Collector (Gilson) with continuous monitoring of absorbance at 254 nm. RNA from each fraction was extracted using TRIzol LS Reagent (Thermo Fisher Scientific) following the manufacturer’s instructions. RT-qPCR was carried out as described above.
Protein extraction
Total protein was extracted from whole cells using TRIzol Reagent following the manufacturer’s instructions, except that protein pellets were dissolved in Protein Resuspension Buffer (0.5 M Tris base, 5% SDS). Isolated protein was quantified using the Pierce BCA Protein Assay Kit (Thermo Fisher Scientific) according to the manufacturer’s protocol. Protein was mixed with 4X NuPAGE LDS Sample Buffer (Thermo Fisher Scientific) containing 50 mM DTT and heated to 70 °C before immunoblotting.
Immunoprecipitation
MB135-iDUX4 myoblasts were treated with or without doxycycline for 14 h and then trypsinized prior to lysis on ice in 1 mL of Lysis Buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% NP-40) containing protease inhibitors (Sigma-Aldrich). Lysates were precleared using Protein G Sepharose (Thermo Fisher Scientific) for 1 h prior to an overnight incubation at 4 °C with either anti-SRSF3 or anti-SRSF3-TR antibody. Protein G Sepharose was added the following morning for 5 h to bind the antibody, and beads were subsequently washed 5 times with 1 mL cold Lysis Buffer. After the final wash, 4X NuPAGE LDS Sample Buffer containing 50 mM DTT was added directly to the beads and samples were heated to 70 °C for protein elution before immunoblotting.
Immunoblotting
Protein was run on NuPAGE Bis-Tris precast polyacrylamide gels (Thermo Fisher Scientific) alongside PageRuler Plus Prestained Protein Ladder (Thermo Fisher Scientific) and transferred to Odyssey nitrocellulose membrane (LI-COR Biosciences). Membranes were blocked in Intercept (PBS) Blocking Buffer (LI-COR Biosciences) before overnight incubation at 4 °C with primary antibodies diluted in Blocking Buffer containing 0.2% Tween 20. Membranes were incubated with IRDye-conjugated secondary antibodies (LI-COR Biosciences) for 1 h, and the fluorescent signal was visualized using a Sapphire Biomolecular Imager (Azure Biosystems) and Sapphire Capture software (Azure Biosystems). When appropriate, membranes were stripped with Restore Western Blot Stripping Buffer (Thermo Fisher Scientific) before being re-probed. Band intensities were quantified by densitometry using ImageJ (Schneider, Rasband, & Eliceiri, 2012).
Immunofluorescence
Cells were fixed in 10% Neutral Buffered Formalin (Research Products International) for 30 min and permeabilized for 10 min in PBS with 0.1% Triton X-100. Samples were then incubated overnight at 4 °C with primary antibodies, followed by incubation with 488- or 594-conjugated secondary antibodies for 1 h prior to counterstaining and mounting with Prolong Diamond Antifade Mountant with DAPI (Thermo Fisher Scientific). Slides were imaged with a DeltaVision Elite deconvolution microscope, CoolSNAP HQ2 high-resolution CCD camera, and Resolve3D softWoRx-Acquire v7.0 software. ImageJ software (Schneider et al., 2012) was used for image analysis.
Antibodies
The antibodies used in this study are anti-DUX4 (Abcam 124699), anti-Histone H3 (Abcam 1791), anti-SRSF3 (Thermo Fisher Scientific 33-4200), anti-SRSF3-TR (this paper), anti-RENT1/hUPF1 (Abcam ab109363), Drop-n-Stain CF 488A Donkey Anti-Rabbit IgG (Biotium 20950), Drop-n-Stain CF 594 Donkey Anti-Rabbit IgG (Biotium 20951), IRDye 650 Goat anti-Mouse IgG Secondary Antibody (LI-COR Biosciences 926-65010), and IRDye 800CW Goat anti-Rabbit IgG Secondary Antibody (LI-COR Biosciences 926-32211).
QUANTIFICATION AND STATISTICAL ANALYSIS
Data analysis, statistical tests, and visualization
All data analysis and statistical tests were performed in the R programming environment and relied on Bioconductor (Huber et al., 2015) and ggplot2 (Wickham, 2016). Plots were generated using R plotting functions and/or the ggplot2 package. Bar and line graphs were generated using GraphPad Prism software version 9.0. Biological replicates were defined as experiments performed separately on distinct samples (i.e., cells cultured in different wells) representing identical conditions and/or time points. No outliers were eliminated in this study.
AUTHOR CONTRIBUTIONS
A.E.C., M.C.D., and S.J. conceived and designed the study. A.E.C., M.C.D., M.A.C., and T.F. performed experiments. A.E.C., L.C., T.M., R.F., A.E.G., S.N.F., and S.J. analyzed data. A.E.C. and S.J. wrote the paper with input from all authors.
DECLARATION OF INTERESTS
The authors declare no competing interests.
SUPPLEMENTAL FILES
Video 1. Time course imaging following DUX4 expression.
Live cell fluorescence microscopy recording of MB135-iDUX4/ZSCAN4-mCherry myoblasts treated with doxycycline to induce DUX4 expression.
Supplementary Table 1.
DESeq2 differential gene expression analysis results for DUX4 time course RNA-seq data at 4, 8, or 14 h post-induction compared to 0 h (control).
Supplementary Table 2.
Cluster analysis of RNA-seq log2 fold change at 4, 8, or 14 h of DUX4 induction compared to 0 h (control); and Gene Ontology analysis of genes within each cluster.
Supplementary Table 3.
ORFquant analysis results for DUX4 time course RNA-seq and Ribo-seq data at 4, 8, or 14 h post-induction compared to 0 h (control); and Gene Ontology analysis of genes with downregulated ribosome density at 14 h.
Supplementary Table 4.
DEXSeq analysis results for DUX4 time course Ribo-seq data at 4, 8, or 14 h post-induction compared to 0 h (control); and Gene Ontology analysis of NMD targets upregulated at 14 h.
ACKNOWLEDGEMENTS
We thank Stephen Tapscott for the MB135-iDUX4/ZSCAN4-mCherry cell line. We thank Jeffrey Kieft for his guidance in carrying out polysome profiling. We thank Neelanjan Mukherjee, Olivia Rissland, Srinivas Ramachandran, and all members of the Jagannathan laboratory for insightful manuscript feedback. We thank the BioFrontiers Institute Next-Gen Sequencing Core Facility, which performed the Illumina sequencing and library construction. This work was supported by the RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus (S.J.), Friends of FSH Research and The Chris Carrino Foundation for FSHD AWD-194864 (S.J.), the FSHD Society FSHS-82018-01 (A.E.C. and M.D.), the California Tobacco-Related Disease Research Grants Program 27KT-0003 (S.N.F.), and the National Institutes of Health DP2GM132932 (S.N.F.).
APPENDIX KEY RESOURCES TABLE