Abstract
Detailed knowledge of cellular networks that are modulated by Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is needed to understand viral replication and host response. So far, transcriptomic analyses of interactions between SARS-CoV-2 and cells were performed on mixed populations of infected and uninfected cells or using single-cell RNA sequencing, both leading to inaccurate or low-resolution gene expression interpretations. Moreover, they generally focused on annotated messenger RNAs (mRNAs), ignoring other transcripts, such as long non-coding RNAs (lncRNAs) and unannotated RNAs. Here, we performed deep polyA+ transcriptome analyses of lung epithelial A549 cells infected with SARS-CoV-2, which were sorted based on the expression of the viral protein spike (S). To increase the sequencing depth and improve the robustness of the analysis, the samples were depleted of viral transcripts. Infection caused a massive reduction in mRNAs and lncRNAs, including transcripts coding for antiviral innate immune proteins, such as interferons (IFNs). This absence of IFN response probably explains the poor transcriptomic response of bystander cells co-cultured with spike-positive (S+) ones. NF-κB and inflammatory response were among the pathways that escaped the global shutoff in S+ cells. In agreement with the RNA-seq analysis, inflammatory cytokines, but not IFNs, were produced and secreted by infected cells. Functional investigations revealed the proviral function of the NF-κB subunit p105/p50 and some of its known target genes, including IL32 and IL8, as well as the lncRNA ADIRF-AS1, which we identified as a novel NF-κB target gene. Thus, analyzing the polyA+ transcriptome of sorted populations of infected lung cells allowed unprecedented identification of cellular functions that are directly affected by infection and the recovery of coding and non-coding genes that contribute to SARS-CoV-2 replication.
Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of Coronavirus Disease-2019 (COVID-19). The virus emerged in Wuhan, China, at the end of 2019 and has since spread around the globe. SARS-CoV-2 infection may be asymptomatic or it may cause a wide spectrum of symptoms, from mild upper respiratory tract infection to life-threatening pneumonia [1]. Viral replication is not limited to the respiratory tract, but rather occurs in numerous organs, including the blood, heart, vessels, intestines, brain and kidneys [2]. The mortality rate of SARS-CoV-2 infection is estimated at 3%–4%, compared with a mortality rate of less than 1% from influenza [3]. Severity of the disease correlates with an excessive pro-inflammatory immune response [4–6], which may be responsible for the symptoms observed in patients. Inflammation is a vital defense mechanism that is required to initiate an adaptive immune response via the recruitment and activation of immune cells. However, the non-resolution of acute inflammation leads to tissue damage [7].
SARS-CoV-2 infection is also characterized by a suppression of interferon (IFN) response in infected cells [8]. IFNs are potent antiviral cytokines secreted by various cell types. In virus-infected cells, the IFN response is initiated by the recognition of viral nucleic acids by cellular receptors. Once activated, these receptors recruit adaptor proteins and kinases that trigger the nuclear translocation of the transcription factors IRF3 and NF-κB, which, in turn, induce the rapid expression of IFNs and proinflammatory cytokines [9]. In particular, type I (IFNα and β) and type III (IFN-λ1 and IFN-λ2/3) IFNs play crucial roles in protecting infected and neighboring cells from virus replication and spread. Once secreted, they will signal in a paracrine and autocrine manner through their receptors, resulting in the activation of the transcription factor complex ISGF3, which subsequently induces the expression of up to approximately 2000 IFN-stimulated genes (ISGs). Many of these ISGs block the viral life cycle by targeting specific stages of replication, including entry into host cells, protein translation, replication or assembly of new virus particles. Some ISGs are specific to a virus or a viral family, while others are broad-spectrum. Their concerted actions establish the antiviral state [10,11]. Like all viruses [12], SARS-CoV-2 overcomes IFN responses via a wide array of mechanisms involving viral proteins [13–15] and a virus-derived microRNA [16,17]. These viral strategies likely contribute to an impaired IFN response in COVID-19 patients [18] and, consequently, high levels of viral replication.
A large effort has been undertaken to understand the molecular mechanism underlying the lack of IFN response and the overproduction of inflammatory cytokines in SARS-CoV-2 infected cells. Numerous transcriptomic analyses of human cells infected with SARS-CoV-2 have been performed to describe the perturbation of cellular pathways induced by infection, using several cellular models, such as human cells derived from lung, bronchial or colorectal tissue [19–21], as well as post-mortem lung samples of COVID-19 patients [19] and bronchoalveolar lavage fluids (BALF) from patients [22]. These genome-wide investigations of host cellular responses to SARS-CoV-2 infection were performed exclusively using bulk RNA-sequencing (RNA-seq) technologies, i.e. by analyzing gene perturbations in mixed populations of infected and uninfected cells. Previous studies on Zika virus infected cells have estimated that only 10% of the repressed and about 30% of the induced genes can be identified in a mixed population containing around one third of infected cells [23]. Bulk transcriptome signals are thus partly drowned in background noise, making it impossible to portray the full variation of the host transcripts efficiently and exhaustively.
The perturbation of cellular responses in SARS-CoV-2 infected and bystander cells has also been analyzed using single-cell (sc) RNA-seq methods. Such studies were performed in a variety of cellular models, including COVID-relevant ones, such as human intestinal organoids [24], human tracheal-bronchial epithelial cells [25,26], human lung cell lines [21] and BALF from patients [27]. However, the technical variability, high noise and massive sample size of scRNA-seq data raise challenges in analyzing the total number of differentially expressed genes (DEGs) [28] out of a limited list of only the 1000 to 3000 most expressed genes in individual cells. The balance between the number of cells to be sequenced and the sequencing depth needed to extract the maximum amount of information from the experiment also affects the results [29].
Moreover, most bulk and single-cell transcriptomic studies performed to investigate the cellular response to SARS-CoV-2 focused on the expression of the referenced coding genome, largely ignoring non-coding and unannotated information, mainly represented by long non-coding RNAs (lncRNAs). These RNAs, which are at least 200 nucleotides (nt) in length, are of specific interest since they play fundamental roles in cellular identity, development and disease progression through epigenetic or post-transcriptional regulation of mRNA expression [30]. Combined RNA-seq data from multiple sources reported over 58000 lncRNA loci in the human genome [31]. Future studies will plausibly increase this number, since lncRNAs are more cell-type specific [32] and expressed at lower levels than mRNAs [31]. Most of them are independently transcribed by RNA polymerase II and, like protein-coding RNAs, they can be 5’-capped, polyadenylated, and spliced by the cellular machinery [33]. Increasing evidence suggests the involvement of lncRNAs in virus-host interactions and antiviral immunity [34,35]. Efforts are in progress to uncover, in different contexts, the unannotated RNAs that could encompass a variety of RNA biotypes, from rare mRNA isoforms to unannotated intergenic long non-coding RNAs, using reference-based approaches with the human GENCODE annotation [36] or reference-free methods for unmappable transcripts [37]. However, so far, none of these strategies has been applied to dissect virus-cell interactions.
Here, we investigated the coding and non-coding transcriptional landscape of lung cells infected with SARS-CoV-2 and sorted according to the expression of the viral protein spike (S). Our deep transcriptome analysis using annotated RNA genes and a reference-based RNA profiler uncovered pathways that are directly affected by infection and identified coding and non-coding genes contributing to an optimal SARS-CoV-2 replication.
Results
Transcriptional landscapes of SARS-CoV-2 infected and bystander lung cells uncover a global expression shutoff
To analyze transcriptomic changes in infected and bystander cells, human alveolar basal epithelial carcinoma cells (A549) stably expressing the viral receptor ACE2 (A549-ACE2) were infected with SARS-CoV-2 at a MOI of 1 for 24 hours, fixed, stained intracellularly using antibodies against S proteins and sorted into S-positive (infected cells, S+) and S-negative (bystander, S-) populations (Fig. 1A and 1B). Around 15% of A549-ACE2 cells were positive for S protein (Fig. 1B). Cells negative for S protein represent either uninfected cells or cells at an early stage of infection, prior to viral protein production. Mock-infected cells served as negative controls. The experiment was performed twice independently in triplicate. PolyA+ RNAs were isolated from mock-infected, S+ and S- cells. Around 85% of the total reads mapped to the viral genome in S+ cells, while less than 5% of the total reads aligned with the viral genome in S- cells (Fig. 1C), validating our sorting approach. The large dominance of viral reads over cellular reads illustrates the ability of the virus to hijack the cellular machinery for its replication. A similar proportion of SARS-CoV-2 reads in the RNA pool was previously reported in lung epithelial carcinoma Calu-3 cells infected for 8 hours [20]. These differences in the representation of viral RNA between S+ and S- cells impaired the robustness of the statistical analysis used to identify DEGs. To overcome this limitation, the samples were depleted of viral RNAs (vRNAs) using a set of oligonucleotide probes covering the entire viral genome (Fig. 1A). Following depletion, viral reads represented between 0.01% and 2.8% of the total reads in both S+ and S- cells (Fig. 1C).
Differential transcriptomic analysis of SARS-CoV-2 infected and bystander lung cells. (A) Scheme summarizing the experimental workflow. A549-ACE2 cells were infected with SARS-CoV-2 at a MOI of 1 for 24h, stained for viral S protein followed by flow cytometry sorting of productively infected (S+) and bystander (S-) cells. Total RNA from mock, S- and S+ cells was depleted of ribosomal and viral RNAs and sequenced. (B) Representative FACS plot of S protein staining used for sorting productively infected cells. (C) Percentage of reads in libraries originating from human genome or SARS-CoV-2 sequence, before and after depletion of viral reads. (D) PCA plot based on the top 500 most variable genes between mock, bystander (S-) and infected (S+) cells. (E) Volcano plots presenting distribution of classes of transcripts (mRNA-blue, lncRNA-red, unannotated-green) based on their log2 fold-change for 3 comparisons: infected cells vs mock, infected cells vs bystander and bystander vs mock. (F) Heatmaps presenting z-score of log2 normalized counts for all differentially expressed genes between mock, bystander and infected cells, separated for mRNAs, lncRNAs and unannotated RNA.
Coding and long non-coding genes were identified using the GENCODE annotation (v32), while unannotated RNAs were recovered with the Scallop assembler [36] (Fig. 1D-F, Fig. S1 and tables S1-S3). Principal component analysis (PCA) of polyA+ transcriptomes segregated S+ cells from S- and mock-infected ones (Fig. 1D). This segregation based on S expression represented around 92% of the transcriptomic differences between the samples (Fig. 1D). Only subtle differences (2.5%) distinguished bystander and mock-infected cells (Fig. 1D), suggesting that the transcriptional landscapes of these 2 cell populations were very similar. An absence of response of S- cells was unexpected since cytokines, which are commonly secreted by virally infected cells, activate an antiviral state in bystander cells through surface receptors.
(A-B) Heatmaps presenting z-score of log2 normalized counts for differentially expressed genes between (A) S+ vs S- cells or (B) S- vs mock-infected cells, separated for mRNAs, lncRNAs and unannotated RNAs. (C) MA plot showing the response to infection of an artificially reconstructed mixed cell population (80% bystander, 20% infected, left) compared to cells sorted based on the expression of the viral protein Spike (infected, middle; bystander, right).
Analysis of gene expression allowed identification of thousands of annotated coding and non-coding genes that were differentially expressed (absolute fold change ≥2, p-value < 0.05) in S+ cells as compared to S- or mock-infected ones (Fig. 1E-F and tables S1-S3). We identified around 13 times more downregulated coding genes than upregulated ones in S+ cells (Fig. 1E-F), suggesting that infection triggers a massive, but incomplete, shutoff of gene expression. Among the top upregulated coding genes in S+ cells, we confirmed candidates revealed by previous analyses performed in non-sorted non-vRNA-depleted A549-ACE2 cells, such as CXCL8, CCL20, IL6 and NFKB1 [19,38,39], but also novel highly significant candidates, including IL32 and ITGAM (table S1). The genes encoding IFN type I and type III were not significantly upregulated in S+ cells, as compared to mock-infected cells. Accordingly, ISGs were not upregulated in S+ cells either. This absence of innate immune response in infected cells agrees with previous analyses performed in mixed populations of A549-ACE2 cells infected with SARS-CoV-2 [19,38,39]. Such absence of innate response reflects the ability of the virus to potently inhibit the IFN response via numerous mechanisms in human cells [14]. Around 1260 annotated lncRNAs were downregulated in S+ cells as compared to mock-infected cells, and 184 were upregulated (Fig. 1E-F and table S2). RFPL3S, ADIRF-AS1 and WAKMAR2 were among the top 15 upregulated lncRNAs in S+ cells. RFPL3S and ADIRF-AS1 have no known functions, whereas WAKMAR2 restricts NF-κB-induced production of inflammatory chemokines in human keratinocytes [40]. Among the top downregulated lncRNAs, we identified HOXA-AS2 and NKILA, which are negative regulators of NF-κB signaling in endothelial cells and breast cancer cell lines, respectively [41,42]. Altered expression of WAKMAR2, HOXA-AS2 and NKILA in infected cells could thus play a role in viral-associated inflammation.
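The thresholds quoted above (absolute fold change ≥2, p-value < 0.05) can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function name `classify_gene` and the example values are hypothetical, and the sketch assumes that fold changes are expressed on a log2 scale, so that a 2-fold change corresponds to an absolute log2 fold change of 1.

```python
# Thresholds quoted in the text: absolute fold change >= 2 and p-value < 0.05.
MIN_ABS_LOG2_FC = 1.0  # log2(2), assuming fold changes are on a log2 scale
MAX_P_VALUE = 0.05


def classify_gene(log2_fold_change, p_value):
    """Return 'up', 'down' or 'unchanged' for a single gene."""
    if p_value >= MAX_P_VALUE:
        return "unchanged"
    if log2_fold_change >= MIN_ABS_LOG2_FC:
        return "up"
    if log2_fold_change <= -MIN_ABS_LOG2_FC:
        return "down"
    return "unchanged"


# Hypothetical (log2 fold change, p-value) pairs, for illustration only.
# These numbers are not taken from tables S1-S3.
example_genes = {
    "gene_a": (4.9, 1e-6),    # strong, significant induction
    "gene_b": (-2.3, 0.001),  # strong, significant repression
    "gene_c": (0.4, 0.0001),  # significant but below the fold-change cutoff
    "gene_d": (3.0, 0.2),     # large change but not significant
}

calls = {name: classify_gene(lfc, p) for name, (lfc, p) in example_genes.items()}
```

Both criteria must be met for a gene to be called differentially expressed, which is why `gene_c` and `gene_d` remain "unchanged" in this sketch.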
Of the 1400 unannotated transcripts that we detected using the Scallop assembler [36], around 800 were also differentially expressed in S+ cells as compared to S- ones (Fig. 1E-F and table S3).
In agreement with the PCA (Fig. 1D), volcano plots and heat maps revealed that S- bystander cells and mock-infected control cells exhibited very similar transcriptomic profiles (Fig. 1E-1F and Fig. S1A-S1B). Only around 170 polyA+ transcripts were differentially expressed in S- cells as compared to mock-infected ones (Fig. 1E and tables S1-S3). As a comparison, over 13000 DEGs were identified in S+ as compared to mock-infected cells (Fig. 1E and tables S1-S3). These analyses further suggest that S+ cells elicit little or no paracrine signaling response in S- cells. Among the 69 coding genes that were upregulated in S- cells as compared to mock-infected, 29 were also upregulated in S+ cells (Fig. S1B and table S3). Some of these common genes were inflammatory genes, such as IL32, IL6 and CCL20. Among the 39 upregulated coding genes that were unique to S- cells, 16 were ISGs (examples include MX1, APOL1 and IFI6). The expression of these inflammatory genes and ISGs in bystander cells could be induced early in infection, prior to the production of S proteins.
Our approach reveals that SARS-CoV-2 infection triggers a major shutoff of gene expression in A549-ACE2 cells. It also shows that S- cells do not exhibit a strong transcriptional signature despite being cultured with S+ cells, suggesting the absence of an efficient paracrine communication.
Separating lung cells based on the expression of the viral S protein improved discovery of DEGs
To compare our differential deep analyses with known datasets, we analyzed publicly available polyA+ RNA-seq raw data of unsorted A549-ACE2 cells infected with SARS-CoV-2 at a MOI of 0.2 [19]. Viral reads represented around 50% of the total number of reads in this unsorted bulk population of cells [19], which was, as expected, less than in A549-ACE2 cells positive for S (Fig. 1C). The 2 analyses shared 150 upregulated protein-coding genes and 238 downregulated ones (Fig. 2A, table S4). The vast majority (about 80%) of the downregulated mRNAs that we identified were classified as ‘unchanged’ in the analysis of unsorted cells (Fig. 2A, table S4). Thus, sorting cells based on S expression and depleting viral RNA allowed the identification of over 30 times more downregulated coding genes than in unsorted cells (Fig. 2A, table S4). The poor sensitivity of the analysis of mixed cell populations in detecting downregulated genes is likely due to the large proportion of non-infected cells, in which the majority of genes remained normally expressed, thus masking any decrease of gene expression in the pool of infected cells. Indeed, an artificial reconstruction of a mixed cell population (80% S- and 20% S+) supports this hypothesis (Fig. S1C). About 41% of the upregulated protein-coding genes and 16% of downregulated ones that we identified did not appear in conventional RNA-seq analysis of mixed populations [19]. This comparison highlights the accuracy and the depth of our analysis.
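The masking effect described above can be made concrete with a simple mixing model. The sketch below is an illustration under stated assumptions, not the reconstruction procedure used for Fig. S1C: it assumes that bystander cells express a gene at the mock level (fold change of 1), that every cell contributes the same amount of RNA, and that a gene is called differentially expressed when its absolute log2 fold change reaches 1. The function name `mixed_log2_fold_change` and the fold-change values are hypothetical.

```python
import math


def mixed_log2_fold_change(fold_change_infected, infected_fraction):
    """Apparent log2 fold change (vs mock) measured in an unsorted population.

    Assumes bystander cells express the gene at the mock level (fold change 1)
    and that every cell contributes the same amount of RNA to the pool.
    """
    mixed = (1.0 - infected_fraction) * 1.0 + infected_fraction * fold_change_infected
    return math.log2(mixed)


# Hypothetical genes in a population containing 20% infected cells.
down = mixed_log2_fold_change(0.1, 0.2)     # 10-fold repression in infected cells
silenced = mixed_log2_fold_change(0.0, 0.2)  # complete shutoff in infected cells
up = mixed_log2_fold_change(8.0, 0.2)        # 8-fold induction in infected cells

# Under these assumptions, repression never reaches the 2-fold cutoff
# (log2 fold change <= -1), whereas a strong induction still does.
down_detected = down <= -1.0
silenced_detected = silenced <= -1.0
up_detected = up >= 1.0
```

In this simplified model, even a gene that is completely silenced in infected cells appears only 20% lower in the mixed population, while strongly induced genes remain detectable. This asymmetry is consistent with downregulated genes being the class most affected by the analysis of unsorted cells.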
Separating lung cells based on the expression of the viral S protein improved discovery of DEGs. (A) Venn diagram representing gene overlap between DE-seq from sorted vs mock samples and mixed vs mock data re-analyzed from Blanco-Melo et al. 2020 (MOI of 0.2). The genes were defined as upregulated if log2 fold change was equal to or above 1 (right panel) and as downregulated if it was equal to or below −1 (left panel). Genes were defined as expressed when they were represented by at least 10 normalized reads in each replicate. The solid lines and central overlap show the genes that appear in both datasets while dashed gray zones outline genes detected in only one of the two datasets. (B) RT-qPCR quantification of viral genome copy number per μg of total RNA extracted from A549-ACE2 cells infected by SARS-CoV-2 at a MOI of 1, analyzed either in bulk (left side of graph, n=2 independent experiments, line at mean) or post-sorting based on Spike protein expression, allowing distinction between productively infected and bystander subpopulations (right side of graph, n=3 experiments, line at mean). (C-D) RT-qPCR quantification of mRNA, lncRNA, and unannotated RNA that were identified as upregulated (C) or downregulated (D) upon infection by SARS-CoV-2 in the RNA-seq analysis, in total RNA extracted from A549-ACE2 cells infected with SARS-CoV-2 at a MOI of 1, analyzed either in bulk (left side of graph) or post-sorting based on Spike protein expression (right side of graph, normalized fold change over mock-infected, n=3 independent experiments, ratio-paired t test, line at mean ± SEM). (E) Top 10 enriched GO terms for Biological Process (BP) and KEGG pathways from DAVID database ranked by the adjusted p-value (Benjamini), for upregulated mRNAs identified in RNA-Seq comparison between infected vs mock cells.
To validate the sorting approach combined with vRNA-depletion, we compared RNA abundances of a few DEGs in a bulk population of cells infected with SARS-CoV-2 for 24 hours, as well as in sorted S+ and bystander S- cells infected in the same condition. As expected, S+ cells produced approximately 200-fold more intracellular viral RNAs than did S- cells (Fig. 2B). These qPCR analyses confirm that some S- cells are at an early stage of viral replication, prior to viral protein expression (Fig. 1B). We included in the analysis three coding transcripts (IL32, ITGAM and TRAF1), two lncRNAs (WAKMAR2 and AL132990.1) and one unannotated transcript (XLOC_007519) that were identified amongst the most upregulated RNAs in S+ A549-ACE2 cells (Fig. 2C, S2A and tables S1-S3). The abundance of IL32 mRNA did not increase significantly in the infected bulk population, as compared to mock-infected cells (Fig. 2C). By contrast, IL32 mRNA levels increased around 30-fold in S+ cells, compared to those in mock-infected cells (Fig. 2C). This difference explains why IL32 was not identified as an upregulated gene in previous RNA-seq analyses performed in mixed populations of infected A549-ACE2 cells [19,38]. Similarly, the expression of ITGAM, TRAF1, WAKMAR2, AL132990.1 and XLOC_007519 showed a modest increase in RNA abundance in the bulk population and a significant increase in S+ cells, as compared to mock-infected cells (Fig. 2C and S2B). The decreased expression of transcripts identified as top downregulated hits in the RNA-seq analysis of S+ cells, such as the coding transcripts FEN1 and SNRPF, the lncRNAs AC016747.1, DANCR and TP53TG1, as well as the unannotated RNA XLOC_049236 (Fig. S2A), was significantly more pronounced in S+ cells than in the mixed population of cells, when compared to mock-infected cells (Fig. 2D and S2C).
Analysis of RNA abundances in sorted cells thus highlighted the increased accuracy of our approach, compared to classical methods, in detecting up- and down-regulated genes.
(A) Visualization of read coverage (tag/nucleotide) from polyA+ RNA-seq normalized on ERCC reads for IL32, WAKMAR2, FEN1 and AC016747.1. (B-C) RT-qPCR quantification of mRNAs and lncRNAs that are either upregulated (B) or downregulated (C) upon infection with SARS-CoV-2, in total RNA extracted from A549-ACE2 cells infected at an MOI of 1, analyzed either in bulk (left side of graph) or post sorting based on Spike protein (right side of graph, normalized fold change over mock-infected, n=3 independent experiments, ratio-paired t test, line at mean ± SEM).
To identify pathways affected by infection in A549-ACE2 cells, we performed Gene Ontology (GO) terms and KEGG pathway enrichment analysis on the upregulated coding genes in S+ cells, as compared to mock-infected cells (Fig. 2E). We observed a significant enrichment in several inflammatory signaling pathways, including TNF and NF-κB signatures, which were previously identified in bulk transcriptomic analysis of infected A549 and Calu-3 cells [38,39] and in scRNA-seq analysis of infected colon and ileum organoids [24]. Members of the superfamily of TNF proteins are multifunctional proinflammatory cytokines. NF-κB plays an important role in promoting inflammation, as well as regulating cell proliferation and survival [43]. Activation of NF-κB is one of the signals transduced by the TNF-superfamily members [44]. These inflammatory signatures are also consistent with those observed in peripheral blood immune cells of severe or critical COVID-19 patients [18].
Inflammatory cytokines, but not IFNs, are produced and secreted by infected cells
We wondered whether the underwhelming response of the bystander S- cell population could be explained by a defect in paracrine communication between S+ and S- cells. Despite being present in high abundance in S+ cells as compared to mock-infected cells (table S1 and Fig. 2E), inflammatory cytokine transcripts may not be translated. Indeed, initiation of translation seems to be impaired in SARS-CoV-2 infected cells via two potential mechanisms: acceleration of cytosolic cellular mRNA degradation [20] and blockade of the mRNA entry channel of ribosomes by the viral protein Nsp1 [45–47]. Moreover, viral proteins Nsp8 and Nsp9 disrupt protein secretion in HEK293T cells [45], raising the possibility that cytokines are produced but not secreted by S+ cells.
To investigate these possibilities, we selected 5 inflammatory chemokines (IL-6, CXCL1, CCL2, CXCL8/IL-8 and CCL20) whose expression was upregulated in S+ cells upon infection (table S1) and quantified their intracellular and secreted levels in lysates and supernatants of A549-ACE2 cells infected for 24 hours (Fig. 3). As a comparison, A549-ACE2 cells transfected with the immuno-stimulant poly(I:C) were included in the analysis. In a mixed population of S+ and S- cells, mRNAs of these 5 cytokines were significantly more abundant than in mock-infected cells (Fig. S3A), in agreement with the increased levels of mRNAs detected in S+ cells by RNA-seq (table S1). Their expression was also induced by poly(I:C) (Fig. S3A). All five cytokines were expressed at detectable levels in cells stimulated by viral infection or poly(I:C) (Fig. 3A), indicating that infection does not hamper the translation of the corresponding mRNAs. As expected, based on their mRNA abundance (Fig. S3A), intracellular levels of IL-6, CCL2 and CXCL8 significantly increased upon poly(I:C) stimulation as compared to unstimulated control cells (Fig. 3A). By contrast, despite being induced by poly(I:C) downstream signaling (Fig. S3A), CXCL1 and CCL20 levels were comparable in stimulated and unstimulated cells (Fig. 3A). This could be due to a short protein half-life, protein degradation and/or rapid secretion. Intracellular levels of CXCL1 increased significantly upon infection compared to mock-infected cells (Fig. 3A) while intracellular levels of IL-6, CCL2, CXCL8 and CCL20 were similar in both conditions. However, all 5 cytokines were significantly more secreted by infected cells than mock-infected ones (Fig. 3B). Infected cells secreted even more IL-6 and CXCL1 than cells stimulated by poly(I:C) (Fig. 3B). 
Thus, inflammatory cytokines are expressed and secreted by A549-ACE2 cells infected with SARS-CoV-2, which is in line with the excessive inflammatory response reported in other cellular models [19,21,24,26,39] and characteristic of severe cases of COVID-19 [4–6]. The absence of paracrine communication that was revealed by the RNA-seq analysis of S- cells (Fig. 1) is thus unlikely to be linked to a defect in cytokine expression and secretion in S+ cells.
(A) RT-qPCR quantification of transcript induction of indicated chemokines in A549-ACE2 cells, 24 hours post infection with SARS-CoV-2 (MOI of 1) or post-treatment with transfectant alone or in combination with 10 ng/μL of Poly(I:C) (normalized fold change over mock-infected, n=3 independent experiments, ratio-paired t test, line at mean ± SEM). (B) RT-qPCR quantification of IFNβ, IFNλ1 and IFNλ2/3 transcripts induction in A549-ACE2 cells, 24 hours post infection at an MOI of 1 with SARS-CoV-2 or Measles virus expressing GFP (MeV) or post treatment with transfectant alone or in combination with 10 ng/μL of Poly(I:C) (normalized fold change over mock-infected, n=3 independent experiments, line at mean ± SEM).
Inflammatory cytokines, but not IFNs, are produced and secreted by infected cells. (A) Quantification of the indicated chemokines by cytometry bead array in A549-ACE2 cell lysates obtained 24 hours post mock-infection or infection with SARS-CoV-2 at a MOI of 1, or post-treatment with transfectant alone or in combination with 10 ng/μL of Poly(I:C) (n=3 independent experiments, paired One-Way ANOVA with Tukey’s post-test, line at mean ± SEM). (B) Quantification of the indicated chemokines by cytometry bead array in supernatant (SN) of cells shown in (A) (n=3 independent experiments, paired One-Way ANOVA with Tukey’s post-test, line at mean ± SEM). (C) Percentages of infected A549-ACE2 cells 24 hours post infection (MOI of 1) with SARS-CoV-2 or Measles virus expressing GFP (MeV), quantified by flow cytometry using Spike protein staining and GFP expression, respectively (n=3 independent experiments, line at mean ± SEM). (D) Quantification of secretion of IFNβ, IFNλ1 and IFNλ2/3 by cytometry bead arrays in supernatant of A549-ACE2 cells 24 hours post-infection with SARS-CoV-2 or MeV (MOI of 1), or post-treatment with transfectant alone or in combination with 10 ng/μL of Poly(I:C) (n=3 independent experiments, One-Way ANOVA with Šídák’s post-test, line at mean ± SEM).
ULOD: Upper Limit of Detection; LLOD: Lower Limit of Detection
Consistent with prior RNA-seq studies conducted in bulk A549-ACE2 cells [19,38,39], we failed to observe a significant IFN-I and III signature in S+ cells (table S1 and Fig. 2E), despite a robust induction of NF-κB activity (Fig. 2E). To validate this further, we compared the level of IFNβ, IFN-λ1 and IFN-λ2/3 transcripts in A549-ACE2 cells infected for 24 hours. Cells treated with poly(I:C) were used as positive controls for IFN production. Cells infected with Measles virus (MeV), a respiratory RNA virus known to trigger an IFN response in A549 cells [48], were also included in the analysis for comparison. Flow cytometry analysis identified on average 20 to 30% of cells positive for viral proteins upon SARS-CoV-2 or MeV infection (Fig. 3C). As expected, the level of IFNβ, IFN-λ1 and IFN-λ2/3 transcripts increased in poly(I:C)-treated cells compared to cells exposed to the transfecting reagent lipofectamine only (Fig. S3B). Amounts of IFNβ, IFN-λ1 and IFN-λ2/3 transcripts were several orders of magnitude higher in MeV-infected cells than in SARS-CoV-2 infected cells (Fig. S3B). Consistent with the mRNA level analysis (Fig. S3B), around 200 and 850 pg/ml of IFNβ were secreted by MeV-infected cells and poly(I:C)-treated cells, respectively (Fig. 3D). SARS-CoV-2 infected cells secreted as little as 50 pg/ml of IFNβ, which was similar to the quantity secreted by mock-infected cells and lipofectamine-exposed cells, likely representing baseline levels (Fig. 3D). MeV-infected cells secreted around 1000 pg/ml of IFN-λ1 and 5000 pg/ml of IFN-λ2/3 while no IFN-λ was detected in the supernatant of SARS-CoV-2 infected cells (Fig. 3D). This baseline level of IFN type-I secretion and absence of IFN type-III release by SARS-CoV-2-infected cells is likely to be responsible for the lack of paracrine signaling revealed by the RNA-seq analysis (Fig. 1).
Upregulated NF-κB target genes contribute to an optimal SARS-CoV-2 replication
Numerous genes associated with the NF-κB signaling pathway fall into the category of genes that escaped the virus-induced cellular shutoff (Fig. 2E). To determine which of these genes were directly controlled by NF-κB, we cross-compared the upregulated genes with known NF-κB target genes. Among the 68 upregulated NF-κB targets in S+ cells, we identified cytokines such as CXCL8/IL8 and IL32 (Fig. 4A, S4A and table S5). NFKB1, which codes for the p105/p50 subunit of the transcription factor and is itself an NF-κB target gene [49,50], also showed a significant transcriptional induction in S+ cells (Fig. 4A, S4A and S4B). Such a mechanism generates an auto-regulatory feedback loop in the NF-κB response [49]. To identify NF-κB-driven lncRNAs, we analyzed NF-κB chromatin immunoprecipitation (ChIP)-sequencing data generated in A549 cells stimulated with TNF-α [51] and searched for known NF-κB binding motifs [103]. The analysis recovered 15 NF-κB targets among the 184 upregulated lncRNAs in S+ cells (Fig. 4A and table S5), including PACERR and ADIRF-AS1. In U937 macrophages, PACERR modulates the expression of NF-κB target genes via a direct interaction with the NF-κB subunit p50 [52]. ADIRF-AS1 is an antisense lncRNA with no known function. Novel NF-κB target genes were also identified among unannotated genes (Fig. 4A and table S5).
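The cross-comparison between upregulated genes and known NF-κB targets amounts to a set intersection. The following sketch only illustrates that operation and is not the analysis code used in the study: both gene lists are short placeholders, `GENE_X` is a hypothetical entry, and the actual lists are those reported in tables S1 and S5.

```python
# Placeholder subset of genes upregulated in S+ cells (illustration only).
upregulated_in_s_plus = {"CXCL8", "IL32", "NFKB1", "ITGAM"}

# Placeholder list of known NF-kB target genes; GENE_X is hypothetical.
known_nfkb_targets = {"CXCL8", "IL32", "NFKB1", "GENE_X"}

# Genes that are both upregulated and known NF-kB targets.
upregulated_nfkb_targets = sorted(upregulated_in_s_plus & known_nfkb_targets)

# Upregulated genes without a known NF-kB annotation in this toy list.
other_upregulated = sorted(upregulated_in_s_plus - known_nfkb_targets)
```

For lncRNAs and unannotated transcripts, for which no curated target list is described in the text, the equivalent filter would rely on ChIP-seq peaks and binding motifs rather than on a gene-name lookup.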
(A) Visualization of read coverage (tag/nucleotide) from polyA+ RNA-seq and ChIP-seq (IP – Input) at NFKB1, CXCL8, IL32 and ADIRF-AS1 loci. RNA-seq and ChIP-seq data were normalized independently, on ERCC reads for RNA-seq and on library size for ChIP-seq. (B) RT-qPCR quantification of NFKB1, CXCL8 and ADIRF-AS1 transcripts induction in A549-ACE2 cells, 24 hours post infection with SARS-CoV-2 (MOI of 1) (normalized fold change over mock-infected, n=3 independent experiments, ratio-paired t test, line at mean ± SEM).
Upregulated NF-κB target genes contribute to an optimal SARS-CoV-2 replication. (A) Volcano plot presenting log2 fold change of RNA expression from RNA-seq analysis between S+ and mock cells and showing known NF-κB target mRNAs (labeled in blue), as well as NF-κB target lncRNAs (red) and unannotated RNAs (green) predicted from ChIP and motif analysis. (B) RT-qPCR quantification of knock-down efficiency of indicated transcripts in A549-ACE2 cells, 48 hours post-transfection with a pool of siRNAs targeting indicated genes (normalized fold change over control siRNA, n=3 independent experiments, ratio-paired t test, line at mean ± SEM). (C) RT-qPCR quantification of viral genome copy number per μg of total RNA extracted from A549-ACE2 cells, with indicated genes knocked down, 24 hours after infection with SARS-CoV-2 (MOI of 1) (n=3 independent experiments, one-way ANOVA with Dunnett’s post-test, line at mean ± SEM). (D) Percentages of infected A549-ACE2 cells, with selected genes knocked down, 24 hours post infection with SARS-CoV-2 (MOI of 1), quantified by flow cytometry using Spike protein staining (n=3 independent experiments, mixed model one-way ANOVA with Dunnett’s post-test, line at mean ± SEM).
Among the top upregulated NF-κB target genes identified in S+ cells (Fig. 4A), we selected NFKB1, CXCL8/IL8, IL32 and ADIRF-AS1 for functional analysis. NFKB1 served as a positive control in these experiments since reducing its expression was previously shown to decrease SARS-CoV-2 protein expression in A549-ACE2 cells [38]. This earlier result was unexpected since NF-κB commonly acts as an antiviral factor [43]. Analysis of mRNA abundances showed a significant transcriptional induction of NFKB1, CXCL8/IL8 and ADIRF-AS1 in a bulk population of A549-ACE2 cells infected by SARS-CoV-2 for 24 hours, as compared to mock-infected cells (Fig. S4B), validating the RNA-seq analysis performed on S+ cells (Fig. 1). We had previously confirmed that IL32 transcripts were significantly more abundant in S+ cells than in mock-infected cells (Fig. 2C). We explored the potential ability of NFKB1, CXCL8, IL32 and ADIRF-AS1 to modulate the replication of SARS-CoV-2 using siRNA-mediated knock-down approaches. Twenty-four hours post-infection, intracellular viral RNA production was quantified by RT-qPCR and the number of cells positive for the viral protein S was assessed by flow cytometry analysis. RT-qPCR analyses revealed that the siRNA pools efficiently reduced the expression of their respective targets in A549-ACE2 cells (Fig. 4B). Reduced expression of NFKB1, CXCL8, IL32 and ADIRF-AS1 significantly decreased both the viral RNA yield and the number of infected cells, as compared to cells transfected with control siRNA pools (Fig. 4C and 4D). These results confirmed the pro-SARS-CoV-2 activity of NFKB1 in A549-ACE2 cells [38] and revealed that CXCL8, IL32 and ADIRF-AS1 also exhibited significant proviral functions.
Thus, our sorting approaches identified coding and non-coding genes that contribute to an optimal SARS-CoV-2 replication.
Discussion
Transcriptomic analysis of lung A549-ACE2 cells sorted based on Spike expression permitted deep sequencing of many cells synchronized for viral protein expression. Depletion of viral RNA from the samples prior to RNA-seq allowed for a robust identification of host cell DEGs. Our approach thus unveiled an accurate and comprehensive picture of genome-wide signaling networks that are directly affected by SARS-CoV-2 replication in human lung cells. It reveals a massive, but somewhat selective, gene expression shutoff in S+ cells. Such reduction of cellular transcripts was underestimated in analyses performed on bulk populations of infected A549-ACE2 cells [19,38,39] but was detected by RNA-seq analysis performed on a bulk population of Calu-3 cells infected at a high MOI [20]. This is probably due to the fact that Calu-3 cells express high levels of ACE2 [53] and are thus naturally permissive to SARS-CoV-2, ensuring a high proportion of infected cells in the mixed culture. SARS-CoV-2 employs several strategies to decrease the level of cellular mRNAs in infected cells, including inhibition of nuclear mRNA export [20,45] and accelerated mRNA degradation as compared to control cells [20]. SARS-CoV and SARS-CoV-2 Nsp1 largely contribute to these processes by interacting with the mRNA export machinery [54] and by inducing endonucleolytic cleavage of the 5’ UTR of capped mRNAs bound to 40S ribosomes [20,55–57]. SARS-CoV-2 RNAs are protected from Nsp1-mediated degradation by their 5’ end leader sequence [20,58], which explains why we observed, in agreement with previous studies performed in A549-ACE2 cells [19] and Calu-3 cells [20], a large dominance of viral RNA over the cellular RNA pool at 24 hpi.
One consequence of this drastic shutoff is the suppression of expression of innate immune genes, such as IFN type I and type III. In agreement with previous RNA-seq studies performed in bulk populations of infected A549-ACE2 cells [19,38] and kidney HEK293T-ACE2 cells [59], our transcriptomic profiling combined with analysis of mRNA levels and IFN secretion showed that infected cells failed to mount an antiviral response. Besides global gene expression reduction in host cells, SARS-CoV-2 has evolved numerous mechanisms to specifically counteract the IFN induction and signaling pathways [14]. For instance, the viral proteins Nsp6 and Nsp13 bind and block the ability of TANK binding kinase 1 (TBK1) to phosphorylate IRF3 [13] and several viral proteins, including the N and Orf6 proteins, dampen STAT1/2 phosphorylation or nuclear translocation [13,60,61]. Consistent with an absence of IFN secretion by S+ cells and, consequently, a poor paracrine response, the transcriptome of bystander S- cells largely overlapped with that of mock-infected cells. However, a small subset of ISGs underwent modest transcriptional induction in bystander cells. They may be induced during an early stage of viral replication, prior to the production of viral proteins that antagonize IFN signaling, or by the minute amount of type I IFNs that was secreted by S+ cells. Absence of IFN response is not, however, a universal feature of SARS-CoV-2 infection. Viral replication induces a type I and III IFN response in Calu-3 cells [62–65], primary airway epithelia cultured at the air-liquid interface [62,64], human intestinal epithelial cells [66], organoid-derived bronchioalveolar models [67] and intestinal organoids [68]. When infected at a high MOI, A549-ACE2 cells also induced expression of IFN and ISGs [19]. Thus, in vitro, the magnitude of the IFN response elicited by SARS-CoV-2 is cell-type specific and dependent on the viral load.
Interestingly, RNA-seq analysis of postmortem lung tissues from lethal cases of COVID-19 failed to detect IFN-I or IFN-III [19]. Type I IFN responses were highly impaired in peripheral white blood cells of patients with severe or critical COVID-19, as indicated by transcriptional analysis [18]. Moreover, infected patients had no detectable circulating IFN-β, independently of the severity of the disease [18]. Thus, our results corroborate these clinical studies highlighting the efficient shutdown of IFN production by the virus.
Although our RNA-seq analysis identified over 12000 host transcripts that were significantly reduced during SARS-CoV-2 infection as compared to control cells, it also recovered around 1500 transcripts whose levels were significantly elevated and 2800 transcripts whose levels were unchanged upon infection. Among top upregulated genes in S+ cells, we identified numerous proinflammatory cytokines, such as IL6, CXCL1, CCL2, IL8/CXCL8 and CCL20. ELISA analysis confirmed that infected cells were producing these inflammatory cytokines. These cytokines were previously identified as upregulated in bulk or scRNA-seq analyses of A549-ACE2 cells [19,38,39], while others, such as IL32, were underreported. High levels of proinflammatory cytokine transcripts have also been reported in infected primary bronchial cells [19], in lung macrophages [27] and in post-mortem lung samples of COVID-19-positive patients [19]. Thus, SARS-CoV-2 appears to selectively inhibit IFN signaling while allowing chemokine production in lung cells.
GO and KEGG pathway analyses confirmed the upregulation of an inflammatory response in S+ cells, including TNF- and NF-κB-transcriptional signatures. An NF-κB transcriptional footprint was previously identified in RNA-seq analysis of bulk populations of SARS-CoV-2-infected tracheal-bronchial epithelial cells [26] and in scRNA-seq analysis of infected A549-ACE2 cells [38]. Microarray analysis of Calu-3 cells infected with SARS-CoV-2 also showed a specific bias towards an NF-κB-mediated inflammatory response [39]. Finally, inflammatory genes specifically upregulated in peripheral blood immune cells of patients with severe or critical COVID-19 mainly belonged to the NF-κB pathway [18]. Consistently, among the 741 upregulated protein-coding genes that we identified in S+ cells, 68 possess an NF-κB binding site in their promoter regions. Examples include IL6, CXCL8/IL8 and IL32. We also identified NF-κB binding sites in the promoter regions of lncRNAs that were upregulated in S+ cells, such as ADIRF-AS1 and PACERR. NF-κB contribution to the antiviral response is well described and is supported by numerous in vivo experiments showing that mice deficient in different NF-κB subunits are more susceptible to viral infection than wild-type mice [43]. Consistently, many viruses have evolved strategies to counteract the NF-κB-mediated antiviral response [69]. However, certain human viruses, such as HIV-1, Epstein-Barr virus and influenza A virus, activate NF-κB to block apoptosis and prolong survival of the host cell to gain time for replication [70]. Our data show that disruption of NF-κB function through silencing of its subunit p105/p50 diminished the production of viral RNAs and proteins at 24 hpi in A549-ACE2 cells, confirming its proviral role [38,39]. Several SARS-CoV-2 proteins could contribute to the activation of NF-κB signaling in infected cells.
When individually expressed, Orf7a and Nsp14 activate the NF-κB signaling pathway and induce cytokine expression in HeLa and HEK293T cells, respectively [71,72]. Nsp5 also induces the expression of several inflammatory cytokines, such as IL-6 and TNF-α, through activation of NF-κB in Calu-3 and THP1 cells [73]. Further studies are required to understand how SARS-CoV-2 benefits from hijacking NF-κB-driven functions.
Consistent with a proviral role of NF-κB in the context of SARS-CoV-2 infection, we found that diminished expression of three NF-κB target genes (IL32, CXCL8/IL8, and ADIRF-AS1) significantly decreased viral RNA and protein production. IL32 is a proinflammatory interleukin secreted by immune and non-immune cells that induces the expression of other inflammatory cytokines, including TNF-α, IL6, and IL1β [74]. IL32 was previously described as an antiviral factor in the context of infection with several RNA and DNA viruses. For instance, its secretory isoform reduces the replication of Hepatitis B virus by stimulating the expression of IFN-λ1 [75]. Its antiviral activity was also demonstrated in U1 macrophages infected with HIV-1 [76] and canine kidney cells infected with influenza A [77], using silencing and over-expression approaches, respectively. Further studies are required to understand the pro-SARS-CoV-2 function of endogenous IL32. It may support SARS-CoV-2 replication via its ability to activate NF-κB [78]. CXCL8/IL8 is a potent neutrophil chemotactic factor. It was previously shown to possess proviral functions in the context of infection by several unrelated RNA and DNA viruses, probably via inhibition of the antiviral action of IFN-α [79,80]. It could act in a similar manner in SARS-CoV-2 infected A549-ACE2 cells.
As for coding genes, there was a higher proportion of down- versus up-regulated lncRNAs in S+ cells. GO cannot be extrapolated from lncRNAs since most of them have no known function, indicating the need for future studies in this area. Several RNA-seq and microarray studies have identified hundreds of lncRNAs induced by IFN stimulation or viral infection in diverse human and mouse cell types [35,81–83]. Analysis of a handful of them has provided a glimpse of the potential regulatory impact of this class of RNAs on the IFN response itself [84] and on ISG expression [35,81,83]. However, the investigation of the precise role of individual lncRNAs in the IFN-mediated antiviral response is still in its infancy. By analyzing publicly available SARS-CoV-2-infected transcriptome data, several studies recovered lncRNAs that were misregulated upon infection of human lung epithelial cell lines, primary normal human bronchial epithelial cells and BALF [85–88]. However, no lncRNA with a direct action on the life cycle of SARS-CoV-2 has been identified prior to this study. We show that the lncRNA ADIRF-AS1, which was among the top upregulated lncRNAs both in S+ cells and in a dataset that we re-analyzed [19], has a proviral function. We identified an NF-κB binding site near its promoter region. It would be interesting to understand the mechanisms by which ADIRF-AS1 enhances SARS-CoV-2 replication and whether its proviral function depends on NF-κB.
Finally, our analysis profiled about 600 differentially expressed unannotated polyA+ transcripts in S+ and bystander cells. The identification of these unannotated genes confirms that the genome is far from being well characterized. Having specific RNAs expressed in particular conditions could open the way for the identification of pro- or anti-viral genes that could be used for better prognosis of at-risk patients or for the follow-up of disease severity.
Our data suggest that the genes that are refractory to the virus-induced shutoff are proviral genes. Understanding the molecular mechanisms underlying the selectivity of the shutoff would be interesting. Since coronavirus Nsp1 induces the cleavage of the 5’UTR of capped transcripts bound to 40S ribosomes, the 5’UTR length and/or structure may affect Nsp1 binding and subsequent degradation. Alternatively, the extent of transcript reduction may be linked to their GC content and/or their lengths, which could affect the specificity of the host RNase that is presumably recruited by Nsp1. Discovering the host RNase responsible for transcript degradation in SARS-CoV-2-infected cells will shed light on the mechanism of selectivity of the virus-induced shutoff.
Material and Methods
Cell lines
Human lung epithelial A549-ACE2 cells, which have been modified to stably express ACE2 via lentiviral transduction, were generated in the laboratory of Pr. Olivier Schwartz (Institut Pasteur, Paris, France). A549-ACE2 and African green monkey Vero E6 cells (ATCC CRL-1586) were cultured in high-glucose DMEM media (Gibco), supplemented with 10% fetal bovine serum (FBS; Sigma) and 1% penicillin-streptomycin (P/S; Gibco). Cells were maintained at 37°C in a humidified atmosphere with 5% CO2.
Virus and infections
Experiments with SARS-CoV-2 isolates were performed in a BSL-3 laboratory, following safety and security protocols approved by the risk prevention service of Institut Pasteur. The strain BetaCoV/France/IDF0372/2020 was supplied by the National Reference Centre for Respiratory Viruses hosted by Institut Pasteur (Paris, France) and headed by Pr. S. van Der Werf. The human sample from which the strain was isolated was provided by Dr. X. Lescure and Pr. Y. Yazdanpanah from the Bichat Hospital, Paris, France. Viral stocks were produced by amplification on Vero E6 cells, for 72 h in DMEM 2% FBS. The cleared supernatant was stored at −80°C and titrated on Vero E6 cells by using standard plaque assays to measure plaque-forming units per ml (PFU/ml). A549-ACE2 cells were infected at an MOI of 1 in DMEM without FBS. After 2 h, DMEM with 5% FBS was added to the cells. The Measles Schwarz strain expressing GFP (MeV-GFP) was described previously [89] and was used at an MOI of 1.
Poly I:C stimulation
Cells were stimulated with 10 ng/μL Poly(I:C) (HMW, #vac-pic Invivogen) using Lipofectamine 3000 Reagent (Thermo Fisher Scientific) according to manufacturer’s protocol. Treatment was maintained for 24 hours, concomitantly with infection.
Flow cytometry
Cells were detached with trypsin, washed with PBS and fixed in 4% PFA for 30 min at 4°C. Intracellular staining was performed in PBS, 2% BSA, 2mM EDTA and 0.1% Saponin (FACS buffer). Cells were incubated with antibodies recognizing the spike protein of SARS-CoV-2 (anti-S2 H2 162, a kind gift from Dr. Hugo Mouquet, Institut Pasteur, Paris, France) and subsequently with secondary anti-human AlexaFluor-647 antibody (1:1000, A21455 Thermo) for 30 min at 4°C. Data were acquired using Attune NxT Acoustic Focusing Cytometer (Thermo Fisher) and analyzed using FlowJo software.
SARS-CoV-2 infected and bystander cell-sorting and RNA extraction on fixed samples for RNA-seq
A549-ACE2 cells were seeded the day prior to infection. Cells were infected with SARS-CoV-2 at an MOI of 1 or mock infected. Infections were done in two independent repeats with three technical replicates each. At 24 h post infection, cells were detached with trypsin, fixed in 4% PFA for 30 min on ice and stained for spike protein as described above for flow cytometry, with RNasin added to FACS buffer (1:100 dilution) just before use to prevent RNA degradation. Infected cell samples were resuspended in PBS 2%, 25 mM Hepes, 5 mM EDTA (sorting buffer) and sorted at 4°C on a FACSAria Fusion4L Sorter into infected (presence of S protein expression) and bystander (absence of viral protein expression) cell populations. Cells were collected in FBS-coated tubes containing buffer with RNasin to minimize RNA degradation. After sorting, cells were pelleted at 500g for 5 min at 4°C and RNA was extracted with the RecoverAll Total Nucleic Acid Isolation Kit starting at the protease digestion step. Digestion was performed for 15 min at 50°C and 15 min at 80°C in the presence of RNasin. Extraction was performed according to the manufacturer’s instructions, with RNasin added to all buffers just before use, until final elution of RNA in DNase-free water. Residual DNA was further digested using DNase I (Invitrogen AM1906). RNAs were stored at −80°C until further analysis.
Library preparation, viral RNA depletion and RNA-sequencing
Between 500 and 1000 ng of total RNA were depleted of SARS-CoV-2 RNA using custom-designed probes. The probes were synthesized using the NC_045512.2 Wuhan-Hu-1 complete genome reference. The design was made by Illumina and is composed of 459 probes, separated into two pools synthesized by IDT. For the SARS-CoV-2 depletion, we mixed both pools and used 1 μl of this mix per sample, replacing the Ribozero+ probes at the ribodepletion reaction step of the Illumina Stranded Total RNA prep ligation protocol. The SARS-CoV-2-depleted RNA samples were normalized to 300 ng and ERCC Spike was added as recommended in the ERCC RNA Spike-In Control Mixes User Guide. The libraries were prepared using the Illumina Stranded mRNA Prep Ligation Reference Guide.
PolyA+ RNA-sequencing analysis of sorted cells
The dataset consists of 9 paired-end libraries (150 nt), 3 replicates per condition: mock, bystander and infected cells. Adaptors were trimmed with Trim Galore v0.6.4 [90] (wrapper for cutadapt v2.10 [91] and FastQC v0.11.9 [92]), with options --stringency 5 --trim-n -q 20 --length 20 --paired --retain_unpaired. Reads were mapped to a reference containing human genome (hg38), SARS-CoV-2 (NC_045512.2) and ERCC sequences. STAR v2.7.3a [93] was used to map the reads, with default parameters. Bam files were then filtered using SAMtools v1.10 [94] to retain reads flagged as primary alignment, and with mapping quality > 30 (option -q 30 -F 0x100 -F 0x800). Read coverage was computed for each strand with bamCoverage (deepTools v3.5.0 [95]) with options --binSize 1 --skipNAs --filterRNAstrand forward/reverse. For the detection of unannotated transcripts, Scallop v0.10.5 [36] was used to reconstruct transcripts, with options --library_type first --min_transcript_coverage 2 --min_splice_bundary_hits 5 --min_flank_length 5. Scallop was run on each library, and the resulting annotations were merged using cuffmerge v1.0.0 [96], with gencode annotation (v32) as reference (-g option). Then BEDtools v2.29.2 [97] was used to retain only transcripts that are intergenic or antisense with respect to the gencode annotation. Gene expression quantification was performed using featureCounts v2.0.0 [98], with options -O -M --fraction -s 2 -p, using a merged annotation of gencode v32, SARS-CoV-2 (NC_045512.2), newly annotated transcripts and ERCC transcripts. Subsequent analyses were performed in R v3.6.2 [99]. Differential expression analysis was performed using the DESeq2 package [100], after filtering out genes with less than 10 raw counts for all replicates in at least one condition. Gene counts were normalized on ERCC counts, using the estimateSizeFactorsForMatrix function from DESeq2.
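The ERCC-based normalization can be illustrated with a minimal sketch. It assumes that size factors are obtained with the median-of-ratios estimator (to our understanding, the default behaviour of estimateSizeFactorsForMatrix) applied to spike-in rows only. The function name `size_factors` and the toy counts are hypothetical and are not taken from the study.

```python
import math
from statistics import median

def size_factors(spike_counts):
    """Median-of-ratios size factors from a spike-in count matrix.

    spike_counts: list of rows (one per spike-in transcript), each row a
    list of raw counts, one per library. Hypothetical helper, stdlib only.
    """
    n_samples = len(spike_counts[0])
    # Rows containing a zero have an undefined log geometric mean and are skipped.
    usable = [row for row in spike_counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no spike-in is detected in every library")
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy spike-in counts for three libraries (illustrative values only).
ercc = [
    [100, 200, 50],
    [10, 20, 5],
    [1000, 2000, 500],
    [0, 3, 1],  # ignored: not detected in the first library
]
sf = size_factors(ercc)

# Host gene counts are then divided by the per-library factor.
gene_counts = [400, 400, 400]
normalized = [c / f for c, f in zip(gene_counts, sf)]
```

In this toy case the second library has twice the spike-in signal of the first, so identical raw gene counts translate into half the normalized abundance. Normalizing on spike-ins rather than on host genes avoids assuming that most host transcripts are unchanged, an assumption that a global shutoff would violate.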
All pairwise comparisons were performed (mock vs infected, mock vs bystander and bystander vs infected), and genes were retained as differential if adjusted p-value was < 0.05 and log fold-change > 1 or < −1. All plots were made using custom scripts, except for heatmaps, which were generated using the pheatmap package (RRID:SCR_016418).
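The selection rule stated above can be written as a short sketch. The gene names and values are invented for illustration, `call_de` is a hypothetical helper, and genes with a missing adjusted p-value are treated as not differential (an assumption). The Methods text does not state the base of the logarithm; the Fig. 4A legend reports log2 fold change.

```python
def call_de(results, alpha=0.05, lfc_cutoff=1.0):
    """Classify genes from a {gene: (log_fold_change, adjusted_p)} mapping.

    A gene is retained as differential when adjusted p < alpha and the log
    fold-change is > lfc_cutoff (up) or < -lfc_cutoff (down).
    """
    calls = {}
    for gene, (lfc, padj) in results.items():
        if padj is None or padj >= alpha:
            calls[gene] = "not_differential"
        elif lfc > lfc_cutoff:
            calls[gene] = "up"
        elif lfc < -lfc_cutoff:
            calls[gene] = "down"
        else:
            calls[gene] = "not_differential"
    return calls

# Invented example values, not results from the study.
toy = {
    "geneA": (2.4, 0.001),    # significant and above the cutoff
    "geneB": (-3.1, 0.0004),  # significant and below the negative cutoff
    "geneC": (0.6, 0.01),     # significant but small effect
    "geneD": (1.8, 0.20),     # large effect but not significant
    "geneE": (1.5, None),     # no adjusted p-value available
}
calls = call_de(toy)
```

Only geneA and geneB pass both criteria in this example; requiring an effect-size threshold in addition to significance keeps small but statistically detectable shifts out of the differential set.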
PolyA+ RNA-sequencing analysis of bulk population of infected cells (from a public dataset)
Fastq files produced in the study of Blanco-Melo et al. (2020) [19] were retrieved from the GEO repository (GSE147507). The dataset consists of single-end libraries (150 nt). We compared A549-ACE2 “mock” cells (SRR11517680, SRR11517681 & SRR11517682) versus A549-ACE2 cells infected with SARS-CoV-2 at MOI 0.2 (SRR11517741, SRR11517742 & SRR11517743). Adaptors were trimmed with Trim Galore v0.6.4 [90], with options --stringency 5 --trim-n -q 20 --length 20. Reads were mapped on a reference containing human genome (hg38) and SARS-CoV-2 (NC_045512.2) sequence. Bam files were then filtered using SAMtools v1.10 [94] to retain reads flagged as primary alignment, and with mapping quality > 30 (option -q 30 -F 0x100 -F 0x800). Gene expression quantification was performed using featureCounts v2.0.0 [98], with options -O -M --fraction -s 2, using a merged annotation of gencode v32, SARS-CoV-2 (NC_045512.2) and newly annotated transcripts. Gene counts were normalized on the full count matrix, using the estimateSizeFactorsForMatrix function from DESeq2 [100]. Differential analysis was performed as described above.
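The alignment filter applied to both datasets can be illustrated on plain SAM text. The sketch below is not the SAMtools implementation: it mimics the stated options by excluding records whose FLAG has the secondary (0x100) or supplementary (0x800) bit set, treating the two -F options as one combined mask (an assumption), and by applying a mapping-quality threshold. The function names are hypothetical.

```python
SECONDARY = 0x100       # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800   # SAM flag bit: supplementary alignment
EXCLUDE_MASK = SECONDARY | SUPPLEMENTARY

def keep_alignment(flag, mapq, min_mapq=30):
    """Return True for primary alignments that reach the MAPQ threshold.

    SAMtools documents -q as skipping alignments with MAPQ smaller than the
    given value, so the threshold is applied here as mapq >= min_mapq.
    """
    return (flag & EXCLUDE_MASK) == 0 and mapq >= min_mapq

def filter_sam_lines(lines):
    """Yield header lines unchanged and alignment lines that pass the filter."""
    for line in lines:
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])  # SAM columns 2 and 5
        if keep_alignment(flag, mapq):
            yield line

# Minimal SAM-like records (11 mandatory columns, invented reads).
records = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "r1\t355\tchr2\t500\t60\t50M\t=\t700\t250\t*\t*",   # 355 = 99 + 0x100
    "r2\t2147\tchr1\t900\t60\t20M\t=\t950\t70\t*\t*",   # 2147 = 99 + 0x800
    "r3\t99\tchr1\t100\t12\t50M\t=\t300\t250\t*\t*",    # low MAPQ
]
kept = list(filter_sam_lines(records))
```

Only the header and the first alignment survive in this example, which is the intended effect: each read contributes at most one confidently placed alignment to the counts.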
GO enrichment analysis
The GO enrichment and KEGG pathway analysis were performed using DAVID online tool (updated version 2021) [101,102]. Upregulated protein-coding genes from each comparison were taken for the analysis with default background for Homo sapiens. GOTERM_BP_DIRECT and KEGG pathway were retained and top 10 results based on adjusted p-value (Benjamini) were plotted using ggplot2 R package (v 3.3.0).
Identification of NF-κB target genes
A list of coding genes that are known targets of NF-κB is available on Gilmore’s laboratory website (https://www.bu.edu/nf-kb/gene-resources/target-genes). We selected genes from this list that were shown to be direct targets of NF-κB, and for which the gene symbol could be retrieved in gencode annotation (354 genes). For identifying lncRNAs and unreferenced RNAs that possess an NF-κB binding site in their promoter, we used p65 ChIP-seq data from GEO dataset GSE34329 [51] (one input file and 2 ChIP replicates; 38-nt single-end reads). Reads were mapped using bowtie2 v2.4.1 using hg38 as reference, and SAMtools was used to retain those flagged as primary alignment, with mapping quality > 30, and to remove PCR duplicates (markdup, with -r option). NF-κB binding sites were then detected using macs2 v2.2.7.1 [103], with command callpeak -t ChIP_BamFile1 ChIP_BamFile2 -c input_BamFile -f BAM -g hs -s 38 --keep-dup all. Peaks in the first decile of the −log10(qvalue) value were discarded. NF-κB motif genomic coordinates in the human genome were retrieved using EMBOSS fuzznuc v6.6 [104], using motif 5’-G(3)[AG]N[CT](3)C(2)-3’ [105], on forward and reverse strands (option -complement Y). Peaks and NF-κB motif coordinates were compared using BEDtools [97]; if a motif was contained in a peak, the motif strand was assigned to the peak. LncRNAs and unreferenced transcripts were identified as potential NF-κB targets if their promoter region (1 kb before transcript TSS) had a peak containing a motif or a peak for which the −log10(qvalue) was in the top 5%.
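As an illustration of the motif search, the pattern 5’-G(3)[AG]N[CT](3)C(2)-3’ can be expressed as a regular expression and scanned on both strands. This is a minimal stand-in for the fuzznuc step rather than a reproduction of it: N is taken to mean any of A, C, G or T, coordinates are 0-based and half-open, and the helper names and test sequences are hypothetical.

```python
import re

MOTIF_LEN = 10
# GGG, then A or G, then any base, then three pyrimidines, then CC.
# The lookahead allows overlapping occurrences to be reported.
MOTIF = re.compile(r"(?=(GGG[AG][ACGT][CT]{3}CC))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_motifs(seq):
    """Return sorted (start, end, strand) hits in forward-strand coordinates."""
    seq = seq.upper()
    hits = [(m.start(), m.start() + MOTIF_LEN, "+") for m in MOTIF.finditer(seq)]
    n = len(seq)
    for m in MOTIF.finditer(reverse_complement(seq)):
        start = n - (m.start() + MOTIF_LEN)  # map back to the forward strand
        hits.append((start, start + MOTIF_LEN, "-"))
    return sorted(hits)

def motif_in_peak(motif_hit, peak):
    """True if the motif interval lies entirely within the (start, end) peak."""
    return peak[0] <= motif_hit[0] and motif_hit[1] <= peak[1]

forward_hits = find_motifs("AAGGGACTTTCCAA")
reverse_hits = find_motifs("TTGGAAAGTCCCTT")
```

Here `forward_hits` is `[(2, 12, '+')]` and `reverse_hits` is `[(2, 12, '-')]`, since the second sequence is the reverse complement of the first. A hit is then attributed to a ChIP-seq peak only when `motif_in_peak` holds, mirroring the containment rule described above.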
RNA extraction and RT-qPCR assays
Total RNA was extracted from cells with the NucleoSpin RNA II kit (Macherey-Nagel) according to the manufacturer’s instructions. First-strand complementary DNA (cDNA) synthesis was performed with the RevertAid H Minus M-MuLV Reverse Transcriptase (Thermo Fisher Scientific) using random primers. Quantitative real-time PCR was performed on a real-time PCR system (QuantStudio 6 Flex, Applied Biosystems) with Power SYBR Green RNA-to-CT 1-Step Kit (Thermo Fisher Scientific). Data were analyzed using the 2^(−ΔΔCT) method, with all samples normalized to endogenous BPTF, whose gene expression was confirmed as homogenous across samples by RNA-seq. Genome equivalent concentrations were determined by extrapolation from a standard curve generated from serial dilutions of plasmid encoding a fragment of the RNA-dependent RNA polymerase (RdRp)-IP4 of SARS-CoV-2. Primers used for RT-qPCR analysis are given in table S6.
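The two quantification steps can be sketched as follows. The first function applies the standard ΔΔCT formula with a reference gene, and the second fits a linear standard curve (CT against log10 copy number) by ordinary least squares and inverts it for an unknown sample. All CT values and dilution points are invented for illustration, and the function names are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression as 2 to the power of minus delta-delta-CT."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_sample - delta_control))

def fit_standard_curve(log10_copies, cts):
    """Least-squares fit of CT = slope * log10(copies) + intercept."""
    n = len(cts)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_copies, cts))
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate a copy number from a CT value."""
    return 10.0 ** ((ct - intercept) / slope)

# Target CT drops by 3 cycles relative to the reference gene: 8-fold induction.
fold = fold_change_ddct(22.0, 20.0, 25.0, 20.0)

# Invented ten-fold dilution series of the standard (10^2 to 10^6 copies).
dilution_log10 = [2, 3, 4, 5, 6]
dilution_ct = [33.4, 30.1, 26.8, 23.5, 20.2]
slope, intercept = fit_standard_curve(dilution_log10, dilution_ct)
estimated_copies = copies_from_ct(28.45, slope, intercept)
```

With these toy values the fitted slope is about −3.3 cycles per ten-fold dilution and a CT of 28.45 corresponds to roughly 10^3.5 copies. Estimates are only meaningful for CT values that fall within the range covered by the dilution series.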
siRNA-mediated knockdown
A549-ACE2 cells were transfected using Lipofectamine RNAiMax (Life Technologies) with 10nM of control (#4390843, Ambion) or CXCL8 (L-004756-00, Dharmacon), NFKB1 (L-003520-00, Dharmacon), IL32 (L-015988-00, Dharmacon), ADIRF-AS1 (siTOOLs Biotech) siRNAs following the manufacturer’s instructions. 48h after transfection, cells were infected with SARS-CoV-2 for 24 h.
Chemokine and Interferon expression and secretion
Cell lysates for intracellular chemokine quantification were obtained via repeated freeze-thaw cycles at −80°C of cells suspended in media containing protease inhibitor cocktail (Roche Applied Science) and final centrifugation at 8000g to pellet debris. IL6, CXCL1, CCL2, CXCL8 and CCL20 concentrations in supernatants from control, infected or stimulated cells were measured using a custom-designed LEGENDplex Human Panel. Data were acquired on an Attune NxT Flow Cytometer (Thermo Fisher) and analyzed with LEGENDplex software (BioLegend). Similarly, IFN-β, IFN-λ1 and IFN-λ2/3 concentrations were measured in undiluted supernatants from control, infected or stimulated cells using a LEGENDplex Human Type 1/2/3 Interferon Panel assay (BioLegend) according to the manufacturer’s protocol.
Statistical analysis
Statistical parameters including the exact value of n, precision measures (as means ± SEM), statistical tests and statistical significance are reported in the figure legends. In figures, asterisks denote statistical significance: *p < 0.05, **p < 0.01, ***p < 0.005, ****p < 0.0001, and “ns” indicates not significant. Statistical analysis was performed in GraphPad Prism 9 (GraphPad Software Inc.).
Funding
This work was funded by the CNRS (NJ), Institut Pasteur (NJ), ‘Urgence COVID-19’ fundraising campaign of Institut Pasteur (NJ), ANR-DARK COVID (AM/NJ) and DIM-1-Health (NJ/AM). DF postdoctoral fellowship was supported by the DIM-1-Health from the Conseil Régional d’Ile-de-France. SMA is supported by the Pasteur-Paris University (PPU) International PhD Program. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Additional information
Table S1. Differential expression analysis of mRNAs (infected S+ vs mock, infected S+ vs bystander, and bystander vs mock).
Table S2. Differential expression analysis of lncRNAs (infected S+ vs mock, infected S+ vs bystander, and bystander vs mock).
Table S3. Differential expression analysis of unannotated RNAs (infected S+ vs mock, infected S+ vs bystander, and bystander vs mock).
Table S4. Gene overlap between DE-seq from our ‘sorted vs mock’ samples and ‘mixed vs mock’ data re-analyzed from Blanco-Melo et al. 2020 (MOI of 0.2).
Table S5. This table shows known NF-κB target mRNAs, as well as predicted NF-κB target lncRNAs and unreferenced RNAs, among upregulated RNAs in S+ vs mock cells.
Table S6. RT-qPCR primer sequences
Acknowledgments
We thank the French National Reference Centre for Respiratory Viruses hosted by Institut Pasteur (France) and headed by Pr. S. van Der Werf for providing the historical SARS-CoV-2 strain; C. Combredet and F. Tangy (Institut Pasteur) for producing and sharing MeV-GFP; H. Mouquet and C. Planchais (Institut Pasteur) for anti-S antibodies and O. Schwartz (Institut Pasteur) for the A549-ACE2 cells. We are grateful to our team member Felix Streicher for helping design the figure 2A and to all members of our laboratories for helpful discussions. We acknowledge the UTechS Immunology Platform of Institut Pasteur for the use of the cell sorter. The Biomics Platform was supported by France Génomique (ANR-10-INBS-09), IBISA and the Illumina COVID-19 Projects’ offer.