Reanalysis of MERS, SARS and COVID-19 Infection Datasets using VirOmics Playground Reveals Common Patterns in Gene and Protein Expression

Three lethal lower respiratory tract coronavirus epidemics have occurred over the past 20 years. This coincided with major developments in genome-wide gene and protein expression analysis, resulting in a wealth of datasets in the public domain. Seven such in vitro studies were selected for comparative bioinformatic analysis through the VirOmics Playground, a user-friendly visualisation and exploration platform we recently developed. Despite the heterogeneous nature of the data sets, several commonalities could be observed across studies and species. Differences, on the other hand, reflected not only variations between species, but also other experimental variables, such as cell lines used for the experiments, infection protocols and potential discrepancies between transcriptome and proteome data. The results presented here are available online and can be replicated through the VirOmics Playground.


Introduction
Common human Coronavirus species, of which four are currently known, are usually associated with infections of the upper respiratory tract and cold-like symptoms [1]. However, the emergence of species infecting the lower respiratory tract and causing lethal epidemics has been a recurring event for the past 20 years. Starting with the SARS epidemic in China in 2002, the MERS epidemic in the Middle East ten years later and finally to the current COVID-19 pandemic [2].
The causative agents are all single-stranded RNA viruses (MERS-CoV for MERS, SARS-CoV-1 for SARS and SARS-CoV-2 for COVID- 19) belonging to the beta-coronavirus genus within the Coronaviridae family [3]. All three infections started as zoonotic diseases and while the animal reservoir for both SARS-CoV-1 and MERS-CoV have been discovered, there is still uncertainty concerning the animal host for SARS-CoV-2, although it is most closely related to a bat SARS-like coronavirus [4]. SARS-CoV-1 and SARS-CoV-2 are, as the name implies, also genetically closer to each other than to MERS-CoV. This is reflected in the receptors that are recognised by the viral spikes surface glycoproteins of each virus. Thus, while MERS-CoV binds to dipeptidyl peptidase 4 (DPP4) on the surface of host cells [5], both SARS-CoV-1 and SARS-CoV-2 use the Angiotensin-converting enzyme 2 (ACE2) as the main entry point into the cell [6,7]. Binding to ACE2 is induced by the priming of the viral S protein via the human serine protease TMPRSS2 [7]. A study on mice showed that SARS-CoV-1 infections resulted in the reduction of ACE2 expression in vivo [8], which was postulated to be responsible for several of the pathologies induced by the infection.
The epidemiologies of the three diseases display different characteristics. While MERS was a highly lethal infection with a low reproductive rate, both SARS and COVID-19 displayed a similarly high reproductive rate (>3) but a lower fatality rate [3]. COVID-19 has by far the lowest mortality rate of the three and also the highest estimated reproductive rate, which may explain why it is the only one of the three viruses to cause a global pandemic.
The clinical manifestations caused by the three viruses are similar. They all affect the lungs and provoke extensive alveolar damage [2]. However, while patients affected with SARS or MERS almost always developed a fever and also presented a high number of cases with gastrointestinal symptoms, patients with COVID-19 rarely develop gastrointestinal symptoms. Furthermore, the majority of COVID-19 patients are either asymptomatic or present mild symptoms, a factor that has played a role in the spread of the disease [3]. Thrombosis and embolisms have been observed both in SARS and COVID-19 patients [9,10] and may explain sudden deaths observed in recovering patients.
All three infections have been associated with a cytokine storm triggered by an excessively stimulated immune system [3], which contributes to the morbidity of the diseases in severe patients. This cytokine storm is characterised by the release of large quantities of pro-inflammatory cytokines (such as Interferon-alpha (IFN-alpha), IFN-gamma, Tumor Necrosis Factor alpha (TNF-alpha) and Interleukin 6 (IL-6)) and chemokines, which induce an immune responses that damages internal organs and can result in death [11]. Intriguingly, the upregulation of anti-inflammatory cytokines such as IL-4 and IL-10 has also been observed [12]. The use of cytokine inhibitors may be beneficial after the onset of the cytokine storm. In particular, tocilizumab, an anti-IL-6 receptor monoclonal antibody, has been tested in COVID-19 patients and showed promising results [13]. Conversely, IFN-beta expression is inhibited by SARS-CoV-1 infections [14], while a general type I interferon inhibition has been observed in SARS-CoV-2 infections [15]. Type I interferons have antiviral properties and may be actively repressed by SARS-CoV-1 and SARS-CoV-2 during infection to escape the host immune response. Indeed, news reports (https://www.bbc.com/news/health-53467022) of a recent clinical trial with an IFN-beta aerosol indicate a reduction in the number of patients developing critical conditions, although the results have yet to be peer reviewed.
There is still no vaccine against any of the three main coronaviruses and various antiviral and anti-inflammatory drugs are currently being tested for their efficacy. Among them, the most prominent ones are: lopinavir/ritonavir, remdesivir, IFN-beta, chloroquine [3], baricitinib, fedratinib, and ruxolitinib [16,17]. Trametinib and selumetinib, two ERK/MAPK signal pathway inhibitors, have also shown significant efficacy in vitro against MERS-CoV infections [18].
These epidemics coincided with the development and rise of large scale gene expression analysis technologies (microarray and later on RNA-seq), with various groups embracing these technologies to generate a plethora of data sets that are currently publicly available.
For this review, we have selected seven public datasets (six transcriptomes and a proteome) based on cell lines infected with MERS-CoV, SARS-CoV-1 and SARS-CoV-2.
While not aiming at providing a comprehensive meta-analysis, we wanted to highlight major similarities and differences both across species and across studies on the same species at the gene expression or proteomic level.

Materials and methods
We selected a total of seven studies: six consisted of transcriptomic data (four microarrays, two RNA-seq) and one consisted of proteomic data. Of these studies three were conducted on MERS-CoV, one on SARS-CoV-1, one on SARS-CoV-2 and one on both SARS-CoV-1 and SARS-CoV-2. Datasets were obtained from the Gene Expression Omnibus (GEO) repository, except for the proteomic dataset, which was obtained from the Proteomics Identification Database (PRIDE). Details on each dataset are provided in Table 1. Where available we used the read counts produced by the authors, otherwise we generated read counts using Salmon from the raw input files [19].  For the analysis presented here, only measurements taken 24h or longer post-infection were considered. For the GSE65574 dataset, only the infection with the wildtype MERS virus was considered. In total, the analysed samples consisted of five samples infected with MERS-CoV, eleven samples infected with SARS-CoV-1 and seven samples (including the proteome dataset) infected with SARS-CoV-2. The filtered read counts were then analysed with VirOmics Playground, a cloud-based open access bioinformatics interface based on a platform that we recently developed for the analysis and visualisation of transcriptomic and proteomic data [25].
Unless otherwise stated, significant differential expression (q<0.05) was calculated using the intersection between three algorithms, namely edgeR, DEseq2 and limma [26][27][28] (Robinson et al, 2010;Love et al, 2014;Ritchie et al, 2015), and with a minimum logFC of 0.5. The same approach was also used when identifying significantly altered KEGG pathways, when performing a drug connectivity analysis or when performing a word cloud analysis, with the three algorithms being camera, GSVA and fGSEA [29][30][31] Due to its different nature, discovery of pathways with altered expression for the proteomic data was performed via an enrichment analysis of keywords in reference datasets (displayed as a world cloud), as applying the same criteria for transcriptomic data results in a more stringent analysis than traditionally performed for proteomic studies. More information on the methods used can be found both in Akhmedov et al [25] and on the online documentation (https://omicsplayground.readthedocs.io/en/latest/).
All datasets used for the analysis are currently available at VirOmics Playground (https://covid19.bigomics.ch/). The platform allows convenient access to the Drug Connectivity Map database and can perform connectivity analysis to find drug gene sets with inversely correlated profiles to the experimental gene sets obtained. These may indicate drugs with potential antiviral properties. We used this method to identify common inhibitory drug mode of actions against all three viral species.

Differentially Expressed Genes
The numbers of differentially expressed genes (or proteins) varied greatly across studies. While the numbers were fairly similar across the three MERS datasets, great variations were observed in both the SARS-CoV-1 and SARS-CoV-2 datasets (Table 2). Thus, for example, the number of differentially expressed genes in the SARS-CoV-2 datasets (excluding the H1299 cell line that yielded no dysregulated genes) ranged from 55 for the GSE147507 A459 cell line to 4425 for the GSE148729 Calu-3 cell line. This could be a reflection of the different cell lines, infection protocols and possibly viral strains used in the studies. It was also observed that the H1299 cell line (potentially a typical host cell for SARS viruses) did not show any evidence of gene expression alterations. This could be due to a failure in the infection protocol.
In the only experiment where the same cell lines were infected with SARS-CoV-1 and SARS-CoV-2, there were striking differences in the amount of differentially expressed genes. Thus, in Caco-2 cells, infections with SARS-CoV-1 generated almost 6x more dysregulated genes compared to an infection with SARS-CoV-2. Meanwhile, the opposite was true in Calu-3 cells, where a SARS-CoV-2 infection generated more than 2X differentially expressed genes compared to a SARS-CoV-1 infection. Given the genetic similarity between the two viruses, such differences are rather unexpected. When looking at the top 10 up-and down-regulated genes in each sample split by virus (excluding samples consisting of H1299 infected cells), no shared downregulated genes could be observed across virus species. However, MERS-CoV infected cells all shared the same most upregulated gene (SSX2), independently of time of collection or cell type. Furthermore, gene SLC22A23 was identified among the top 10 upregulated genes in all the samples collected 24h post infection. Gene FBXL8 (F-box and leucine rich repeat protein 8) was in the top 10 upregulated genes in all samples from datasets GSE65574 and GSE86529 and was also significantly upregulated in dataset GSE45042. Gene HSPA6 was in the top 10 upregulated genes of all the GSE86529 samples and was also significantly upregulated in the samples from the other two datasets.
For SARS-CoV-1 infected Calu-3 cells, IL-6 was among the top 10 upregulated genes across various time points in both available studies. For SARS-CoV-2 infected cells, no shared upregulated genes could be found. Considering that all available samples were from a different cell line each may explain the lack of commonality.

Viral spike protein cellular targets and priming protein
There was no consistent up-or down-regulation of the DPP4 gene (the receptor for the MERS-CoV S protein) across the seven studies (Fig 1-3 A-C). In MERS-infected samples, the gene was significantly upregulated in study GSE45042 and non-significantly in study GSE65574 (Fig 1A and 1B), but downregulated in study GSE86529 (Fig 1C). One SARS-CoV study (GSE148729) conversely indicated down-regulation in Calu-3 cells and, in the case of SARS-CoV-1, Caco-2 cells (Fig 2B and 3A). There was no statistically significant change in expression in other SARS-CoV infected samples.  The expression of the ACE2 gene, whose product serves as a receptor for the SARS-CoV-1/2 S protein, was unaltered in MERS infected cell lines (Fig 1D-F). The picture was more mixed in SARS-CoV-1/2 infected cell lines. There was no evidence of gene expression reduction in Calu-3 cells infected with SARS-CoV-1 (Fig 2C) or NHBE and A459 cells infected with SARS-CoV-2 (Fig. 3E). Even at the proteome level, no difference in expression was detected ( Fig 3F). Only in one study, infection with both SARS-CoV species induced a statistically significant reduction in gene expression, but only in one of the three cell lines that were infected, namely Calu-3 (Fig 2D and 3B).
Finally, TMPRSS2 gene expression provided very mixed results. It was observed to be either down-regulated (Fig 1G), up-regulated ( Fig 1I) or unaltered ( Fig 1H) in MERS-infected cell lines. Expression was down-regulated in Calu-3 cells infected with either SARS-CoV-1 or SARS-CoV-2 ( Fig 2F and 3G), but was otherwise unaffected in all other samples. No data was available from the proteome study.

Immune Response Genes
Coronavirus infections have been associated with the increased expression of various cytokines. Fifteen cytokines selected based on the available literature was selected for analysis across any of the available samples collected at least 24h post infection, excluding the proteomic dataset as none of the cytokines were present there, and the H1299 cell lines, as there were no dysregulated genes present, for a total of 18 samples.
Of these cytokines, seven were found to be upregulated in at least one sample for each of the three species, namely: IL-1B, IL-6, IL-8, IL-12A, CXCL10 (C-X-C motif chemokine ligand 10), CCL2 (Chemokine C-C ligand 2) and TNF-alpha (Fig 4). IL-6, and to a lesser extent IL-8 and CCL2, were frequently upregulated across all three diseases (MERS, SARS and COVID-19). IL-7, on the other hand, was the only gene upregulated in SARS and COVID-19, but not MERS.

Immune Response Genes
Coronavirus infections have been associated with the increased expression of various cytokines. Fifteen cytokines selected based on the available literature was selected for analysis across any of the available samples collected at least 24h post infection, excluding the proteomic dataset as none of the cytokines were present there, and the H1299 cell lines, as there were no dysregulated genes present, for a total of 18 samples. Of these cytokines, seven were found to be upregulated in at least one sample for each of the three species, namely: IL-1B, IL-6, IL-8, IL-12A, CXCL10, CCL2 and TNF-α (Fig. 4). IL-6, and to a lesser extent IL-8 and CCL2, were frequently upregulated across all three diseases (MERS, SARS and COVID19). IL-7, on the other hand, was the only gene upregulated in SARS and COVID19, but not MERS. IFN-β and IFN-ε were the only two type I interferons that were measured in all three SARS and Covid-19 datasets (GSE14707, GSE37827 and GSE148729). There was a complete lack of statistically significant upregulation for IFN-ε following infection in any of the datasets at any available time point (Figs. 5-7). While there was a lack of upregulation early on in the infection, after 24h IFN-β expression did increase in Calu-3 cells in two of the datasets (Figs. 5-6). In the only study where SARS-CoV-2 infections were compared against two other respiratory tract IFN-beta and IFN-epsilon were the only two type I interferons that were measured in all three SARS and COVID-19 datasets (GSE14707, GSE37827 and GSE148729). There was a complete lack of statistically significant upregulation for IFN-epsilon following infection in any of the datasets at any available time point (Fig 5-7). While there was a lack of upregulation early on in the infection, after 24h IFN-beta expression did increase in Calu-3 cells in two of the datasets (Fig 5-6). In the only study where SARS-CoV-2 infections were compared against two other respiratory tract viruses, namely Influenza A virus (IAV) and Respiratory Syncytial Virus (RSV), the difference in type I interferon expression levels was striking.SARS-CoV-2 infections showed expression profile indistinguishable from control samples, in clear contrast with a general upregulation of the same interferons in the infections with IAV or RSV (Fig 7E-H). The picture was less clear in MERS-CoV infected samples, with some showing a slight upregulation and others no change in Type I interferon expression levels (data not shown).

Altered Pathway Analysis
For the six transcriptome studies, a KEGG pathway analysis was performed to identify pathways with significantly (q<0.05) altered expression levels in all samples collected at least 24h post infection. Samples composed of infected H1299 cell lines were excluded, due to the lack of differentially expressed genes. For the proteome study, due to the different nature of the input data, a word cloud plot of the most common keywords found in significantly altered gene sets was generated with the VirOmics Playground platform. The word cloud plot (Fig 8) contained terms that have been described in the published article based on the same dataset [24]. Thus, the term "cholesterol" appears with a negative normalised enrichment score (NES), indicating a downregulation of associated genes. Conversely, genes in keywords associated with carbon metabolism (glycolysis, glucose) were upregulated. In a separate KEGG pathway analysis, a slight upregulation of the spliceosome pathway could be observed, although it was not statistically significant (data not shown).   (Table S1-S3) were split into up-and down-regulated groups and combined by species and intersections between the three species displayed as Venn diagrams. Secondly, only experiments conducted on Calu-3 cells (the most common cell used in the studies selected here) were chosen, but including any samples collected between 24h and 72h post infection. The up-and down-regulated KEGG pathways (Table S4-S6) were again combined by species, and Venn diagrams were produced as indicated above.
The 24h samples intersection of KEGG pathways yielded three upregulated pathways and six downregulated that were shared by all three species (Fig 9-10), while two and eight were shared between SARS-CoV-1 and SARS-CoV-2 infected samples respectively (Fig 9-10). The upregulated pathways shared by all three species included terms related to immunity and ribosome formation, while downregulated pathways included the "Valine, Leucine and Isoleucine degradation" pathway, "pyruvate metabolism" and the "peroxisome" pathway ( Table 3). The "complement and coagulation" pathway was among the upregulated pathways in common between SARS-CoV-1 and SARS-CoV-2 infected cells, while shared downregulated pathways included the "glycolysis and gluconeogenesis" pathway (Table 4).   The intersection analysis based on Calu-3 cells, yielded 1 upregulated and 6 downregulated KEGG pathways shared between all viral species and 14 and 22 respectively shared between cells infected with SARS-CoV-1 and SARS-CoV-2 ( Fig  11-12). The list of pathways shared by all three species included the "cytokine cytokine receptor" pathway, as well as pathways related to carbohydrate metabolism and protection from oxidation (Table 5). When comparing SARS-CoV-1 and SARS-CoV-2 infected Calu-3 cells, shared upregulated pathways included different immune-related gene sets, while shared carbohydrate metabolism-related pathways and butanoate metabolism were present among the downregulated pathways (Table 6).

Drug Connectivity Analysis
A Drug Connectivity analysis was performed on a total of five MERS-CoV infected samples, 9 SARS-CoV-1 infected samples and five SARS-CoV-2 infected samples (including the proteome study). Drug gene sets that had statistically significant inversely correlated expression profiles to the experimental results were obtained and summarised by mechanisms of action through the Omics Playground platform. The top 10 most negatively correlated mechanisms were then combined by disease (MERS, SARS or COVID-19). Samples infected by MERS-CoV displayed a strong negative correlation with cyclin dependant kinase (CDK), epidermal growth factor receptor (EGFR), dual specificity mitogen-activated protein kinase (MEK), mammalian target of rapamycin (mTOR), phosphatidylinositol-3 kinases (PI3K) and tyrosine-protein kinase src (src) inhibitors, with at least 4 out of 5 samples showing a negative correlation with such mechanisms of action (Fig 13).
SARS-CoV-1 infected samples also showed a strong negative correlation (with detection among the top 10 inhibitory profiles in at least 7/9 samples) with CDK, EGFR, MEK and src inhibitors (Fig 14). Additionally, p38 MAPK inhibitors were also prominently represented.
SARS-CoV-2 infected samples displayed five mechanisms of action with inhibitory potential in at least three of the five samples analysed. These included CDK, EGFR, MEK and src inhibitors, as well as serotonin receptor antagonists, p38 MAPK inhibitors and 3-Hydroxy-3-Methylglutaryl-CoA Reductase (HMGCR) inhibitors (Fig 15).
Overall, strong support for four mechanisms of action with broad anti-coronavirus inhibitory potential could be detected, namely: CDK, EGFR, MEK and src inhibitors. Additionally, p38 MAPK inhibitors were also highly frequently negatively correlated with the gene expression profiles generated by SARS-CoV-1 and SARS-CoV-2 infections, while also being identified as potential inhibitors in three out of five MERS-CoV infected samples.

Gene expression results
The number of differentially expressed genes varied greatly between samples, even when those were restricted only to samples collected at least 24h after infection. Indeed, the range could vary from 0 to almost 9000 genes. Even when keeping time point, species and host cell lines constant, as in the case of datasets GSE148729 and GSE37827, the number of differentially expressed genes ranged from 72 to 1835. Despite these differences, when clustering studies either by viral species or by host cell line, some macrotrends could be observed at the individual gene level. An analysis of the expression of host genes or proteins involved in viral invasion (namely DPP4, ACE2 and TMPRSS2) provided a rather mixed picture. DPP4 appeared to be upregulated in Calu-3 cells, but downregulated in human fibroblasts (Fig 1A-3A). As a modulation of DPP4 by MERS viruses has not been implicated in any phenotypes, the actual biological significance of these observations or whether they represent a stochastic effect due to study setups, remains to be determined.
Conversely, ACE2 downregulation by the virus has been implicated in the pathology of SARS-CoV-1 [8] and potentially SARS-CoV-2. As expected, no alteration in expression was observed in MERS-infected cells. However, in SARS-CoV-1 and SARS-CoV-2 infected cells, a statistically significant downregulation of the gene could be observed only in one study and one clee line (Calu-3). Furthermore, in another study where Calu-3 cells were infected with SARS-CoV-1 (GSE37827), no significant downregulation was detectable (Fig 4B). There was no evidence of downregulation at the protein level either. Again, the discrepancies could be due to the many variables that separate each dataset. But it is also possible that ACE2 downregulation may only occur in certain cells in vivo, and requires an interaction with the host immune system in vivo. Alternatively, it could also be that the mouse model used to generate the original observation may not be a reliable model for human infections.
TMPRSS2 gene expression also provided a very mixed picture, with MERS studies showing both upregulation, downregulation and no difference, while in studies involving SARS-CoV-1 and SARS-CoV-2, a downregulation could be observed in Calu-3 cells only. Again, this may be more a reflection of the differences between studies than any biological significance.
Several cytokines have been implicated in the pathology caused by coronavirus infections. 15 of these were selected based on evidence in the literature and analysed for changes in expression at the gene level following viral infection. In the 6 transcriptomic studies, we found evidence for the significant upregulation of seven of these cytokines (namely IL-1B, IL-6, IL-8, IL-12A, CXCL10, CCL2 and TNF-alpha.) in at least a sample group per species. This is consistent with the theory of a "cytokine storm" occurring during coronavirus infections. On the other hand, there was a general lack of induction of type I interferons in SARS-CoV-1 and SARS-CoV-2 infections, particularly early on in the infection, consistent with observations that both viruses can interfere with the expression of these antiviral cytokines.

Altered Pathway Analysis
Analysis of the proteome of cells infected with SARS-CoV-2 confirmed that the cholesterol pathway was downregulated, whereas carbon metabolism was upregulated, as described in the published article of the study [24]. No significant upregulation of the spliceosome could be observed, although this was most likely due to the more stringent analysis performed by the platform compared to traditional approaches for proteome analysis.
When looking at the transcriptome samples collected 24h post-infection, irrespective of cell-line, the upregulation of two KEGG pathways associated with immune responses was observed in all three viruses, as well as the upregulation of the "Ribosome biogenesis" pathway. The former two are expected as the results of defensive mechanisms against viral infections.The latter could be a consequence of the postulated disruption of ribosome biogenesis by the viral N protein [32]. The "propanoate metabolism" and "pyruvate metabolism" pathways were both downregulated in all three viruses and as they are both involved in carbohydrate metabolism, were in contrast with the findings from the proteome analysis. Conversely, the inhibition of the "peroxisome pathway" might reflect a direct exploitation by coronaviruses [33]. As the peroxisome also plays a role in cholesterol biosynthesis [34], this observation is consistent with the proteome data. When focusing only on the two SARS-CoV species, another pathway related to immune responses ("complement and coagulation cascades") was upregulated in both species, while pathways related to carbohydrate metabolism were again downregulated. Additionally amino acid synthesis and . drug metabolic pathways were also under-expressed in both viral species. Intriguingly, the disruption of the cytochrome P450 metabolic machinery in COVID-19 patients has been hypothesised in a recent publication [35].
Focusing the KEGG pathway analysis on samples with infected Calu-3 cells instead, highlighted once more the upregulation of the "cytokine cytokine receptor" pathway in all three viral species. Several pathways were also downregulated in all three species. The downregulation of the "Lysosome" pathway may reflect the dependence of coronavirus cell entry on the lysosomal pathway [36]. Downregulation of the pyruvate pathway was related to the general inhibition of carbohydrate metabolism observed in the 24h samples. Finally, the downregulation of glutathione metabolism could reflect a direct viral manipulation of host defense mechanisms. Influenza infections have been associated with depletion of glutathione levels, which may favor viral replication [37]. The largest number of shared dysregulated pathways was observed between SARS-CoV-1 and SARS-CoV-2 infected Calu-3 cells. Besides immunity and cell stress related terms, upregulated pathways also highlighted the similarity with other viral diseases (Table 6). Pathways related to carbohydrate metabolisms were once more downregulated, highlighting the contrast between proteomic and transcriptomic datasets. The downregulation of the peroxisome pathway is consistent with glutathione inhibition, due to the role of the organelle in both the metabolism of reactive oxygen species and its antiviral properties [38]. The inhibition of the "butanoate metabolism" pathway could also fit this general trend of viral-mediated modulation of the immune system, due to its role in immune homeostasis [39].

Drug Connectivity Analysis
Four main mechanisms of action (CDK, EGFR, MEK and src inhibitors) with broad inhibitory potential against human coronavirus infections were postulated across most of the datasets. An additional mechanism (p38 MAPK inhibitors) was also suggested when focusing on SARS-CoV-1 and SARS-CoV-2 infections. All these inhibitory mechanisms target kinases and indeed kinase inhibitors have been proposed as broad-spectrum antiviral therapies [40]. When looking at individual drugs that were identified as having anti-coronavirus properties by the bioinformatic platform, the MEK-inhibitor trametinib had already shown inhibitory potential against MERS-CoV infections [41]. Furthermore, a very recent study identified various key cellular signaling pathways critical for SARS-CoV-2 infection [42]. These included the mTOR-PI3K-AKT and ABL-BCR/MAPK pathways, inhibitors against which also appeared in our analysis ( Fig  12).

Conclusion
Comparing gene expression coronavirus infection datasets across in vitro studies is fraught with potential pitfalls, due to the many variables that can alter the outcome of a study (such as viral strains used, infection protocols, cell lines used, time points of collection, sequencing technologies,etc..). Nonetheless, such information can provide some general insights on shared coronavirus infection features that can better help discern potential targets for intervention and provide a better understanding of viral biology.
In this article, we observed several patterns shared across coronavirus species that were consistent with coronavirus biology and pathology. However, we also observed contrasting results (e.g. the expression of the ACE2 gene or regulation of the carbohydrate metabolism) that highlight the difficulty of working with heterogeneous datasets.
We could also identify shared potential mechanisms of viral inhibition that reflect what is known from the literature and further strengthen the argument for the selection of drugs that can be repurposed to treat coronavirus infections.
As a final note, all the datasets and analysis tools are currently available online at VirOmics Playground (https://covid19.bigomics.ch) and can be accessed to repeat and verify the analysis performed in this article. Researchers can also alter parameters and perform further analysis if they so wish. We think that providing access to such resources is of crucial importance given the frequent issues arising both from a lack of reproducibility of scientific results and from inappropriate statistical analysis of the data. While open access to the experimental data through a bioinformatic platform cannot address experimental issues, it can at least standardise some statistical analysis and make its evaluation by reviewers less laborious.
Supporting information S1