Multi-omics analysis for potential inflammation-related genes involved in tumour immune evasion via extended application of epigenetic data

Accumulating evidence suggests that inflammation-related genes may play key roles in tumour immune evasion. Programmed cell death ligand 1 (PD-L1) is an important immune checkpoint involved in mediating anti-tumour immunity. We performed multi-omics analysis to explore key inflammation-related genes affecting the transcriptional regulation of PD-L1 expression. The open chromatin region of the PD-L1 promoter was mapped using the assay for transposase-accessible chromatin using sequencing (ATAC-seq) profiles. Correlation analysis of epigenetic data (ATAC-seq) and transcriptome data (RNA-seq) were performed to identify inflammation-related transcription factors (TFs) whose expression levels were correlated with the chromatin accessibility of the PD-L1 promoter. Chromatin immunoprecipitation sequencing (ChIP-seq) profiles were used to confirm the physical binding of the TF STAT2 and the predicted binding regions. We also confirmed the results of the bioinformatics analysis with cell experiments. We identified chr9 : 5449463-5449962 and chr9 : 5450250-5450749 as reproducible open chromatin regions in the PD-L1 promoter. Moreover, we observed a correlation between STAT2 expression and the accessibility of the aforementioned regions. Furthermore, we confirmed its physical binding through ChIP-seq profiles and demonstrated the regulation of PD-L1 by STAT2 overexpression in vitro. Multiple databases were also used for the validation of the results. Our study identified STAT2 as a direct upstream TF regulating PD-L1 expression. The interaction of STAT2 and PD-L1 might be associated with tumour immune evasion in cancers, suggesting the potential value for tumour treatment.


Introduction
can bind to programmed death-1 (PD-1) and inhibit antitumour immune reactions, enabling cancer cells to escape immunosurveillance.
Based on the importance of the PD-1/PD-L1 axis in immune evasion, many studies have demonstrated the remarkable clinical efficacy of anti-PD-1/PD-L1 therapy [3]. However, the clinical effects of these treatments are less efficient for certain tumour types, such as non-microsatellite instability (non-MSI) colorectal cancer [4]. To date, the level of PD-L1 expression in cancer cells is regarded as one of the most important factors for determining the effects of immune checkpoint therapy. Therefore, an improved understanding of PD-L1 regulation in cancer cells might be helpful for clinical cancer treatment.
Accumulating evidence has demonstrated the upregulation of PD-L1 expression during cancer pathogenesis. Inflammatory signalling is regarded as a primary mechanism involved in this complex regulatory network [5]. For instance, certain pro-inflammatory factors, including type I and type II interferons, induce PD-L1 expression efficiently. Furthermore, the expression of PD-L1 could be regulated via multiple inflammation-related transcription factors (TFs), such as IRF1, STAT1, STAT3 and NF-κB [6]. Given that cancer-related inflammation is observed in a substantial proportion of patients, a better understanding of the relationship between inflammation and PD-L1 is currently required. Furthermore, a search for novel inflammatory TFs that regulate PD-L1 expression is warranted.
Assay for transposase-accessible chromatin using sequencing (ATAC-seq) approach uses hyperactive Tn5 transposase to comprehensively recognize chromatin accessibility at the genome level and could map open chromatin regions in gene promoters, reflecting the possibilities of TF binding [7,8]. Although ATAC-seq analysis only indicates the necessity of TF binding, its results could be further confirmed through other experiments.
In this study, we aimed to perform a multi-omics analysis to screen novel inflammation-related TFs involved in PD-L1 regulation, followed by laboratory verification studies. Besides transcriptome data (RNA-seq), we also aimed to analyze epigenetic data (ATAC-seq) to explore the binding of TFs. We hypothesized that STAT2 (signal transducer and activator of transcription 2) could directly bind at open chromatin regions of the PD-L1 promoter and regulate PD-L1 expression in cancer cells. The predicted binding was then validated physically through ChIP assay, and its influence in translational regulation was further confirmed in cell experiments.

Data collection
The Cancer Genome Atlas (TCGA) datasets were accessed through the UCSC Xena database (https://xenabrowser.net/ ). Htseq-count profiles of 514 colon adenocarcinoma (COAD) samples were retrieved, and the corresponding clinical demographic information was also acquired. We also downloaded the Fragments Per Kilobase per Million mapped read (FPKM) profiles of the aforementioned patients with COAD. Publicly available ATAC-seq profiles were obtained from the NCI Genomic Data Commons (https://gdc.cancer.gov/ about-data/publications/ATACseq-AWG). The ChIP-seq profiles were acquired from the Cistrome database (http://cis trome.org/) [9]. RNA-seq profiles were obtained from the Gene Expression Omnibus (GEO, GSE137155, https://www. ncbi.nlm.nih.gov/geo/).

Chromatin accessibility analysis of PD-L1
To identify the open chromatin regions of PD-L1, peaks were visualized using the R package karyoploteR [14] and ChIPseeker [15], and were annotated using TxDb.Hsapiens.UCSC.Hg38. knownGene. The details of the aforementioned methods have been described in our former study [13,16].

Identification of potential TFs involved in PD-L1 regulation
In order to analyze transcriptional regulation of PD-L1, we used the workflow reported by Huang et al. [16,17]. Briefly, gene expression data of TFs were retrieved from the TCGA datasets. Then, we performed correlation analysis between the TF expression and the chromatin accessibility of the PD-L1 promoter region. The TFs with a p-value < 0.05 were further filtered using the Cistrome database and the GEO database.

Cell culture
The human colon cancer cell line DLD-1 and cervical carcinoma cell line HeLa were obtained from Shanghai Key Laboratory of Regulatory Biology, East China Normal University, Shanghai, China. Both cell lines were cultured in DMEM supplemented with 1% streptomycin-penicillin and 10% fetal bovine serum.

Statistical analysis
p-values < 0.05 were regarded as statistically significant. Pearson and Spearman analysis was used to calculate correlation coefficients. All statistical analyses were conducted using the R software (v. 3.5.1; www.r-project.org).

Identification of open chromatin regions of the PD-L1 promoter
The overview of our study is presented in figure 1. The chromatin accessibility landscape of patients with COAD was gauged from ATAC-profiles using the workflow reported by Huang et al. [16]. The open chromatin regions were widely expressed across the genome. The upset plot and the vennpie plot indicated that a considerable fraction of open chromatin regions was found in gene promoters (electronic supplementary material, figure S1A). Furthermore, we identified chr9 : 5449463-5449962 (Region 1) and chr9 : We also found that PD-L1 expression was significantly correlated with both Region 1 (r = 0.6, p < 0.05, figure 2a) and Region 2 (r = 0.4, p < 0.05, figure 2b).

Identification of STAT2 as an upstream factor of PD-L1 by integrative analysis
In order to explore the upstream factors of PD-L1, we combined transcriptome (RNA-seq) and epigenetics profiles (ATAC-seq) for co-analysis. Based on the list of TFs associated with inflammation, we obtained the mRNA expression of 21 inflammtory TFs from RNA-seq profiles in the TCGA database (figure 3a). These TFs were then filtered by correlation analysis with chromatin accessibilities of Region 1 and Region 2 (figure 3b). Among the 21 inflammtory TFs, STAT2 was found to have the strongest correlation with PD-L1 promoter accessibility. STAT2 demonstrated a remarkable correlation with both Region 1 and Region 2 (figure 3c). Additionally, in order to confirm that STAT2 had a high ranking even among all the TFs, we also used correlation analysis between PD-L1 promoter accessibility and expression of all database-recorded TFs (electronic supplementary material, table S2). Among all the TFs, STAT2 was only secondary to TF MAX with respect to correlation with Region 1 (electronic supplementary material, figure S2A, left). Similarly, STAT2 also had a significant correlation with Region2 (electronic supplementary material, figure S2A, right). Moreover, we performed correlation analysis between PD-L1 expression and the expression of potential TFs. Interestingly, among the top 5 TFs associated with PD-L1 promoter Region 1 or Region 2, STAT2 was found to had the strongest correlation with PD-L1 expression (electronic supplementary material, figure S2B, figure 4a), which also suggested that   NFKB1  NFKB2  RELB  RELA  REL  IRF1  IRF2  IRF3  IRF4  IRF5  IRF6  IRF7  IRF8  IRF9  STAT1  STAT2  STAT3  STAT4  STAT5A  STAT5B  royalsocietypublishing.org/journal/rsob Open Biol. 12: 210375 STAT2 might have a close relationship with PD-L1 expression.
Collectively, STAT2 showed a significant correlation with both PD-L1 promoter accessibility and PD-L1 expression. Hence, we proposed that STAT2 might be a potential upstream of PD-L1, and we chose STAT2 for further validation.

Validation of the physical interactions between STAT2 and PD-L1 promoter
Considering that the TF binding motif is an important index for predicting TF binding, we used the online Human Transcription Factor Database (http://bioinfo.life.hust.edu.cn/ HumanTFDB/) to explore the STAT2 binding motifs on Region 1 and Region 2. As shown in electronic supplementary material, table S3, STAT2 binding motifs could be found in both two regions, indicating that STAT2 might physically bind at Region 1 and Region 2. To confirm the physical binding, we further obtained STAT2 ChIP-seq profiles of colon cancer cells for validation. Figure 4b shows that STAT2 protein could specifically bind at Region 1 (highlighted in dark blue) and Region 2 (highlighted in light blue), confirming the physical interaction between STAT2 and PD-L1 promoter. Furthermore, we obtained the RNAseq data of colon cancer cells with STAT2-knockdown from the GEO dataset GSE137155. PD-L1 expression was significantly downregulated in the STAT2-knockdown group (figure 4c). The analysis of the protein-protein interaction network also supported the correlation observed between STAT2 and PD-L1 (electronic supplementary material, figure S3A). Collectively, the integrated analysis indicated that the TF STAT2 could physically bind at the open chromatin regions of the PD-L1 promoter and might regulate PD-L1 expression.

Validation of the results of bioinformatics analysis in colon cancer cells
To minimize the bias, we also confirmed the results of the aforementioned bioinformatics analysis with colon cancer

Identification of significant pathways and immune cells associated with STAT2 and PD-L1
Considering that STAT2 could be a direct upstream factor of PD-L1, we further explored associated pathways in COAD. We used the gene set variation analysis algorithm [23] to identify the expression level of genes enriched in the GO and KEGG pathway analysis (figure 5a-d). Correlation analysis was applied to explore significant pathways that were correlated with both STAT2 and PD-L1 expression (figure 5a,c). We found that 'KEGG_antigen_processing_and presentation' and 'GOBP_cellular_response_to_interferon_alpha' pathways were most significantly correlated (figure 5b, d). Therefore, we hypothesized that the interaction of STAT2 and PD-L1 might influence colon cancers in an immune-related way.
Considering this hypothesis, we analysed tumour-infiltrating immune cells in colon cancers. With the clustering function of R package corrplot, we found that macrophages might be potentially associated with both STAT2 and PD-L1 (figure 6a). Furthermore, besides clustering function, we also explored the estimated immune cells using correlation analyses. And we found that algorithm-estimated macrophages and T cells were statistically correlated with both STAT2 and PD-L1 (r > 0.4, p < 0.05, figure 6b). Thus, based on the analysis above, macrophages might play a role in the interaction between STAT2 and PD-L1. However, limited to bioinformatics methods, we only observed a correlation among macrophages, STAT2, and PD-L1. The underlying mechanisms still required further elucidation through laboratory experiments.

Multi-database validation
To confirm the bioinformatics results, we used multiple databases for validation. Profiles from Xena database showed that STAT2 was widely expressed in multiple cancer types, including colon cancers (figure 7a). Moreover, the HPA database indicated that in the tumour microenvironment of colon cancers, STAT2 could be detected in multiple cell types, including cancer cells. Although there is no significant difference of STAT2 protein expression between colon cancer tissues and normal tissues, we found that the expression level of STAT2 in normal endothelial cells was relatively lower ( p < 0.05, figure 7b). In addition, the LinkedOmics database demonstrated that the STAT2 expression was significantly lower in patients with MSS (non-MSI) colon cancer compared to that in patients with MSI-H colon cancer (figure 7c). To confirm the direct binding of TF STAT2 and PD-L1 promoter, we also obtained STAT2 ChIP-seq profiles of multiple types of cancer cell lines, including GM12878,

Discussion
With the understanding of immune evasion mechanisms, immune checkpoint therapy has been developed as an important clinical strategy for cancer treatment. Among various such strategies, anti-PD-1/PD-L1 therapy is one of the most extensively examined strategies. Considering that the level of PD-L1 expression in cancer cells is closely associated with clinical efficacy, there is an urgent need to elucidate mechanisms underlying PD-L1 regulation. Cancer-induced chronic inflammation is very common among patients and affects PD-L1 expression via multiple pathways, including transcriptional regulation. However, to date, its regulatory mechanism has not been fully clarified. Therefore, we aimed to screen undiscovered inflammatory TFs that were direct upstream factors of PD-L1 and to further confirm the results via laboratory validation. The integrated analysis of transcriptome and epigenetic profiles was applied for screening the potential inflammatory TFs. First, we performed correlation analysis between PD-L1 promoter accessibility and expression of inflammtory TFs. Among the 21 database-recorded inflammtory TFs, STAT2 was found to be most correlated with PD-L1 promoter accessibility. Then, we performed correlation analysis between PD-L1 promoter accessibility and expression of all the database-recorded TFs, confirming that STAT2 had a high ranking even among all the database-recorded TFs. Moreover, we further performed correlation analysis between PD-L1 expression and the expression of several TFs, which were also significantly correlated with PD-L1 promoter accessibility. Among these TFs, STAT2 was found to had the strongest correlation with PD-L1 expression. Collectively, the inflammtory TF STAT2 was found to be significantly correlated with both PD-L1 promoter accessibility and PD-L1 expression. Hence, we proposed that STAT2 might be a potential upstream of PD-L1, and chose STAT2 for further validation.
Based on the integrated analysis of transcriptome and epigenetic profiles, we proposed that the TF STAT2 might directly bind at the PD-L1 promoter. We also identified the potential binding sites chr9 : 5449463-5449962 and chr9 : 5450250-5450749. Upon analysing ChIP-seq profiles of STAT2, we confirmed the physical binding within the predicted region. Subsequently, we further verified our results at the cellular level. We demonstrated that the overexpression of STAT2 can significantly upregulate PD-L1 expression in cancer cell lines DLD-1 and HeLa. Based on the GEO database, we found that STAT2-knockdown could significantly inhibit PD-L1 expression. Taken together, based on the results of this study, we propose that STAT2 is a direct upstream factor of PD-L1. royalsocietypublishing.org/journal/rsob Open Biol. 12: 210375 STAT2 is known for its role in immunomodulatory reactions and anti-viral immunity. It is significantly different from other members of the STAT family. For instance, although other members can be activated by multiple cytokines, including type I and II interferons, STAT2 is primarily activated by type I interferon. Moreover, it is involved in mediating inflammatory pathways and acts as a cofactor [24]. Thus, the regulation of STAT2 might not have a significant impact on anti-tumour immunity. Therefore, if PD-L1 expression is regulated via STAT2, the combination therapy targeting STAT2 might be more effective.
In this study, we demonstrated the regulation of PD-L1 expression mediated by STAT2 via multi-omics analysis and laboratory validation. Over past decades, the regulation of PD-L1 expression by various inflammatory TFs has been reported; however, our study is the first to report that the TF STAT2 could directly regulate PD-L1 expression. We also confirmed the physical binding of the TF STAT2 and PD-L1 promoter based on ChIP-seq results. Interestingly, a study performed by Angel Garcia-Diaz et al. [25] used mutagenesis to delete the predicted binding sites of STAT2 in a firefly luciferase reporter plasmid comprising the PD-L1 promoter. The results demonstrated that interferon-induced luciferase expression was remarkably decreased in the transfected cells. To some extent, the results of this experiment supplemented the findings of our study, which provided strong evidence for the direct binding of the TF.
In addition, Angel Garcia-Diaz et al. [25] not only found that the deletion of STAT2 putative binding site could affect    royalsocietypublishing.org/journal/rsob Open Biol. 12: 210375 luciferase expression but also observed similar results upon deletion of IRF1 putative site. Indicating the importance of the JAK1/JAK2-STAT1/STAT2/STAT3-IRF1 axis, they revealed that IRF1 was the potential TF that regulated PD-L1 directly. In our study, we focused on the direct regulation of PD-L1 expression by STAT2, providing an improved understanding of the JAK1/JAK2-STAT1/STAT2/STAT3-IRF1 axis. As a direct upstream factor of both PD-L1 and IRF1, STAT2 showed great promise for anti-PD-1/PD-L1 immunotherapy. Despite the increasing number of studies examining STAT2, the effects of STAT2 in anti-tumour immunity remain controversial [26]. For instance, Yue et al. [27] demonstrated the acceleration of tumour growth in mice with STAT2 knockout. Wang et al. [28] confirmed the anti-tumour activity of STAT2 in a mouse model. Gamero et al. [29] used models of inflammation-induced cancers to demonstrate that STAT2 might promote colorectal and skin carcinogenesis. Considering the controversial roles of STAT2, our study might provide further clarifications with respect to the role of STAT2 in cancer. We found that since STAT2 can upregulate PD-L1 expression on the surface of cancer cells, it can aid the cancer cells in escaping immunosurveillance.
Currently, the anti-PD-1/PD-L1 treatment demonstrates low efficiency for patients with non-MSI colon cancers [4,5]. Based on the LinkedOmics database, we found that the STAT2 expression was significantly lower in no-MSI colon cancers. Thus, we hypothesized that the combination treatment targeting STAT2 might increase the efficacy of anti-PD-1/PD-L1 treatment in patients diagnosed with this cancer subtype.
Certain limitations of this study need to be addressed. First, most of our bioinformatics results were based on the data on colon cancers, which were used to perform integrated analysis. On the one hand, the multi-omics analysis would be more reliable when using profiles from the same cancer type. On the other hand, these results might be specific to colon cancers. To minimize the bias, we used multiple databases to validate our results at the pan-cancer level, and we also used other types of cell lines in subsequent experiments. Second, the association between the differential TF binding and the changes in PD-L1 promoter accessibility could be influenced by many potential events, such as the dynamics of histone modification, DNA methylation, and promoterenhancer looping. This study did not include all these factors into consideration. However, although we were not able to provide evidence for all the potential factors, we provided the evidence that there was a significant association between PD-L1 promoter accessibility and PD-L1 expression.
Despite the aforementioned limitations, our study was the first to highlight the direct regulation of PD-L1 expression mediated by the TF STAT2. We performed both bioinformatics and laboratory analysis to validate our results. Future studies should further validate the interaction of STAT2 and PD-L1 with larger data sizes, different cancer cell lines, and the STAT2 knockdown mouse model. The potential therapeutic value of the combination treatment should also be analysed further.
Our study identified STAT2 as a direct upstream TF that regulates PD-L1 expression, suggesting its potential to be used as a therapeutic target for tumour treatment.
Data accessibility. Publicly available datasets were analysed in this study.