Transcriptional regulatory mechanisms of fibrosis development in mouse lung tissue exposed to carbon nanotubes

Background Carbon nanotubes (CNTs) usage has rapidly increased in the last few decades due to their unique properties, exploited in various industrial and commercial products. Certain types of CNTs cause adverse health effects, including chronic inflammation and fibrosis. Despite the large number of in vitro and in vivo studies evaluating these effects, many important questions remain unanswered due to a lack of mechanistic understanding of how CNTs induce cellular stress responses. In order to predict CNT toxicity, it is important to understand which transcriptional programs are specifically activated in response to CNTs, and what similarities and differences exist in relation to other toxic inducers exerting similar adverse effects. Results A systems biology approach was applied to reveal complex interactions at the molecular level in mouse lung tissue in response to different fibrosis inducers: two types of multi-walled CNTs, NM-401 and NRCWE-26, and bleomycin (BLM). Based on mRNA gene expression profiles, we inferred gene regulatory networks (GRNs) to capture functional hierarchical regulatory structures between genes and their regulators. We found that activities of the transcription factors (TFs) Myc, Arid5a and Mxd1 were associated with the regulation of cytokine transcription in response to CNTs, while in response to BLM treatment, Myc was associated with p53 signaling. TF Litaf was identified as the essential regulator for noncanonical signaling of TLR2/4 driven by CNTs. Despite the different nature of the lung injury caused by CNTs and BLM, we identified common stress response modules, that included DNA damage (TFs: E2f8, E2f1, Foxm1), M1/M2 macrophage polarization (TF: Mafb), Interferon response (TFs: Irf7, Stat2 and Irf9) for all agents. Conclusions These results suggest that the reconstruction and analysis of TF-centric gene interaction networks can reveal key targets and regulators of cellular stress responses to toxic agents.

contribution to collagen deposition in ECM, TGFβ1 also plays a substantial role in the regulation of inflammation, macrophage recruitment, and the initiation of epithelialmesenchymal transition [8]. In addition to TGFβ1 signaling, other signaling pathways, such as the p53 DNA damage response and NF-κB inflammatory pathways are also implicated in the CNT-induced fibrosis-related effects.
The biological response to CNT exposure has many similarities with the responses observed for bleomycin (BLM) treatment, that is widely used as a classic model of inducing lung fibrosis [9]. BLM is a chemotherapeutic drug that causes DNA strand breaks via oxidative mechanisms and can induce lung fibrosis as a severe side effect in human patients. A single BLM dose initiates an acute inflammatory phase in lung tissue characterized by infiltration of immune cells, release of pro-inflammatory cytokines, and the increased presence of myofibroblasts [10]. In rodents the fibrotic phase is initiated seven days after BLM instillation with increased expression of pro-fibrotic cytokines and increased fibroblast proliferation and collagen accumulation [10]. Elevated levels of the active form of TGFβ1 are detected in both acute and fibrotic stages. However, there are differences in endpoint effects of BLM and CNTs. For CNT exposure, genotoxic and pro-fibrotic responses together with immunomodulation components prevail, leading to chronic inflammation, fibrosis and possibly cancer [11].
A large number of studies have previously investigated the toxic effects of CNTs on lungs.
These studies identified the genes and proteins which are influenced by various types of CNTs, as well as the pathways and biological process, in which these genes participated [12,13]. However, important questions remain unanswered due to a lack of mechanistic understanding of how CNTs influence gene expression in lung cells. In order to predict CNT toxicity, it is important to understand which transcriptional programs are specifically activated in response to CNTs, and what similarities and differences exist compared to classic fibrosis inducers, such as BLM. To better understand and then prevent CNT induced fibrosis, we need to know whether all CNTs initiate fibrosis related processes via a shared mechanism or through unique mechanisms specific to their physico-chemical characteristics. Here, we apply a systems biology approach to reveal complex interactions at the molecular level in mouse lung tissue in response to different fibrosis inducers: two types of multi walled carbon nanotubes (MWCNTs) NM-401 and NRCWE-26 (that have different physico-chemical characteristics), and bleomycin (BLM). We inferred gene regulatory networks (GRNs) to capture functional hierarchical regulatory patterns connecting genes and their regulators. Using reverse engineering techniques and gene expression profiles from mRNA high-throughput experiments, we identified both gene regulators and their targets, which have key roles in pro-fibrotic progression for both MWCNTs and bleomycin-induced responses. These genes orchestrate specific transcriptional responses to MWCNT and bleomycin instillation. This integrative framework can also be applied to investigations of gene regulatory programs activated in response to different type of toxic agents.

Results
To uncover the mechanisms of the pulmonary response to CNT and BLM exposure, transcriptomics data of previously published microarray experiments for MWCNTs NM-401 and NRCWE-26 [6] and BLM [9] were assessed. The gene expression profiles for two types of multi-walled carbon nanotubes (GSE55286), NM-401 (long and stiff) and NRCWE-26 (short and entangled), and BLM (GSE40151) were obtained from the Gene Expression Omnibus Database (https://www.ncbi.nlm.nih.gov/geo/). Data for CNTs and BLM were generated using Agilent SurePrint G3 Mouse GE 8x60K Microarrays and Affymetrix Mouse 430 2.0 arrays, respectively, with RNA isolated from mouse lung tissues. CNTs were instilled at three different doses of (18, 54, 162 μg), and gene expression was determined at three timepoints (1, 3, 28 days) post-instillation. BLM was administered in one dose (2U/kg body weight), and the lung tissue was harvested at 7 post-instillation timepoints (1, 2, 7, 14, 21, 28, 35 days). Both types of experiments were conducted with vehicle controls for each timepoint.
An overview schematic of our analysis pipeline is represented in Figure 1. First, using the reverse engineering algorithms and the transcriptomics data, we inferred a gene regulatory network for each agent. The gene interactions were identified for a combined list of DEGs derived from all experiments. Next, we identified network modules using the GLay clustering algorithm [14]. For each module, we performed pathway enrichment analysis to identify biological functions of genes in the module. To reveal TFs and signaling pathways directly associated with fibrosis development, we next analyzed subnetworks based on fibrosis markers and their direct regulators extracted from the whole networks. To infer gene interactions from measured transcriptomics data, reverse engineering algorithms were applied. Using the clustering method, network modules were identified. For each module, pathway enrichment analysis was performed to identify the biological functions of the genes in the module. To reveal fibrosis-associated TFs and signaling pathways, subnetworks based on fibrosis markers and their direct regulators extracted from the whole networks were analyzed. The subnetworks are represented by bold red lines.

Inferring gene regulatory networks
Gene regulatory networks (GRNs) can provide useful insights into transcriptional regulatory mechanisms. GRNs inferred from the expression data can suggest which transcription factors (TFs) are responsible for the changes in gene expression observed following the exposure to two types of nanoparticles (NM-401 and NRCWE-26) and BLM. GRNs have hierarchical structures where a few highly interconnected genes, usually TFs, are the hubs that account for most interactions. It was previously shown that combining multiple inference methods increases the accuracy of reconstructed GRNs [15]. Therefore, we apply three different algorithms in order to infer GRNs: 1) linear model of gene regulation with Bayesian variable selection [16]; 2) mutual information algorithm ARACNe-AP [17]; and 3) random forest based algorithm GENIE3 [18]. To improve the prediction accuracy, we integrated the results of all three algorithms using the Borda count ranking method [15]. The list of TFs was used from AnimalTFDB database [19], and target genes were selected based on the combined list of DEGs from CNTs and BLM experiments. Analysis of DEGs was performed using the limma package in R/Bioconductor [20]. Genes were considered significantly differentially expressed if they (i) showed expression changes of at least ± 1.5fold for CNTs or BLM treated groups compared to non-treated controls for each experimental condition; and (ii) have FDR p-values ≤ 0.05. The inferred networks characteristics are generally similar with clustering coefficients indicating that all GRNs are well-connected, small world networks (Table 1). However, the NM-401 GRN has a much larger diameter and a characteristic path length, suggesting that the activated regulatory program in response to NM-401 treatment might be more complex. The visualization of the networks is shown in Figure 2, and the cytoscape file can be found in the Additional file 1. Next, in order to prioritize gene regulators, we identified TFs with the largest numbers of connections for each network, which will subsequently be referred to as TFs hubs. These network topological features are widely used in the analysis of GRNs and such types of genes are considered important in the cellular regulatory program [21,22]. TFs were ranked based on their connection numbers (Additional file 2

Analysis of biological functions of genes in GRN modules
One of the known properties of GRNs structure is colocalization of genes from the same biological processes in the same network cluster [23]. We used this feature for a subsequent analysis of biological processes controlled by a TF. The inferred networks displayed modular structures ( Figure 2, Additional file 1). To identify modules from these networks, the GLay clustering method in Cytoscape was applied [14]. Visualization of the clustered networks with upregulated and downregulated DEGs for different time points is presented in Additional file 3.
To identify the signaling pathways and functional processes which were altered in each module of the networks following instillation of CNTs and BLM, over-representation analysis was performed using gProfileR and ReactomePA toolkits [24,25] and the KEGG, REACTOME and GO databases (see Additional files 4, 5, 6). The functional annotation analysis was performed for each time point and for upregulated/downregulated genes separately where the altered pathways had an adjusted p-value < 0.005. Instillation of CNTs and BLM affected various physiological and pathological mechanisms, such as the immunomodulatory response (innate and adaptive), response to DNA damage/integrity, pathways involved in cell-cell interactions and cell adhesion, and activation of regeneration processes, suggesting an involvement of these processes in the adverse effects observed upon CNTs and BLM instillation. We mainly focused on inflammation and fibrosis-related pathways, with the findings summarized in Table 2 and Additional file 7. cells which provide airways clearance from mucus and dirt. Attenuation of ciliogenesis in response to CNTs has been shown previously in bronchial epithelial cells [26].
All doses of NRCWE-26 inhibited DEGs from the cardiac muscle contraction pathway on day 3, while in the case of BLM this inhibition began at day 14 and was maintained even after day 28. The identified genes belonged to the actin and myosin family of genes, which are involved in the formation of the actomyosin cytoskeleton of non-muscle cells. Disruption of this structure in phagosome cells can facilitate engulfment processes [27]. TF Tbx20 was identified as TF hub in this module.  1 3 28 1 3 28 1 3 28 1 2 7 14 21 28  Other signaling pathways which were altered during the later time period included phagosome and osteoclast differentiation (KEGG). DEGs from these pathways were shown to be upregulated from the earliest time period and were continuously activated up to the 28 th day (NM-401) and 35 th day (BLM). Based on gene ontology analysis, the enriched genes from these pathways belong to the Fcgamma receptor (FCGR) family, ATPases and macrophage scavenger receptors. Enrichment analysis using the REACTOME database also revealed alteration of the FCGR signaling pathway. It is conceivable that treatments activated FCGR-dependent phagocytosis processes. The FCGR signaling is activated by immunoglobulin G (IgG) molecules [27] and plasma protein serum amyloid P (SAP) [28].
IgG antibodies can be contained in the protein corona of nanoparticles and act as opsonins that by binding to phagocyte FCGRs initiate the re-organization of the actomyosin cytoskeleton, membrane remodeling, and eventually phagocytosis [27]. Furthermore, the FCGR can be activated by SAP protein during clearance of apoptotic cells [28]. Likewise, the same module of BLM network consisted of upregulated DEGs (from day 7) from the complement and coagulation cascades pathway, which also can enhance opsonization.

. Visualization of DEGs in the NRCWE-26 GRN. The red and blue colors represent the upregulated and downregulated DEGs, respectively, for high doses (on the left) and middle doses (on the right) at the day 3 timepoint.
Taken together, our results indicate that instillation of these CNTs and BLM activated various physiological and pathological mechanisms, such as inflammation, response to DNA damage, alteration in ECM synthesis and activation of the regeneration process, thus highlighting the involvement of these processes in the adverse effects observed upon CNTs and BLM instillation. As indicated by the data for later time period responses, NM-401, as well as BLM, led to remodeling of the extracellular matrix and active phagocytosis.
Interestingly, NRCWE-26 induced biphasic effects: high doses of NRCWE-26 activated the innate immune response, while middle doses triggered the adaptive immune components.
Moreover, middle doses of NRCWE-26 induced the interferon response, while high doses did not alter the genes from this module (see Figure 3). Heatmap analysis of DEGs from the Interferon signaling pathway (where the list of genes was derived from REACTOME Interferon signaling pathway, R-MMU-913531) revealed the same effect for NRCWE-26 exposure (Additional file 8).

Analysis of gene regulators
In the previous section, we identified main TF hubs for functionally annotated modules.
These gene regulators were associated with the following different responses: Arid5a, Mxd1 for interleukin signaling; Irf7 for interferon response; E2f8, E2f1, Foxm1 for cell cycle and DNA damage response; Trp73 for cilium regulation; Mafb for phagocytosis and Fcgamma receptors activation; Tbx20 for actomyosin cytoskeleton remodeling. It was, however, unclear, whether these regulators were important for the main pathological effect of CNTs and BLM treatment such as pro-fibrotic responses.
To gain more insight into the role of gene regulators during the activation of fibrotic processes, we next focused on specific subnetworks within the GRNs. Results were obtained by using 87 genes, which were previously identified to be important for fibrogenesis and tissue remodeling in response to different types of CNTs [13,29]. This  Table 2, Additional file 7). Myc activity is essential for cell cycle progression, apoptosis and other biological processes. Myc has cross-regulatory interactions with cytokines, including Il1, Il2, Il4, Il6, Il8, Il10, and TNF-α [30]. In NM-401 GRN Myc colocalized with Arid5a, which had also the high number of connections in the subnetworks (see Table 3). The TF Arid5a has a role in the posttranscriptional regulation of IL6. Arid5a controls Il6 mRNA stability and protects Il6 mRNA from regnase-1-mediated degradation [31]. Importantly, Arid5a is regulated by the NF-κB and MAPK signaling pathways, which in turn are activated by Tolllike receptor 4 [32]. Another TF with high number of connections in CNT subnetworks was Mxd1, also known as Mad. This TF is involved in MYC regulation.
Heterodimerization of MYC with MAX is necessary for activation of MYC target genes. The protein MAD, which is encoded by Mxd1 gene, competes with MYC for binding to MAX and thereby inhibits MYC activity [33]. Analysis of the log fold change (logFC) values for Myc, Max, Mxd1 and cytokines showed that the largest Myc to Mxd1 ratio (ΔlogFC 1.66) was observed for day 3 for high dose of NRCWE-26 (Additional file 9).
Other TFs with a high number of connections in all subnetworks were identified. Litaf regulates the expression of cytokines, pro-inflammatory and pro-fibrogenic genes [34][35][36].
Transcription of Litaf can be induced by tumor suppressor p53 [37] and Toll-Like receptors (TLR 2/4) [38]. In the CNT networks Litaf is directly connected with Cd14 and Myd88 genes (see Additional file 10), which encode toll-like receptor interacting proteins. The TF Irf7 from the PRR-interferon GRN module (as mentioned above) also had high degree of connectivity in all subnetworks, especially at early time points (Table 3). This TF plays essential role in the activation of the viral defense system via triggering type I interferon pathway [39]. TFs Mafb and Batf3 had high connectivity degree in NM-401 and BLM subnetworks at late time points (Table 3). Mafb was identified in a module associated with FCGR activation and phagocytosis (as mentioned above), which was altered also at late time points in the NM-401 and BLM networks. Mafb can enhance phagocytic activity of macrophages by stimulating Fcgr3 [40] and has a key role in the activation of anti-inflammatory profile of macrophages by inducing M1/M2 macrophage polarization, which is important for fibrosis development [41,42]. Macrophages produce numerous cytokines, chemokines, matrix metalloproteinases, and other inflammatory and ECM remodeling mediators. Macrophages can be transformed by external stimuli into different types: M1 (pro-inflammatory) and M2 (anti-inflammatory) subtypes. The other TF, Batf3, is involved in formation of CD103 + and CD8 + dendritic cells that may facilitate lung fibrosis [43,44]. In line with this result, liver fibrosis was attenuated in Batf3 −/− knockout mice [45]. The TF Srebf2, also known as Srebp2, was increasing its connectivity over time in the CNT subnetworks for high doses of these agents. Srebf2 induces the expression genes that are involved in cholesterol and fatty acid synthesis, cholesterol transport [46,47] and in the formation of lipid-laden macrophages (foam cell) [48], which are associated with lung fibrosis [49]. Surprisingly, TFs from the cell cycle and DNA damage module, which included dozens of upregulated genes in all networks (as mentioned above), were weakly represented in the CNTs subnetworks. By contrast, analysis of the BLM subnetwork showed a high number of connections for these TFs. For instance, in the BLM network E2f8, E2f3 and Prrx2 had high degrees of connectivity up to late time points. Thus, the activity of genes from Cell cycle and DNA damage module was mainly associated with fibrotic changes in response to BLM.

Discussion
Transcription factors regulate stress-induced biological processes and determine the landscape of toxicological response [50][51][52]. Characterization of the gene regulatory programs activated in response to nanoparticle treatment can improve risk assessment and potentially aid to build predictive toxicity models. Here, we applied reverse engineering techniques for reconstructing gene regulatory networks based on microarray mRNA data from the lungs of mice exposed to MWCNTs (NM-401, NRCWE-26) and BLM. Previous studies have demonstrated the power of this approach to identify molecular biomarkers and drug targets in cancer and other diseases [53], the deduction of adverse outcome pathways (AOPs) for chemicals from high-throughput transcriptomic data sets [54,55], and the functional interpretation of responsive modules from gene expression data sets [56].
Additionally, based on prior knowledge of interactions between genes and a predefined list of disease-associated markers, the pipeline for detecting AOP-linked molecular pathway descriptions has been described [57].
Since TF activity is dependent on coactivators and because a single protein can be involved in different biological processes, as the first step, we identified functional roles of TFs using so-called "guilt by association" approach [23,58]. The inferred whole networks were clustered and functional enrichment analysis was performed for each gene module.
Additional analysis applied to the whole networks is a deduction of gene regulators associated with previously identified fibrosis markers. For this purpose, we inferred a subnetwork based on 87 genes (which were previously identified as important for fibrogenesis and tissue remodeling in response to different types of CNTs [13,29]) and their direct regulators identified in the whole networks (see Figure 2). Arid5a controls Il6 mRNA stability and protects Il6 mRNA from regnase-1-mediated degradation [31]. Importantly, Arid5a-deficient mice display reduction of BLM-induced lung injury [59]. Arid5a, together with the TF Myc, colocalized in the CNT networks with cytokinerelated genes, mostly appearing in the first phase of inflammatory response. Myc also was identified as a TF hub in whole CNT networks and Myc had one of the highest number of connections in the fibrosis-related subnetwork. This finding is in line with a recent report that upstream analysis identified Myc as a key regulatory gene, activated in response to both NM-401 and NRCWE-26 treatments [6]. Myc may also be involved in fibrogenetic processes. Myc can regulate and be itself regulated by several cytokines [30], which stimulate the expression of pro-fibrotic genes. On the other hand, among Myc's transcriptional targets are integrins, which are transmembrane, cell signaling proteins that control cell surface adhesion, cell-matrix and other processes [60,61]. Integrins can mediate the activation of the latent form of TGF-β [61][62][63], which is considered as a key regulator of fibrosis. Myc can directly bind to the promoter of integrin αv and induce its transcription [61], thereby promoting the activation of the latent form of TGF-β. Activation of Myc target genes depends on its cofactors. Heterodimerization of Myc with Max is necessary for activation of Myc target genes. The protein Mad, which is encoded by Mxd1 gene, competes with Myc for binding to Max and thereby inhibits Myc activity [33]. In our analysis, Mxd1 was also identified as a crucial TF. Therefore, we hypothesize that the balance between the fold change value of Myc and Mxd1 can be used for the prediction of Transcription factor Irf7, the important player in the innate immune response and the activator of the viral defense system via triggering type I interferon pathway [39], was identified in this analysis as a critical gene regulator in response to both CNTs and BLM.
The most interesting finding was that NRCWE-26 inhalation showed a biphasic effect: middle and low doses of NRCWE-26 induced the interferon response, while high doses did not alter genes from this signaling pathway. A possible explanation for this effect may be that in high concentrations NRCWE-26 nanoparticles can form agglomerates (clots), which are sensed by immune cells in a different manner than distinct nanoparticles.
Another transcription factor that has been identified as an inflammation-related gene regulator for CNTs and BLM treatment is Litaf, which controls one of the alternative downstream signaling pathways of TLR2/4 [38]. These types of toll-like receptors control interferons and cytokines expression and are involved in cellular stress response upon action of exogenous or endogenous ligands, such as bacterial LPS and group of proteins from damage-associated molecular pattern [64]. Data from several sources have identified the activation of TLR2/4 signaling in response to nanoparticles [65,66] or BLM treatment [67]. Litaf was upregulated during acute inflammation phase and had direct links with fibrosis-related genes in CNT and BLM networks for the later time period.
Our findings further support the idea of a pivotal role of macrophage polarization in the cellular response to nanoparticles and BLM [12,68]. We identified the TF Mafb as a potential driver of this activity. Mafb was identified in the networks associated with Fc gamma receptors expression (as mentioned above ), which can define pro-or antiinflammatory profile of immune cells [69] and also can be involved in regulation of M1/M2 macrophage polarization [70].
We have also shown that cell cycle and DNA damage signaling was altered in response to all agents. TFs E2f8, E2f1, Foxm1 were identified as important regulators of this stress response signaling. It is interesting to note that cell cycle and DNA damage module was mainly associated with fibrotic markers in the BLM network. These findings further support the idea of different nature of activation of cell cycle and DNA damage pathways in response to BLM and nanoparticles. BLM can directly induce single-and double-stranded DNA breaks [71], while NM-401 and NRCWE-26 nanoparticles induce mainly an inflammation response, which can cause DNA damage. Previous studies have reported the absence of DNA strand breaks for these nanoparticles [11,13].

Conclusions
In the present study we revealed the landscape of transcriptional regulation of responses to CNTs and BLM. This analysis identified common and distinctive features for each agent. In summary, we developed the TF-centric pipeline used to reveal gene regulators, their associated biological processes and signaling pathways, which were altered in response to CNTs and BLM. This method uses transcriptomics data, generates specific to toxic agent interaction networks and is independent from bias in the reference databases for pathway mapping. Moreover, this approach can be useful for generating toxicity pathways and adverse outcome pathway schemes for toxic agents.
DEG analysis was performed using the limma package in R/Bioconductor [20]. The list of genes was considered as significantly differentially expressed if the expression changes were equal to or larger than ± 1.5-fold for nanoparticles or bleomycin treated group compared to non-treated controls for each experimental conditions and the BH-adjusted (FDR) p-values were less than or equal to 0.05 (p ≤ 0.05).

Gene Regulatory Network inference
In order to infer gene regulatory networks for two types of MWCNTs and BLM, we applied three different algorithms: linear model of gene regulation with Bayesian variable selection [16], the mutual information algorithm ARACNe-AP [17], and the random forest based algorithm GENIE3 [18]. The gene interactions were identified for the combined list of DEGs derived from all experiments. The predefined list of gene regulators (TFs) was used from the AnimalTFDB database [19]. Next, for improving prediction accuracy, we integrated the results of all three algorithms by Borda count ranking as described previously [15]. The ARACNe-AP algorithm was run with three key steps: MI threshold estimation, bootstrapping/MI network reconstruction, building consensus network (only significant interactions are filtered, p < 0.05, Bonferroni corrected). Bayesian variable selection and GENIE3 algorithms were run with default parameters [16,18]. Top 3% of the ranked edges in each common network were selected for subsequent analysis. Network visualization, parameter analysis were performed by open source software platform Cytoscape version 3.4.0 [72]. The integration of initial GRNs, data processing, statistical analysis were performed with R version 3.3.3 (https://www.r-project.org/) and RStudio version 1.0.44 (https://www.rstudio.com). To identify GRNs modules, the GLay clustering method in Cytoscape was applied [14]. For functional annotation of GRN modules we used KEGG [73], REACTOME [74], GO [75] databases and gProfileR, ReactomePA toolkits [24,25]. A