Transcriptome analysis reveals the effects of transgenic expression of the Gal4 protein on normal gene expression in silkworm tissues

The Gal4/upstream activating sequence (UAS) system, a well-known genetic tool, has been widely used to analyze gene function in many organisms, including the silkworm (Bombyx mori), a model lepidopteran insect. Several studies have suggested that Gal4 protein activation in tissues can negatively affect transgenic individuals; however, whether and to what extent the Gal4 protein affects normal endogenous gene expression has rarely been studied. Here, we analyzed the transcriptomes of transgenic silkworms expressing the Gal4 protein at high levels in both the wing disc (WD) and epidermis (EP) and investigated gene expression changes in both tissues. Overall, 24,593 genes were identified in the WD and EP libraries, and 2,025 and 2,488 genes were identified as significantly differentially expressed genes (DEGs) in the WD and EP, respectively, between the transgenic and control groups. These DEGs were further annotated by gene function classification and pathway assessment using public databases. In addition, 506 DEGs were shared between the two tissues. Of these, 97 genes were upregulated in both tissues and 234 were downregulated in both; many of them were annotated to be involved in metabolic processes such as “fat digestion and absorption”, “glycine, serine and threonine metabolism” and “glutathione metabolism” and in signal transduction pathways such as the “Rap1 signaling pathway”, “MAPK signaling pathway” and “Hippo signaling pathway”. Overall, this work enhances understanding of the effects of transgenic Gal4 protein expression on normal gene expression in silkworm tissues and suggests that researchers should pay attention to unexpected effects when using the Gal4/UAS system to study gene function.


The Gal4/upstream activating sequence (UAS) binary expression system, derived from yeast and originally developed in Drosophila [1][2][3], is a powerful genetic tool that allows manipulation of target gene expression in a spatiotemporally precise fashion. Since its first application in Drosophila, the Gal4/UAS system has been widely used to analyze gene function in dozens of organisms, including mice [4] [12]. The Gal4/UAS system has also been employed to develop novel genetic tools, such as the enhancer/gene trap system and the Q system [13][14][15], and it has been combined with genome editing tools for conditional manipulation of gene expression in vivo [16][17].

In recent decades, remarkable progress in gene function analysis has been achieved with the Gal4/UAS system. However, it cannot be ignored that high levels of Gal4 protein are toxic to some degree to cells coexpressing UAS-linked target genes and Gal4. Although some researchers have described and developed novel Gal4/UAS systems with smaller sizes but greater transactivation efficiency than the original system [18][19][20], little attention has been paid to the effects

The wild-type (WT) silkworm strain Nistari, the A4G4 transgenic line, and the UtdTomato transgenic line harboring a UAS-linked red fluorescent protein variant (tdTomato) were maintained in our laboratory. The hatched larvae were reared at 24-28°C on fresh mulberry leaves. WT and A4G4 larvae at day 5 of the fifth instar were selected, and the WDs and EPs were dissected, washed with precooled phosphate-buffered saline and used for subsequent experiments. The WD samples collected from WT and A4G4 larvae were named W5N_S1/S2/S3 and W5A_S1/S2/S3, respectively.

Total RNA was extracted from the WD and EP samples using TRIzol Reagent (Ambion, USA) and examined on a NanoDrop 1000 spectrophotometer (Thermo Scientific, USA) and an Agilent Bioanalyzer 2100 system (Agilent Technologies, USA) for RNA integrity and quality. The qualified RNA samples were purified for poly-A-containing mRNA molecules using poly-T oligo-attached magnetic beads, fragmented into small pieces using divalent cations under elevated temperature and reverse transcribed using random primers. The second-strand cDNA fragments were ligated with index adapters after being purified, end-repaired, and A-tailed. Suitable fragments were used as templates for PCR amplification. After quantification with a Qubit instrument, the PCR products were sequenced on a BGISEQ-500 platform at Beijing Genomics Institute (BGI, China).

FPKM values larger than 10.0, and 30% had FPKM values lower than 1.0 (Fig 2).
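The FPKM cut-offs of 1.0 and 10.0 mentioned above are easier to interpret with the standard definition of FPKM in hand. The following is a minimal sketch of that definition, not the authors' pipeline: the text does not name the quantification software, and the function names and the example numbers are hypothetical.

```python
def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (gene_length_bp * total_mapped_fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def expression_bin(value: float) -> str:
    """Bin an FPKM value using the 1.0 and 10.0 cut-offs mentioned in the text."""
    if value < 1.0:
        return "low (<1)"
    if value > 10.0:
        return "high (>10)"
    return "medium (1-10)"


# Hypothetical gene: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example = fpkm(500, 2000, 20_000_000)
print(example, expression_bin(example))  # prints: 12.5 high (>10)
```

Because FPKM normalizes for both transcript length and sequencing depth, values from the WD and EP libraries can be placed in the same bins even when the libraries differ in size.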

By comparing transcriptome data between the transgenic and WT groups, a number of genes expressed in the WD and EP were identified as significant DEGs (Fig 3). In WD samples, 2,025 genes were identified as DEGs, including 771 upregulated genes and 1,254 downregulated genes (Table S1). In EP samples, a total of 2,488 DEGs were identified, among which 771 genes were upregulated and 1,717 genes were downregulated (Table S2).

function categories. Among the biological process categories, "cellular process" was the main functional group, followed by "metabolic process" and "response to stimulus". Among the cellular component categories, "membrane" was the main functional group, followed by "membrane part" and "cell". Among the molecular function categories, "binding" and "catalytic activity" were the two main functional groups (Fig 4A). In EP samples, 1,621 DEGs were functionally annotated with 15 biological process categories, 14 cellular component categories and 10 molecular function categories.
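The split of DEGs into upregulated and downregulated sets reported for each tissue (for example, 771 + 1,254 = 2,025 in the WD) can be sketched as follows. This is an illustration under stated assumptions: the excerpt does not give the significance criteria that were used, so the two cut-offs below are placeholders, and the gene identifiers and values are invented toy data.

```python
from typing import Iterable, NamedTuple


class GeneResult(NamedTuple):
    gene_id: str
    log2_fold_change: float  # transgenic (A4G4) relative to WT
    adjusted_p: float


# Assumed cut-offs for illustration only; the excerpt does not state the real ones.
MIN_ABS_LOG2FC = 1.0
MAX_ADJUSTED_P = 0.05


def classify(result: GeneResult) -> str:
    """Label one gene as 'up', 'down', or 'not significant'."""
    if result.adjusted_p > MAX_ADJUSTED_P:
        return "not significant"
    if abs(result.log2_fold_change) < MIN_ABS_LOG2FC:
        return "not significant"
    return "up" if result.log2_fold_change > 0 else "down"


def count_degs(results: Iterable[GeneResult]) -> dict:
    """Count genes per label across one tissue comparison."""
    counts = {"up": 0, "down": 0, "not significant": 0}
    for result in results:
        counts[classify(result)] += 1
    return counts


# Toy data with hypothetical gene identifiers.
toy = [
    GeneResult("gene_a", 2.3, 0.001),
    GeneResult("gene_b", -1.7, 0.01),
    GeneResult("gene_c", 0.4, 0.2),
    GeneResult("gene_d", -3.0, 0.30),
]
counts = count_degs(toy)
print(counts)  # prints: {'up': 1, 'down': 1, 'not significant': 2}
```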

The most enriched GO terms in the biological process category were "cellular process", "metabolic process" and "biological regulation". The terms "membrane", "membrane part" and "cell" were significantly enriched at the cellular component level, and the terms "binding" and "catalytic activity" were significantly enriched at the molecular function level (Fig 4B). The top 10 up- and downregulated annotated DEGs as well as the significantly enriched GO terms for the DEGs in WD and EP samples are listed in Tables S3 to S6.

To better interpret the pathways in which the DEGs were involved and enriched, we annotated the DEGs against the KEGG database. Briefly, the DEGs in WD samples were mainly enriched for the "phototransduction - fly", "Influenza A", "Hippo signaling pathway", "fat digestion and absorption", "viral myocarditis", and "oxytocin signaling pathway" terms (Fig 5A). In EP samples, the DEGs were mainly annotated with the "complement and coagulation cascades", "amoebiasis", "tyrosine metabolism", "ECM-receptor interaction", "insect hormone biosynthesis", "Hippo signaling pathway", "axon guidance", and "fat digestion and absorption" pathway terms, among others (Fig 5B).
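Enrichment of a GO term or KEGG pathway among DEGs is commonly assessed with a one-sided hypergeometric test. The excerpt does not state which test was used here, so the sketch below is an assumption about a typical approach and not a description of the authors' method; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of background genes
    annotated:  background genes carrying the term or pathway
    sample:     number of DEGs drawn from the background
    hits:       DEGs carrying the term or pathway
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie between 0 and population")
    if hits < 0:
        raise ValueError("hits must be non-negative")
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 in the pathway, 5 DEGs, 3 of them in the pathway.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=5, hits=3)
print(p_value)
```

In practice one such p-value is computed per term and the set is then corrected for multiple testing before terms are called significantly enriched.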

Considering that the WD and EP are known to be involved in the regulation of wing development in B. mori, the DEGs in both tissues were further analyzed to identify commonalities and differences.

(Table S7). We further focused on the 331 common DEGs, the patterns of which might be influenced by the Gal4 protein in a similar way in these two tissues.
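The 331 common DEGs are the shared DEGs that change in the same direction in both tissues (97 upregulated plus 234 downregulated); by subtraction, the remaining 175 of the 506 shared DEGs change in opposite directions. A minimal sketch of this comparison with set operations is shown below. The function name, gene identifiers, and direction labels are hypothetical toy data.

```python
def compare_deg_sets(wd: dict, ep: dict):
    """Compare two DEG tables that map gene ID to 'up' or 'down'.

    Returns (shared, common_up, common_down, discordant) as sets of gene IDs.
    """
    shared = wd.keys() & ep.keys()
    common_up = {g for g in shared if wd[g] == "up" and ep[g] == "up"}
    common_down = {g for g in shared if wd[g] == "down" and ep[g] == "down"}
    # Shared DEGs that move in opposite directions in the two tissues.
    discordant = shared - common_up - common_down
    return shared, common_up, common_down, discordant


# Toy data with hypothetical gene identifiers.
wd_degs = {"g1": "up", "g2": "down", "g3": "up", "g4": "down"}
ep_degs = {"g1": "up", "g2": "down", "g3": "down", "g5": "up"}

shared, common_up, common_down, discordant = compare_deg_sets(wd_degs, ep_degs)
print(len(shared), len(common_up), len(common_down), len(discordant))  # prints: 3 1 1 1

# Consistency check on the counts reported in the text.
assert 97 + 234 == 331
assert 506 - 331 == 175
```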

First, the 331 DEGs were annotated against the NCBI Nr database, yielding a total of 208 Nr terms that encompassed 124 Nr functions. Genes annotated with more than 2 functions are listed in Table 2.

Next, we mapped all of the genes to terms in the GO database to look for significantly enriched GO terms. Among the 331 DEGs, 72 genes were annotated with 444 GO terms. Among the 60 most enriched GO terms were "multi-organism process", "multicellular organismal process" and "biological regulation" for the biological process category; "organelle part", "cell" and "macromolecular complex" for the cellular component category; and "binding", "catalytic activity" and "signal transducer activity" for the molecular function category (Fig 6B). Overall, 39 common genes were annotated with these GO terms, including 16 upregulated genes and 23 downregulated genes.

Finally, the 331 DEGs were annotated against the KEGG database to better understand the biochemical pathways in which they were involved. Among the 331 DEGs, 119 genes were annotated in 5 main categories. The 20 most enriched KEGG terms were "fat digestion and absorption",