Comparative transcriptome analysis reveals higher expression of stress and defense responsive genes in dwarf soybeans obtained from the crossing of G. max and G. soja

Plant height is an important component of plant architecture and significantly affects crop breeding practices and yield. Dwarfism in plants prevents lodging and therefore it’s a desired trait in crops. To find differentially expressed genes to classify and understand the regulation of genes related to plant growth in mutant dwarf soybeans, which appeared in the F5 generation. We obtained a few segregated dwarf soybeans in the populations derived from the crossing of Glycine max var. Peking and Glycine soja var. IT182936 in an F5 RIL population. These dwarf soybeans may be useful genetic resources for plant breeders, geneticists and biologists. Using the Illumina high-throughput platform, transcriptomes were generated and compared among normal and dwarf soybeans in triplicate. We found complex relationship of the expressed genes to plant growth. There were highly significantly up-/downregulated genes according to the comparison of gene expression in normal and dwarf soybeans. The genes related to disease and stress responses were found to be upregulated in dwarf soybeans. Such over-expression of disease resistance and other immune response genes can be targeted to understand how the immune genes regulate the response of plant growth. In addition, photosynthesis-related genes showed very low expression in dwarf lines. The transcriptome expression and genes classified as related to plant growth may be useful resources to researchers studying plant growth.


Introduction
Dwarfism is a desirable trait in crop plants, as it prevents lodging and, in most cases, makes the plant study [1] . Dwarfism was one of the major contributors to the 'green revolution', which brought about a boom in agricultural production [2]. However, plant dwarfism cannot always be linked to increased yield. Plants sacrifice their growth for different defense responses [3]. Several factors may affect plant height such as light and soil nutrients [4], drought [5,6] , waterlogging [7] , floods [8] , cold temperatures [9] , infection [10,11], and herbivory [12]. . In soybeans, some high-yielding and lodgingresistant dwarf cultivars have been developed [13][14][15][16] ; however, there has not yet been any comparative transcriptome study dissecting the genetic factors involved in dwarfism in soybean recombinant inbred lines (RILs).
Dwarfism in plants is basically due to the cessation of cell elongation [17] and inhibited cell division [18] brought about by one or more of the aforementioned factors. Optimum levels of plant growth hormones (PGRs), particularly, gibberellin and auxin, and their interactions are known to play important roles in plant growth [19,20] . Interestingly, plants under defense pressure are known to redirect their resources to produce one or more of the defense response metabolites (salicylic acid (SA), jasmonates (JA) etc.), thereby reducing overall growth [3]. A transcriptome analysis reported that multiple genes involved in hormone biosynthetic pathways were influenced in dwarf soybean mutants [21]. In another study, knockout of a gene of the peroxidase superfamily reportedly brought about dwarfism in soybean plants [22].
In the present study, we performed a comparative transcriptome analysis of three normal and three dwarf soybean F5 RILs derived from the cross between G. max and G. soja. Numerous differentially expressed genes (DEGs) were identified using different bioinformatic tools, and putative associations between major DEGs were established and discussed.

RNA isolation and cDNA synthesis for RNA sequencing
The total RNA was purified from leaves of each of the six RIL lines (N1-3, D1-3) three weeks after germination using the RiboPure kit (Applied Biosynthesis, Foster City, CA, USA) following the manufacturer's protocol. DNA digestion was performed with DNase Ⅰ (Sigma, St. Louis, MO, USA), and the total RNA was quantified by absorption of light at A260 and the quality checked by analyzing the ratios A230/260 and A260/280 using a NanoDrop® 2000 Spectrophotometer (Thermo Scientific, Wilmington, DE, USA). Paired-end sequencing was performed with a HiSeq 2500 (Illumina, San Diego, CA, USA) at the National Instrumentation Center for Environmental Management (NICEM, Seoul National University, Seoul, South Korea). The raw reads were cleaned by filtering out adaptor-only nucleotides and trimming of adaptor nucleotides, empty nucleotides (N in the end of reads), short reads (<59 bp) and low quality nucleotides (reads containing more than 50% bases with Q-value ≤20) using trimmomatic [23]

DEG (differentially expressed genes) analysis
The Williams 82 Assembly 2 Annotation 1 (Wm82.a2.v1) coding sequence provided on the SoyBase website (https://soybase.org/GlycineBlastPages/blast_descriptions.php) was used as the reference gene set. The expression value for each gene was calculated as FPKM (fragments per kilobase of transcript per million mapped reads). Hisat2 [24] was used for the mapping, and the number of mapped reads for each unigene set was obtained from the mapping result file for each sample using Samtools [25]. The log2-fold change value was calculated as the normalized FPKM value of the expression level of the sample to be compared for analysis. The up-and down-regulation analyses were performed using Fischer's exact test (p ≤0.05). Additionally, the false discovery rate (FDR) was calculated for the significantly differentially expressed genes (DEGs). When the DEGs showed relatively higher FPKM values in tall RILs, they were regarded as upregulated but when this case occurred in dwarf RILs, the DEGs were regarded as downregulated. For the determination of each standard compound, UV spectra were evaluated at 300 nm. Salicylic acid was purchased from Sigma-Aldrich (Korea). The purities of the standard were above 98%. HPLC-grade methanol and water were purchased from TEDIA (Fairfield, OH, USA). Formic acid (FA) was purchased from DAE JUNG (Korea). Standard stock solutions of salicylic acid (500 μg/mL) were prepared in methanol. These standard stock solutions were diluted to 5 different concentrations by methanol to establish the calibration curves. The 100% metabolic extracts from each of three dwarf and normal F5 RILs and the initial parents were weighed and dissolved in methanol at the rate of 20.00 mg/mL. Before HPLC analysis, all of the sample solutions were filtered through a 0.45-μm membrane independently.

RNA-seq analysis of soybean RILs
Over the generations, RIL lines of G. max and G. soja had tall plants, in most cases, that further segregated to tall and dwarf phenotypes, while dwarf plants consistently produced progenies of their own phenotype (data not shown), giving an initial clue that the dwarf phenotype involves recessive gene/s. The tall plants had longer and thinner internodes, broader leaves and viney growth resembling G. soja, while dwarf plants had shorter and thicker internodes, narrow leaves, and G. max-like upright growth (Fig 2). Other than these traits, almost all morphological characteristics, such as color of seeds and flowers, in the dwarf lines were similar with other lines. To gain deeper insight into dwarfism, a comparative transcriptome analysis was performed between normal and dwarf plants, and significantly up-and downregulated genes were screened. In total, >105 million (M) RNA-Seq reads were generated from the tall and dwarf RILs. After trimming, a total of 65 M high quality reads were obtained ( Table   1).

Screening of differentially expressed genes (DEGs) in dwarf RILs
The FPKM values of each transcript for each of the 6 RIL libraries were calculated. Differential expression was observed in 22,897 genes ( Fig 3A), although only 1,265 were found to be significantly differentially expressed (p<0.05) ( Supplementary Fig 1 and Supplementary table T1). When the RILs were ranked from 1 to 6, N2 and N3 showed a somewhat similar number of upregulated (Rank 4 to 6) as well as downregulated (Rank 1 to 3) DEGs. A similar pattern was observed for the D2 and D3 RILs, although tall RILs had more upregulated DEGs ( Fig 3B). All four RILs (N2, N3, D2, and D3) were the descendants of parents with their own phenotype; however, N1 and D1 were newly segregated from a parent of normal phenotype (Fig 1), which might explain the slightly different pattern of DEG ranks compared to their similar phenotype counterparts. Interestingly, N1 still had a slightly higher number of upregulated DEGs compared to downregulated ones while D1 DEGs showed the opposite pattern ( Fig 3B).

Categorization of DEGs according to GO terms
To gain further functional insights, the DEGs were categorically clustered under molecular function, biological process, and cellular component GOs, which assigned GO terms for only 912, 733, and 797 DEGs, respectively ( Fig 3C). The total number of transcripts with at least one GO term was 1097. After removal of the genes with unassigned function, the DEGs were mainly assigned to ion binding (198 upand 184 downregulated) under molecular function (Supplementary Table T1). However, when the differences in number of DEGs under the same GO term that were upregulated and downregulated were compared, dwarf RILs had the largest positive difference for the DEGs under kinase activity and largest negative difference for those under oxidoreductase. Other major GOs with larger numbers of upregulated DEGs included nucleic acid-binding transcription factor (TF), ion binding, signal transducer, and DNA binding. Similarly, other GOs with lower numbers of upregulated DEGs included molecular function, lyase activity, isomerase activity, transmembrane transporter, and peptidase activity.
However, some GOs such as glycosyl bond hydrolase and methyl transferase had the same number of genes (12 and 10 respectively) in both up-and downregulated groups (Fig 3-A Table T1). A larger portion of the sequences were found to be downregulated in most of the GO terms. Major terms with lower numbers of upregulated DEGs included plastid, thylakoid, protein complex, cellular component, mitochondria, and vacuole Golgi apparatus. Other major terms with higher numbers of upregulated DEGs included plasma membrane, cytoplasm, extracellular region intracellular, and organelle. However, the differences of up-and downregulated genes in GO terms with higher numbers of downregulated DEGs were much wider than those in GO terms with higher numbers of upregulated DEGs (Fig 3-B). Additional clustering of the DEGs under biological process GO terms showed 54 different clusters among which all terms had higher numbers of downregulated DEGs compared to upregulated ones with the exception of only 12 GO terms. While most of the terms possessed few sequences, the GO term with highest DEGs was biosynthetic process (76 up-and 105 downregulated) (Supplementary Table T1). Interestingly, all of the DEGs under the photosynthesis GO term were downregulated. Other GO terms with comparatively higher numbers of downregulated DEGs included precursor and energy generation, small molecule metabolic process, biosynthetic process, cofactor metabolic process, biological process, and carbohydrate metabolic process. In contrast, The GO terms with relatively higher numbers of upregulated DEGs inlcuded response to stress, signal transduction, secondary metabolic process, and vesicle mediated transport. (Fig 3C).

Pathway analysis of the DEGs
Among a total 1,097 DEGs with at least one GO term, KEGG pathway analysis revealed that only 281 were assigned to one or more enzyme IDs of 104 different pathways (Supplementary table T2). Among them, only the pathways with >5 DEGs were selected for further analysis in the present study. Up-or downregulated DEGs under each of these pathways were enumerated to see whether the pathways were positively or negatively affected in dwarf RILs (Fig 4). Even though the antibiotic biosynthesis pathway had the highest number of enzymes, the DEGs number was the highest in the purine metabolism pathway. None of the pathways had all of its genes upregulated; however, all DEGs in the pathways of 'fructose metabolism', 'nitrogen metabolism', 'porphyrin and chlorophyll metabolism', methane metabolism, and oxidative phosphorylation were downregulated. While most of the other pathways also had a comparatively larger proportion of downregulated DEGs (an additional 16 of the 33 pathways), pathways including 'starch and sucrose metabolism', 'glutathione metabolism', 'phenylpropanoid biosynthesis', 'pyruvate metabolism', 'drug metabolism-cytochrome P450', and thiamine metabolism had a relatively higher number of upregulated DEGs. Interestingly, 'thiamine metabolism' had the lowest enzyme-to-DEG ratio with 35 DEGs representing just a single enzyme (nucleoside-triphosphate phosphatase, EC 3.6.1.15).
In summary, the distribution of upregulated and downregulated genes is given in Table 2. It was observed that higher numbers of growth-related genes were downregulated whereas higher numbers of stress-responsive and defense-related genes were upregulated in dwarf soybean (Table 2). With close observation, dwarf RILs had upregulated PAMP (pathogen-associated molecular pattern) and DAMP (damage-associated molecular pattern) receptors such as TIR-NBS-LRR genes, cysteine-rich RLK, protein kinase superfamily, and lectin protein kinase. Expression of MAPK (mitogen-activated protein kinase), multiple WRKY genes, and PR-proteins (pathogenesis-related proteins), the downstream enzymes of stress response, were also upregulated in these plants. However, one group of receptors, LRR family proteins, was found to be downregulated (Table 3). Interestingly, heat shock protein 70, which, in combination with LRR family protein, activates MAPK (mitogen-activated protein kinase) precursors, was significantly upregulated in these lines. In addition to these, dwarf RILs also showed the upregulated expression of NB-ARC domain containing proteins, senescence-associated gene, SNF2, Ca2+ binding EF hands family protein, Ca2+ ATPase 2, early-response to dehydration protein, lycopene cyclase, and glucose-6-phosphate dehydrogenase, among others (Table 3). However, the dwarf RILs also had multiple downregulated genes which were related to photosynthesis including Calvin-Benson cycle, glycolysis, TCA cycle, and chlorophyll metabolism (Supplementary table T3).
In addition to the upregulation of stress-related DEGs and downregulation of primary metabolismrelated genes, some transcripts involved in hormone signaling also showed significantly differential expression in dwarf RILs. The plants showed the downregulation of SAUR-like auxin-responsive proteins and upregulation of auxin response factor 8, auxin-responsive GH3 family protein, auxin signaling F-box, and auxin-associated family protein. Others, such as RGL2 (RGA-like 2, a GA signaling repressor), SKP1, cytokinin-responsive GATA factor, cytokinin oxidase, and ethylene responsive element binding factor were downregulated while DEGs encoding ethylene response factor, AP2/B3-like transcriptional factor, ERD (early-responsive to dehydration) protein, and SNF2 domain containing protein, among others, were significantly upregulated (Fig 5). Moreover,the majority of the uniquely significant DEGs (with all of their versions expressing significantly) were photosynthesisrelated (upregulated in normal RILs) and disease/stress-related (upregulated in dwarf RILs) (Fig 5).

Salicylic acid determination by HPLC
Observing the upregulated state of stress/disease-responsive DEGs in dwarf RILs (Fig 7), we chose to assess the change in plant salicylic acid (SA) content. HPLC analysis was conducted for different tall and dwarf RILs along with their inital parents (G. max cv. Peking and G. soja). The result clearly showed higher SA contents in dwarf RILs compared to their tall counterparts (Fig 6). Additionally, G. soja showed relatively higher SA content than G. max suggesting a relatively highly activated state of SA signaling in the latter. Due to the short stature of the dwarf RILs and putatively activated SA signaling in the plants, the plants were tested for possible infection by soybean dwarf virus. The analysis gave negative results suggesting that the short stature of dwarf RILs was clearly due to genetic segregation and not because of viral infection.

Dwarf RILs have upregulated stress responsive genes
Stresses are known to counteract overall plant growth. In the present study, we found the largest number of disease-responsive genes and stress-responsive genes upregulated in dwarf soybean RILs, suggesting their constitutive expression. However, the KEGG pathway showed that the number of upregulated DEGs involved in antibiotic biosynthesis were fewer than the downregulated ones, indicating comparatively lower immunity of the dwarf RILs. Overexpression of disease resistance and other immune responsive genes has been attributed to produce an overdose effect in plants, thereby activating the downstream defense response pathways and curtailing plant growth by inhibiting protein synthesis [26,27]. Huot et al. [3] also strongly argued for the reduced growth of plants under pathogenic pressure.
The auto-toxicity of constitutively expressed defense-responsive gene products hindering plant growth has also been reported in tobacco [28]. In a separate extensive study on wheat, Heil et al. [29] strongly argued for the tradeoff between the defense response and plant growth because of the energetically costly molecular mechanism. Programmed cell death (PCD) induced by the constitutive expression of pathogenesis-related genes is a general phenomenon in plants [30]. . The present study also showed the significant upregulation of a DCD (development and cell death) domain protein in dwarf RILs, likely in response to the upregulated disease responsive genes. The gene, as its name suggests, is involved in development and PCD in plants [31]. Severe dwarfism was also reported in Arabidopsis thaliana due to the disruption of a MAP3K gene (MEKK1), which led to constitutive expression of pathogenesisassociated genes in the plant [32].

Genes involved in hormonal regulation
Auxin plays a vital role in plant growth and development [33]. Our dwarf RILs showed upregulated auxin response factor (ARF8) and auxin-responsive GH3 family protein. Interestingly, the GH3 gene has been reported to be involved in negative regulation of shoot cell elongation and lateral root formation [34] . Moreover, ARF8 is known to play a role in controlling auxin (IAA) levels through negative feedback by regulating GH3, thereby reducing hypocotyl growth [35]. Additionally, the small auxin-up RNA (SAUR)-like gene was found to be downregulated in the dwarf RILs. As the gene is reportedly regulated by the auxin level in plants and is involved in cell division [36], it is likely that the dwarf RILs have comparatively lower auxin content than their tall counterparts. Another study by ten Have [37] showed resistance to higher IAA concentrations by LRR RLK mutant Arabidopsis thaliana plants and suggested the involvement of LRR RLK genes in auxin signaling. Interestingly, the dwarf soybean RILs in the present study showed down regulation of the LRR family protein (Fig 5) which might have hindered auxin signaling in the plants.

Dwarf phenotype is likely due to a hypersensitive response
Salicylic acid (SA) is a signaling molecule and is generally associated with biotic stress in plants [38,39] . SA is necessary for the plant's systemic acquired resistance (SAR) [40]. The molecule brings about a hypersensitive response (HR) when accumulated in plants with high reaction oxygen species and actively expressed pathogenesis-related (PR) genes [41] . Our study showed that dwarf soybean RILs had higher SA content (Fig 6) in addition to the upregulated states of different pathogen-associated molecular pattern (PAMP) receptor genes (Fig 5A and 5B). As SA signaling in plants under stress precedes ROS bursts in different intracellular compartments [42], it is highly likely that the dwarf RILs also have high ROS. This phenomenon is supported, in part, by the upregulated state of respiratory burst oxidase in the dwarf RILs (Supplementary table T1). The gene is also reported to be involved in disease resistance [43]. Additionally, the comparatively higher SA in G. max relative to that in G. soja suggests the higher expression of PR genes in the former. In general, HR is characterized by the necrosis at the site of pathogenesis [40] . Since there was no apparent pathogenesis in the plants (dwarf RILs), the accumulated SA might instead have contributed to stunting of the plants.

Genes involved in energy metabolism are downregulated in dwarf RILs
Our study showed that dwarf RILs had relatively lower expression of both nitrate reductase (NR) and nitrite reductase (NiR). These genes are closely associated with nitrate assimilation as NR reduces nitrate to nitrite that, in turn, is reduced to ammonia by NiR, and ammonia is finally fixed into carbon [44]. Additionally, the study by Crawford has reported the severely dwarf phenotype of NiR-antisense tobacco plants [44].
In normal healthy plants, glucose is metabolized via glycolysis and the non-oxidative branch of the pentose phosphate pathway (PPP) [45][46][47]. However, in plants under stress, the oxidative branch of PPP is more active which is critical to drive the conversion of glucose 6-phosphate to CO2, ribulose-5phosphate, and NADPH [47]. The present study found the clear upregulation of glucose-6-phosphate dehydrogenase, a key enzyme that mediates the reaction, indicating the possible constitutive stress/defense response in the plants at the molecular level. Moreover, the key enzymes of the Calvin-Benson cycle and chlorophyll metabolism were found to be exclusively downregulated, suggesting a lower rate of primary metabolism in the dwarf RILs. Such global downregulation of genes involved in photosynthesis is common in plants under biotic stress [38]. However, their lower states of expression in dwarf RILs compared to their tall counterparts (even though all plants were grown under the same conditions) also indicate a constitutively active state of biotic defense genes in the dwarf RILs. An earlier study [48] also reported the decreased expression of genes involved in photosynthesis, in contrast to increased expression of defense-related genes, upon herbivory in tobacco. Moreover, overall downregulation of photosynthesis-related genes is common in plants which are under biotic stress as the plants have to re-allocate their resources from growth to defense [38].

Conclusion
We obtained the segregation of RILs in dwarf and normal lines. There were highly significantly up-/downregulated genes in the comparison of gene expression in normal and dwarf soybeans. We observed that dwarf lines exhibited higher expression of stress-and defense-related genes. We also found higher levels of SA in dwarf lines, which may have contributed to their stunted growth phenotype.
Such over-expression of disease resistance and other immune responsive genes were targeted to understand how the immune genes regulate the response of plant growth. In addition, all the photosynthesis-related genes were downregulated in dwarf plants. The transcriptome expression and genes classified as related to plant growth may be useful resources for researchers studying plant growth.       RIL progenies for their salicylic acid (SA) content.