Abstract
Endophytic microorganisms play important physiological functions in plants and animals. In this paper, we developed a method to obtain endophytic microbiome information directly by analyzing transcriptome sequencing data files of plants and animals. Compared with the use of amplicon analysis or whole-genome sequencing of animal and plant tissues to analyze microbial composition information, this method can obtain endophytic microbiome information in addition to obtaining gene expression information of host plants and animals.
Introduction
Microorganisms are almost ubiquitous, such as rivers, lakes, seas, soils, air, body surfaces, and insides of humans, animals, and plants. Many microorganisms, including bacteria, archaea, and fungi, can live in the tissues of plants. Endophytes are microorganisms that are present in various tissues of the host plant in a symbiotic or beneficial manner or without causing any detrimental effects (Ali, et al., 2021; Kaul, et al., 2016). A few endophytes can enhance the host plant tolerance to environmental abiotic stresses, such as thermotolerance (Marquez, et al., 2007; Shekhawat, et al., 2021), while others can promote plant growth through the production of phytohormones, solubilization of phosphorus and potassium, biological fixing of nitrogen, inhibition of ethylene biosynthesis (Fadiji, et al., 2021; Lata, et al., 2019). In addition, many endophytes can protect plants from microbial pathogens by the production of ammonia, hydrogen cyanide, and siderophores (Ali, et al., 2021; Carrion, et al., 2019; Lata, et al., 2019). Pathogenic fungi and symbiotic fungi can also penetrate plant cells for absorption or exchange of nutrients (Han, et al., 2019; Han, et al., 2020).
Current endophytic microorganisms detection methods mainly including culture-dependent and culture-independent methods (Dissanayake, et al., 2018; Tian, et al., 2019). Several culture-independent methods, such as Real-time PCR, In situ hybridization FISH, microarray, metabarcoding approaches, whole plant genome DNA sequencing have been used (Aslam, et al., 2017; Dissanayake, et al., 2018; Fadiji, et al., 2021; Fadiji, et al., 2021), and the mystery of a large number of plant endophytes was revealed (Aslam, et al., 2017; Matsumoto, et al., 2021; Trivedi, et al., 2020). The use of whole plant tissues for DNA extraction for amplifying 16S rDNA, fungal ITS area, or amplifying specific target genes and then sequencing with the second or third-generation sequencing technology is the often-used method. Another next-generation sequencing method is the direct sequencing of the whole genomic DNA from plant tissues and analysis using microbiome analysis software to generate endophytic microbiome information.
In this paper, a new method for mining endophytic microbiome information from plant transcriptome data was described. Briefly, after the removal of adaptor and low-quality bases, the clean data were mapped to the host plant genome or all cDNA files to generate unaligned files. The unaligned files were further analyzed by using a microbiome analysis pipeline to obtain plant endophytic microbiome information.
Materials and Methods
Endophytic microbiome mining method
The Workflow diagram is shown in Figure 1. The animal or plant transcriptome raw data were assessed via fastqc, and then cleaned via trimmomatic with the default parameter. The clean data were mapped to the host genome by using hisat2 with the default parameter. Use its output unaligned sequence parameter “--un-conc” to obtain an unaligned data file, which contains not only a part of the host animal and plant sequences but also expressed microbial gene fragments. Use microbiome analysis piplines, such as Kraken2, bracken and Pavian (Breitwieser and Salzberg, 2020; Lu, et al., 2017; Wood, et al., 2019), to analyze the unaligned data files to generate endophytic microbiome information.
Data sets
We selected the RNA-seq data of two plants inoculated or infected by microorganisms.
(1) Maize inoculated with arbuscular mycorrhiza (AM) fungus data sets
In our previous study, the maize (Zea mays) roots inoculated with AM fungus Rhizophagus irregularis DAOM-197198 (previously known as Glomus intraradices) were compared with the control without fungal inoculation (Han, et al., 2019). The maize raw RNA-Seq data (Bioproject accession: PRJNA553580) were used in this study. The maize inbred line B73 genome was download from Ensembl Plants.
(2) Common bean infected by Xanthomonas data sets
In the study by Foucher et al. (2020), a common bean (Phaseolus vulgaris L.) susceptible genotype (JaloEEP558) was infected by Xanthomonas phaseoli pv. phaseoli strain CFBP6546R 48 h after inoculation (Foucher, et al., 2020). The raw RNA-Seq data of Common bean (SRA accession: SRP273448) were used. The Common bean genome was download from Ensembl Plants.
Results and discussion
Generation the unaligned files from the host RNA-seq data
Approximately 18-28% of maize RNA-seq data cannot be mapping to the maize genome (Table S1), while about 5-11% of common bean RNA-seq data cannot mapping to the common bean genome (Table S2). The unmapping rates from maize inoculated with AM fungus are higher than that of the control, similar results were also observed in the common bean RNA-seq data (Figure 2).
Endophytic microbiome information
As an example, in addition to maize root transcriptome data, the transcriptome data also included fungi, bacteria, archaea, and viruses, and similar results were found in the common bean transcriptome data (Figure 3). Further analysis can be done to obtain the content relationships between various types of microorganisms (Figure 4). The microorganisms in the maize root inoculated with AM fungi were mostly fungi, whereas microorganisms in common beans were dominated by bacteria (Figure 4).
In our previous study, the maize seedling was inoculated with R. irregularis DAOM-197198 to investigate the regulatory network responsive to AM fungi colonization in maize roots. The sankey diagram clearly shows that a large number of R. irregularis can be detected in the maize inoculated with AM fungus while the control was not detected (Figure 3). Similarly, X. phaseoli can be detected in the sample inoculated with X. phaseoli while the control was not detected (Figure 3). Species R. irregularis is the most significant biomarker in the maize roots inoculated with AM fungus R. irregularis (Figure 5), while Genus Xanthomonas is one of the most significant biomarkers in the common bean sample inoculated with X. phaseoli pv. phaseoli strain CFBP6546R (Figure 6). The evidence shows that the method of mining endophytic microbiome information from plant transcriptome data is reliable, and this method is also applicable to animal transcriptomes.
Conclusions
We developed a method of mining endophytic microbiome information from plant or animal transcriptome data, which is reliable and useful in investigating microbiome information while investigating the gene expression information of the host plant or animal.