“Integrative Genomic Analysis for the Bioprospection of Regulators and Accessory Enzymes Associated with Cellulose Degradation in a Filamentous Fungus (Trichoderma harzianum)”

Jaire A. Ferreira Filho; Maria Augusta C. Horta; Clelton A. dos Santos; Deborah A. Almeida; Natália F. Murad; Juliano S. Mendes; Danilo A. Sforça; Claudio Benício C. Silva; Aline Crucello; Anete P. de Souza

doi:10.1101/731323

Abstract

Background Unveiling fungal genome structure and function reveals the potential biotechnological use of fungi. Trichoderma harzianum is a powerful CAZyme-producing fungus. We studied the genomic regions in T. harzianum IOC3844 containing CAZyme genes, transcription factors and transporters.

Results We used bioinformatics tools to mine the T. harzianum genome for potential genomics, transcriptomics, and exoproteomics data and coexpression networks. The DNA was sequenced by PacBio SMRT technology for multi-omics data analysis and integration. In total, 1676 genes were annotated in the genomic regions analyzed; 222 were identified as CAZymes in T. harzianum IOC3844. When comparing transcriptome data under cellulose or glucose conditions, 114 genes were differentially expressed in cellulose, with 51 CAZymes. CLR2, a transcription factor physically and phylogenetically conserved in T. harzianum spp., was differentially expressed under cellulose conditions. The genes induced/repressed under cellulose conditions included those important for plant biomass degradation, including CIP2 of the CE15 family and a copper-dependent LPMO of the AA9 family.

Conclusions Our results provide new insights into the relationship between genomic organization and hydrolytic enzyme expression and regulation in T. harzianum IOC3844. Our results can improve plant biomass degradation, which is fundamental for developing more efficient strains and/or enzymatic cocktails for the production of hydrolytic enzymes.

Background

Trichoderma harzianum is a common fungal species in soil and is used as a biological control in a variety of phytopathogenic fungi [1]. However, the use of lignocellulosic biomass degradation is still poorly explored when compared to that of other cellulolytic fungi. Due to the high cellulolytic activity of some strains, T. harzianum has shown considerable potential for application in plant biomass hydrolysis [2–4]. T. harzianum strains have potential for the production of an enzymatic/protein arsenal necessary for the complete hydrolysis of cellulosic compounds in fermentable sugars [5–10].

Currently, the most-studied and widely used industrial-scale enzymes are produced by the fungus T. reesei and species from the Aspergillus genus. These organisms are the source of the majority of enzymes that make up enzymatic cocktails that are available on the market [11]. T. reesei is a widely studied fungus and is found in several works in genomics, transcriptomics, proteomics and metabolic engineering [12–16]. Thus, increasing the number of biotechnological studies related to this bioprocess for T. harzianum is necessary.

The three main groups involved in the hydrolysis of cellulose (CEL) are cellobiohydrolases, endo--1,4-glucanases and -glucosidases. In addition, accessory enzymes such as copper-dependent lytic polysaccharide mono-oxygenases (LPMOs), cellulose-induced protein 1 and 2 (CIP1 and CIP2) and swollenin also participate in this process [17–20].

One of the great challenges in understanding the molecular mechanism of biomass degradation is how the transcription factors (TFs) related to this system act. Several fungal TFs have been identified as related to the degradation of plant biomass, many of which belong to the binuclear zinc family [21]. Many TFs have been described as being directly involved in the regulation of plant biomass [22]. This number has been expanding rapidly in recent years, mainly due to the increase in the sequencing scale of whole genomes and the exponential increase in bioinformatics tools for analysis, which produce massive amounts of information, and in the number of genes identified [22, 23].

The purpose of the present study was to analyze genomic regions with CAZyme genes using a bacterial artificial chromosome (BAC) library that we built [24] and to integrate these data with RNA-seq, secretome data and coregulation networks. We sequenced a massive amount of DNA and used it to integrate genomic data (genomic regions containing CAZymes), expression patterns (the transcriptome under degradation conditions), proteins (the secretome by mass spectrometry) and systems biology (with gene regulatory networks) to obtain a broad and precise overview of the CEL degradation pathways. Based on our study, we characterized the main genes, accessory enzymes and regions involved in the degradation and regulation process of hydrolytic enzymes. In addition, we analyzed the regulator cellulose degradation regulator 2 (CLR2) found in a cluster with other important enzymes. These results will be important for further studies of regulation and gene silencing.

Results

Genomic regions of T. harzianum IOC3844

In this study, a library of large genomic regions was used as a resource to search for genes of interest and to thoroughly study the genomic structure of T. harzianum IOC3844 (ThIOC3844) (accession numbers MK861589-MK861650 - Supplementary Table S1 and Fig. S1). Screening for genes of interest resulted in a total of 62 regions that contained CAZymes genes related to the degradation of plant biomass in the ThIOC3844 genome. Sequencing of these regions generated a total of 5 Mb of the estimated 40 Mb genome (Supplementary Table S2 and S3). These regions ranged in size from 43 to 152 kb, enabling the prediction and annotation of 1676 gene models for this strain (Supplementary Table S4). The average number of genes per region was 26 (Supplementary Table S1).

The genome of T. reesei QM6a (PRJNA325840) was used to analyze the distribution of genes in ThIOC3844. This genome, which is composed of seven chromosomes with a total size of 34 Mb, was divided into 38 intervals (1 Mb) (Fig. 1). It was possible to observe CAZyme genes annotated in ThIOC3844 distributed throughout the whole genome. Only four intervals had no CAZyme genes, and when all the genes in the genomic regions of ThIOC3844 were mapped, genes were found in all intervals.

Figure 1.

Distribution of the T. harzianum IOC3844 genes on the 1 Mb intervals of the seven chromosomes of T. reesei QM6a. CAZyme genes of T. harzianum IOC3844 are in red, CAZymes genes of T. reesei are in blue, and all genes of T. harzianum IOC3844 are in green. Th: T. harzianum IOC3844; Tr: T. reeseiQM6a.

The genes were functionally annotated for the main gene ontologies: biological processes, cellular components and molecular functions (Fig. 2a and Supplementary Fig. S2). We found 209 sequences of hydrolytic activity, 139 related to transport proteins and 85 sequences involved in regulation of gene expression (possible TFs). In addition, a specific annotation was made for genes identified as enzymes, where hydrolases (40%), oxidoreductases (25%), transferases (22%), lyases (6%), ligases (4%) and isomerases (3%) (Figure 2b) were found. We also identified genes directly related to the degradation of CEL and hemicellulose, with action of α-L-arabinofuranosidase (EC 3.2.1.55), endo-1,4-β-xylanases (EC 3.2.1.8), cellobiohydrolases (3.2.1.91), endo-β-1,4-glucanase (EC 3.2.1.4) and β-glucosidase (EC 3.2.1.21) (Fig. 2c and Supplementary Table S5).

Figure 2.

Functional annotation of the genes predicted in the genomic regions of T. harzianum IOC3844. Annotation of genes for gene ontologies for biological processes, cellular components and molecular functions. (a) Distribution of enzymes annotated according to enzyme commission (b) and major enzyme commission (EC) related to cellulose and hemicellulose degradation (c).

A total of 1676 genes were predicted. Of these, 222 were annotated as CAZymes in ThIOC3844, including 45% of GHs, 23% of GTs, 10% of CEs, 8% of AAs and 14% of CBMs (Fig. 3 and Supplementary Table S6). The GH class presented with the highest number of families, including GH2 (3 genes), GH7 (1 gene), GH3 (9 genes), GH5 (6 genes), GH12 (1 gene), GH18 (4 genes) and GH62 (1 gene).

Figure 3.

CAZy classification of genes annotated in the genomic regions of T. harzianum IOC3844. GH: glycoside hydrolases; GT: glycosyl transferases; PLs: polysaccharide lyases; CEs: carbohydrate esterases; AA: auxiliary activities; CBM: carbohydrate-binding modules.

Genomic comparison

For this analysis, we compared the genomic regions of ThIOC3844 against the entire genome of different strains and species of the genus Trichoderma. Genomic comparison of the sequenced regions of ThIOC3844 with two other strains of the same species (T. harzianum B97 – ThB97 and T. harzianum – T6766) showed a higher similarity to ThB97 (99.25%) than ThT6766 (91.61%). For the T. atroviride IMI206040 genome (TaIMI206040), the similarity to ThIOC3844 was 85.09%. For T. virens Gv29-8 (TvGv29-8), the similarity was 86.55%, and for T. reesei QM6a (TrQM6a), the similarity was 85.11%.

When we compared syntenic genes between groups of genes, a greater difference between T. harzianum and T. atroviride and T. reesei was observed. The T. harzianum TR274 (ThTR274) strain presented the same gene profile of genomic organization as that found in ThIOC3844. In TaIMI206040, four genes (GH4, transporter and two GH26) from the cluster were not found; for TvGv29-8, two genes were not found (GH1 and GH4). For T. reesei QM6a, three genes (GH4 and two GH26) were not found; in addition, the translocation of genes (MFS x GH2 and TF2 x CLR2) was found. The genes for the transcription factor CLR2, putative transcription factor TF2 and MFS (major facilitator superfamily permease) were maintained in all species analyzed. This result suggests a potential association between the regulation and expression of these genes (Fig. 4).

Figure 4.

Comparison between the gene clusters of T. harzianum IOC3844 and those of other species of the genus Trichoderma spp. GH1: glycoside hydrolase 1; GH4: glycoside hydrolase 4; MFS: major facilitator superfamily permease; Trans: putative transporter; TF-1: putative transcription factor 1; GT38: glycosyl transferases 4; CBM18: carbohydrate-binding modules 18; TF-2: putative transcription factor 2; CLR2: cellulose regulator 2; GH2: glycoside hydrolase 2; GH26: glycoside hydrolase 26; Th: T. harzianum; Tv: T. virens; Ta: T. atroviride; Tr: T. virens.

Expression by RNA-Seq and secreted proteins

All genes predicted in the genomic regions were analyzed according to expression data by RNA-Seq (under CEL and GLU degradation conditions) (Supplementary Table S7) and secreted proteins identified by mass spectrometry (LC-MS/MS). We found 114 genes with differential expression under CEL degradation conditions when compared to GLU degradation conditions; among them, 51 were classified as CAZymes, such as beta-glucosidase of the GH1 family (1.8-fold change - FC), LPMOs of the AA9 family (FC 5.0) and hypothetical protein with domain CBM1 (FC 3.7). In addition, two differentially expressed TFs were identified, CLR2 (FC 1.6) and unidentified transcriptional regulator of zing finger – Zn2Cys6 (FC 2.3). Six transport proteins were also found (iron permease, MFS hexose transporter, siderophore transporter, ammonium permease, sugar transporter and siderophore iron transporter).

Among the genes annotated as CAZymes in ThIOC3844, 31 were found in the secretome of ThIOC3844 under CEL conditions, and the main families were GH3, GH12, CBM1, AA9, GH6/CBM1, GH45/CBM1, GH62 and GH5. In this analysis, we also used the level of expression of the secreted genes. The gene with the highest TPM index (1567.4 TPM) is a cellobiohydrolase (EC 3.2.1.91) of the GH6 family. However, our results indicate that genes with low expression levels are also important secreted enzymes (Table 1).

View this table:

Table 1.

Proteins identified in genomic and in the T. harzianum IOC3844 secretome under cellulose growth conditions.

CLR2 transcription factor

The phylogenic analysis of the CLR2 factor showed a clear separation of this TF in relation to Basidiomycetes and Ascomycetes (Fig. 5a and Supplementary Table S8). However, even within these groups, considerable phylogenetic diversity was observed among the species of analyzed fungi with a variety of clades within the same group. Different strains of T. harzianum grouped in a single clade with proximity to T. reesei and T. atroviride species. Our results show a wide range of functional variety for CLR2, which may indicate different types of performance between species.

Figure 5.

Molecular phylogeny of the CLR2 transcription factor in Ascomycota and Basidiomycota (a); in silico protein modeling for CLR2 in T. harzianum IOC3844 (b) and T. reesei QM6a (c).

A structural modeling analysis for the CLR2 protein of ThIOC3844 was performed using T. reesei as a comparator. For both proteins, the best template was 6F07 (Saccharomyces cerevisiae), with e-values of 4.07e^-06 and 6.62e^-06 for ThIOC3844 (Figure 5b) and T. reesei (Figure 5c), respectively. Prediction of 1 and 3 protein domains was made for ThIOC3844 and T. reesei, respectively. For ThIOC3844, 59% of the residues were already modeled, and for T. reesei, it was possible to model 83%. For ThIOC3844, the secondary structure prediction was 46% H (helix), 0% E (beta-sheet) and 53% C (loop), and for solvent access, it was 56% E (exposed), 19% M (medium) and 23% B (buried).

A coregulation network of genes directly related to the CLR2 regulator was constructed, searching for insights about other important proteins in the process of cellulase expression. We identified 36 genes directly linked to CLR2, of which 21 genes were annotated as hypothetical proteins. In addition, we found that genes with known annotations were related to the process of gene expression, including genes annotated as initiation factors, kinases and helicases (Fig. 6a and Supplementary Table S9).

Figure 6.

Subnetwork of CLR2 transcription factors and related genes (a) and network of induced (blue) and repressed (red) genes under cellulose conditions (b). CLR2: cellulose regulator 2; GH: glycoside hydrolases; GT: glycosyl transferases; AA: auxiliary activities.

Network of induced/repressed genes in cellulose

Using the gene expression data of the secreted proteins, a Bayesian network of induced/repressed genes was constructed based on the CEL growth conditions for T. harzianum IOC3844 (Fig. 6b). The major genes that were induced under this condition belong to the GH7 (exoglucanase), GH5 (endo-β-1,4-glucanase), GH3 (β-glucosidase), GH12 (murein transglycosylase), CE15 (CIP2), AA9 (LPMO) and AA8 (hypothetical protein) families. In addition, seven genes that were not classified as CAZymes were also induced under CEL conditions. The families of repressed genes were GH10 (glycoside hydrolase 10 family endo-1,4-β-xylanase), GH11 (glycoside hydrolase 11 family endo-1,4-β-xylanase), GH76 (alcohol dehydrogenase 1), GH20 (β-N-acetylhexosaminidase) and GH35 (glycoside hydrolase 35).

Discussion

In the present study, an integrative multi-omics approach was used to mine CAZyme-rich regions of ThIOC3884. BAC clones were selected, sequenced and used in comparative analyses focusing on the expression profile via RNA-Seq and the exoproteome under different fungal growth conditions, enabling the discovery of important gene/proteins related to plant biomass degradation (Supplementary Fig. S3).

The vast majority of important enzymes for the degradation of plant biomass are already known [25–27]. The current challenge is how enzymes are regulated and the genetic mechanism of their activation. Thus, many works with cellulolytic fungi have focused on TFs, accessory enzymes, transporters and the way the type of biomass affects the process of regulating the cellulases and hemicellulases [22, 28–30]. Other studies have already shown the potential of T. harzianum for the degradation of plant biomass. This is the first work that integrates results from different biotechnology approaches and that focuses on the prediction of the most important enzymes and TFs used by T. harzianum IOC3844 to degrade CEL.

The molecular process of CEL degradation is extremely complex and involves hydrolytic enzymes acting on the extracellular medium, carrier proteins and TFs (Figure 7). For T. harzianum and T. reesei, the major CAZy families related to CEL degradation were identified in the genome (GH1, GH3, GH6, GH7, GH12, GH45 and AA9) [7], and many of the cellulases have already had their three-dimensional structure solved; however, many key proteins in this process are not well known as transporter TFs related to the regulation of these enzymes.

Figure 7.

Molecular scheme of the enzymatic model in the degradation of cellulose in Trichoderma spp. Enzymes and PDB code: beta-glucosidase (5BWF), cellobiohydrolase I (2YOK), cellobiohydrolase II (1CB2), endoglucanase 3 (4H7M), copper-dependent lytic polysaccharide mono-oxygenases (LPMOs) (5O2W).

The study of genomic regions is an important tool for providing a global view of the important genes and regulatory regions of a genome [24, 31]. The genomes of a few strains of T. harzianum are available [32, 33]. A complete genome draft sequenced in 1572 scaffolds is available for T. harzianum T6776 [32]; however, little is known about the ThIOC3844 genome, and as it is a strain with potential for hydrolytic enzymes, more genomic information regarding CAZyme sequences is needed. In this study, our strategy was to use large genomic regions and integrate these data with other genetic information.

A large number of fungal genomes have already been used as a platform to search for new genes related to the degradation of biomass, as is the case for T. reesei QM6a, which has a finalized genome divided into seven chromosomes [34]. Our study results with the genomic regions of ThIOC3844 showed a large number of enzymes classified as CAZymes, as well as TFs and transporters in clusters in the genome, which may be important for future studies of genetic modification of this lineage.

Analyzing the level of expression of certain genes under certain conditions is an important step in understanding how transcription is affected in a specific biological condition [17, 35]; however, there is not always a direct relationship between what is being highly expressed and the proteins that are important in the extracellular medium. Thus, in this work, in addition to studying the most expressed genes that we found in the genomic regions, we also searched for those with a confirmed presence in the fungus secretome CEL degradation conditions. Our results showed that CAZy families are key in the degradation of CEL, with a high level of expression and a positive presence as a secreted protein.

Genomic comparison is a powerful tool for understanding differences and evolutionary dynamics among related species [36–38]. Our data show a high similarity between different strains of T. harzianum (IOC3844, B97 and T6776), which indicates that differences in enzyme production and efficiency may be related more to gene regulation mechanisms than differences in the sequence itself. In addition, by synteny analysis, it was possible to observe a greater difference in relation to the genome of T. reesei, which can be explained by the loss of genes and genomic modifications carried out in lineages of this fungus to increase its productivities of enzymes related to plant biomass degradation [12, 39].

The CLR2 transcription factor was described as an important regulator in the expression of cellulases by Neurospora crassa [22]; however, its functional role is not yet clear for fungi of the genus Trichoderma, including T. reesei [14, 40]. In the genome of ThIOC3844, we found a cluster with the CLR2 TF in association with other putative transcription factors, CAZymes, transporters and MFS permease. The same behavior was found for the T. reesei CLR2 TF, which has physical proximity and coexpression with a sugar transporter [29, 41]. These results indicate that there may be a mechanism for the joint regulation and expression of this TF with transporters related to biomass degradation. Based on RNA-Seq data, we observed differential expression of CLR2 in the cellulose condition. In this way, we analyzed the coregulation network of the CLR2 regulator. The present study illuminates unclear areas of the genomic organization, expression and putative regulation of CLR2 in T. harzianum.

Coregulation networks provide insights into how genes correlate and interact with each other [35, 42, 43]. We identified 36 genes directly associated with the CLR2 regulatory factor; these genes may be important in the regulation process of this factor, which is linked to the expression of cellulases in other filamentous fungi. Techniques such as gene knockout can further validate the functional or synergistic importance of these genes with key TFs for the expression of genes related to degradation of plant biomass.

Conclusions

Our results present an innovative approach in using different types of omics data to search for new important genes and genetic regulation mechanisms during the process of CEL degradation. We found several TFs, accessory enzymes and transporters in the genomic regions of ThIOC3844 that may be important for the expression/secretion of CAZyme genes. Among these, CLR2, CIP2 and LPMOs are promising candidates for further study. Our results indicate that the CRL2 regulator matches all the requirements for involvement in cellulose degradation by T. harzianum. In addition, through the approach of coregulation networks, it is possible to understand the relationship between genes and to find new targets for biochemical characterization. The results allowed the identification of important genetic regions, key genes and functional proteins, and this information can be used for the development and improvement of enzymatic hydrolysis technology for the bioethanol industry.

Methods

T. harzianum strain and genomic resources

T. harzianum IOC3844 (ThIOC3844) was obtained from the Brazilian Collection of Environment and Industrial Microorganisms (CBMAI). A library of BACs consisting of 5,760 clones previously constructed for this fungus strain [24] was used to search for genomic regions. The genomic sequences of T. harzianum T6776 (PRJNA252551), T. reesei QM6a (PRJNA325840), T. atroviride IMI206040 (PRJNA19867) and T. virens Gv29-8 (PRJNA19983) were used for comparison with ThIOC3844.

BAC library screening for gene selection in T. harzianum IOC3844

We designed primers for 62 target CAZyme genes (Supplementary Table S1) using transcriptome data [3] to search for positive BAC clones that contain genes previously selected from the plate (with the complete BAC library comprising fifteen 384 plaques) and column pools (24 columns of each plate). The plate and column pools were amplified using the Illustra GenomiPhi HY DNA Amplification Kit (GE Healthcare Life Sciences, UK) following the manufacturer’s instructions. The screening reactions for the search for positive clones were performed via PCR using the CFX384 Touch Real-Time PCR Detection System (Bio-Rad).

Single-molecule real-time (SMRT) sequencing and assembly

Libraries for sequencing were prepared according to the Pacific Biosciences (PacBio) protocol, and sequencing was performed at the Arizona Genomics Institute (AGI; Tucson, USA) using a Single-Molecule Real-Time (SMRT) DNA sequencing system available from PacBio. De novo assembly was performed with the PacBio Corrected Reads (PBcR) pipeline implemented as part of Wgs-assembler v8.3rc2 [44] and Celera Assembler [45]. The contigs obtained with the assemblers were subjected to error correction with pbalign (v0.2). The PacBio reads were aligned using the BLASR algorithm [46], and assembly polishing was performed with the Quiver tool (Supplementary Table S2 and S3) [47].

Gene prediction and functional annotation

The FGENESH tool was used for initial gene prediction analysis [48], followed by manual correction with the T. harzianum T6776 and T. reesei QM6a gene models. Annotations of the ontologies were performed with Blast2GO [49]. InterPro protein domains were predicted using InterProScan (http://www.ebi.ac.uk/interpro/) [50]. Information derived from the CAZy database was downloaded for each CAZyme family (www.cazy.org). The protein sequences of T. harzianum IOC3844 were used as queries in basic local alignment search tool (BLASTp) searches against the locally built CAZyme BLAST database. Only BLAST matches showing an e-value less than 10^-11, identity greater than 30% and queries covering greater than 70% of the sequence length were retained and classified according to the CAZyme catalytic group as glycoside hydrolases (GHs), glycosyl transferases (GTs), polysaccharide lyases (PLs), carbohydrate esterases (CEs), carbohydrate-binding modules (CBMs) or auxiliary activities (AAs).

Genomic comparison in Trichoderma spp

The software used for alignment was Nucmer (-maxmatch), which is part of the software package MUMmmer 3.23 [51]. The delta-filter (-q), show-coords (-rcl), and DNADIFF (standard parameters) were used for filtering, obtaining the mapping coordinates and generating the statistical report in the alignment, respectively. SimpleSynteny software (https://www.dveltri.com/simplesynteny/) [52] was used to compare a cluster of 12 genes among different species of Trichoderma spp.

Phylogenetic analysis and structure modeling of CLR2

The CLR2 sequences of ThIOC3844, T. reesei QM6a, T. atroviride, T. virens and other species of fungi were used as the basis for constructing the phylogenetic trees. These fungi were divided into Ascomycetes and Basidiomycetes. The sequences were aligned using ClustalW [53] and analyzed with Molecular Evolutionary Genetics Analysis (MEGA) software v7.0 (https://www.megasoftware.net/) [54]. The phylogenetic analyses were performed in MEGA7 using the maximum likelihood (ML) [55] method of inference based on the Jones-Taylor-Thornton (JTT) matrix-based model and 1000 bootstrap replicates [56] for each analysis. Pairwise deletion was employed to address alignment gaps and missing data. The trees were visualized and edited using the FigTree program (http://tree.bio.ed.ac.uk/software/figtree/). In silico modeling of the domain of CLR2 was performed using RaptorX protein structure prediction software (http://raptorx.uchicago.edu/) [57].

RNA-Seq and exoproteome analysis

The expression levels of ThIOC3844 were analyzed using RNA-Seq data (PRJNA336221) obtained from a previous study in which the transcripts were obtained following growth of the fungus on two different carbon sources, CEL and GLU [35]. The reads from the RNA-Seq library were mapped against the ThIOC3844 genes using the CLC Genomics Workbench (https://www.qiagenbioinformatics.com/products/clc-genomics-workbench/) [58]. The expression values were expressed in reads per kilobase of exon model per million mapped reads (RPKM), and the normalized value for each sample was calculated in transcripts per million (TPM). For the analysis of differential expression, the following parameters were used: fold change greater than or equal to 1.5 and p-value lower than 0.05. The analysis of the exoproteome was performed by means of a BLASTn search of the predicted gene of ThIOC3844 against the local database of protein sequences from T. harzianum found in the extract of fungal growth under CEL and GLU conditions.

Gene regulatory network

The gene regulatory networks were assembled from the reference mapped RNA-Seq data using each set of biological triplicates for the CEL and GLU conditions [35]. The interaction between the genes was obtained by calculating Pearson’s correlation for each pair of genes. The induction and repression networks were constructed based on the expression data of a set of genes that were identified in the secretome of the CEL growth condition by the Bayesian inference method [59]. If the secreted protein was present in the condition, it was assigned a value of one. If the secreted protein was absent, it was assigned a value of zero. The treatment conditions were considered as regulators of the network to detect the direct relationships between the conditions and the genes. Thus, the Bayesian network represents the relationships among the conditions, gene expression, and secreted proteins. Cytoscape software v 3.4.042 [60] (https://cytoscape.org/) was used for data analysis and construction of the CLR2 subnetwork.

Additional files

Additional file 1: Fig S1. Screening genes of interest in the genomic library of T. harzianum IOC3844 by qPCR (a); reads size sequenced using PACBio technology (b); genes cluster in a genomic region of T. harzianum (c). Fig. S2. Distribution of the main GO terms of the annotated genes in T. harzianum IOC3844. Fig. S3. Pipeline approach for the analyzes used in this work of genes and genomic study in T. harzianum. Supplementary Table S2. Assembly parameters of a set of sequenced genomic region using PACBio technology. Supplementary Table S3. Comparison of genomic data among different species of Trichoderma spp. Supplementary Table S8. Description of the species used for the phylogenetic analysis of the transcription factor CLR2. Supplementary Table S9. Description of the genes found in the coregulation networks.

Additional file 2: Supplementary Table S1. Description of the genomic regions sequenced in T. harzianum IOC3844.

Additional file 3: Supplementary Table S4. Annotation of all genes predicted in T. harzianum IOC3844.

Additional file 4: Supplementary Table S5. Description of the EC codes for T. harzianum IOC3844 genes.

Additional file 5: Supplementary Table S6. Description of the CAZymes genes for T. harzianum IOC3844.

Additional file 6: Supplementary Table S7. Level of expression of the genes annotated in T. harzianum IOC3844 by means of RNA-seq.

Declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Availability of data and materials

The RNA-seq data can be accessed by the accession number PRJNA336221. Data from the genomic regions were submitted to GenBank (https://www.ncbi.nlm.nih.gov/genbank/) under the accession numbers MK861589-MK861650 (Supplementary Table S1).

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding

This work was supported by grants from the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP 2015/09202-0), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) Computational Biology Program (CBP) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). JAFF received a PhD fellowship from CNPq (170565/2017-3) and a PD fellowship from CAPES (CBP – 88887.334235/2019-00); MACH received a PD fellowship from CAPES (CBP) and a SWE PD fellowship from CAPES, Computational Biology Program; CAS received a PD fellowship from FAPESP (2016/19775-0) and a SWE PD fellowship from CAPES (CBP); DAA received partial MS fellowship from FAPESP (17/17782-2) and partial MS fellowship from CAPES (CBP); NFM received a PD fellowship from CNPq and CAPES (CBP); JSM received a PD fellowship from CNPq and CAPES (CBP); CBCS received a PD fellowship from FAPESP (17/26781-0) and CAPES (CBP); and APS is the recipient of a research fellowship from CNPq.

Authors ‘ contributions

APS and JAFF designed the study. JAFF, MACH, CAS, DAA, JSM, DAS and AC performed the research. JAFF, MACH, CAS, DAA, NFM and CBCS analyzed the data. JAFF, MACH, CAS and APS wrote the paper. All authors critically read the text and approved the manuscript.

Acknowledgements

We would like to acknowledge the funding from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP 2015/09202-0), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES, Computational Biology Program) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).

List of abbreviations

AA: Auxiliary enzymes;
B: buried;
BAC: bacterial artificial chromosome;
BLAST: Basic local alignment search tool;
bp: Base pair;
BRENDA: Braunschweig Enzyme Database;
C: loop;
CAZymes: Carbohydrate-active enzymes;
CBMAI: Brazilian Collection of Environment and Industrial Microorganisms;
CBM: Carbohydrate-binding module;
CE: Carbohydrate esterases;
CEL: Cellulose;
CIP1: cellulose-induced protein 1;
CIP2: cellulose-induced protein 2;
CLR2: cellulose degradation regulator 2;
DNA: Deoxyribonucleic acid;
E: beta-sheet;
EC: Enzyme commission number;
Ex: exposed;
FC: fold change;
GH: Glycoside hydrolases;
GLU: Glucose;
GO: gene ontologies;
GT: Glycosyltransferases;
H: helix;
JTT: Jones-Taylor-Thornton;
kb: Kilobases;
LPMO: Lytic polysaccharides monooxygenase;
M: medium;
Mb: Megabase;
MEGA: Molecular evolutionary genetics analysis;
MFS: major facilitator superfamily permease;
ML: maximum likelihood;
PacBio: Pacific Biosciences;
PBcR: PacBio Corrected Reads;
PCR: Polymerase chain reaction;
PL: Polysaccharide lyases;
RNA: Ribonucleic acid;
RNA-Seq: RNA sequencing;
RPKM: Reads per kilobase of exon model per million mapped reads;
SMRT: Single-Molecule Real-Time;
TaIMI206040: T. atroviride IMI206040;
TFs: transcription factors;
ThB97: T. harzianum B97;
ThIOC3844: Trichoderma harzianum IOC-3844;
ThTR274: T. harzianum TR274;
Th6766: T. harzianum;
TPM: Transcripts per million;
TrQM6a: T. reesei QM6a;
TvGv29-8: T. virens Gv29-8

References

1.↵
Elad Y, Chet I, Katan J. Trichoderma harzianum: a biocontrol agent effective against Sclerotium rolfsii and Rhizoctonia solani. Phytopathology. 1980;70:119–21.
OpenUrl
2.↵
Delabona PDS, Farinas CS, da Silva MR, Azzoni SF, Pradella JGC. Use of a new Trichoderma harzianum strain isolated from the Amazon rainforest with pretreated sugar cane bagasse for on-site cellulase production. Bioresour Technol. 2012;107:517–21.
OpenUrl CrossRef PubMed
3.↵
Horta MA, Vicentini R, Pda SD, Laborda P, Crucello A, Freitas S, et al. Transcriptome profile of Trichoderma harzianum IOC-3844 induced by sugarcane bagasse. PLoS One. 2014;9:e88689.
4.↵
Delabona PDS, Rodrigues GN, Zubieta MP, Ramoni J, Codima CA, Lima DJ, et al. The relation between xyr1 overexpression in Trichoderma harzianum and sugarcane bagasse saccharification performance. J Biotechnol. 2017;246:24–32.
OpenUrl
5.↵
de Castro AM, Pedro KCNR, da Cruz JC, Ferreira MC, Leite SGF, Pereira N. Trichoderma harzianum IOC-4038: a promising strain for the production of a cellulolytic complex with significant-glucosidase activity from sugarcane bagasse cellulignin. Appl Biochem Biotechnol. 2010;162:2111–22.
OpenUrl PubMed
6.
Santos CA, Zanphorlin LM, Crucello A, Tonoli CC, Ruller R, Horta MA, et al. Crystal structure and biochemical characterization of the recombinant ThBgl, a GH1 β-glucosidase overexpressed in Trichoderma harzianum under biomass degradation conditions. Biotechnol Biofuels. 2016;9:71.
OpenUrl
7.↵
Filho JAF, Horta MAC, Beloti LL, dos Santos CA, de Souza AP. Carbohydrate-active enzymes in Trichoderma harzianum: a bioinformatic analysis bioprospecting for key enzymes for the biofuels industry. BMC Genomics. 2017;18:779.
OpenUrl
8.
Polikarpov I. Structure and function of enzymes and auxiliary proteins from Trichoderma, active in cell-wall hydrolysis/center of biological and industrial process for biofuels – CeProBIO. São Carlos, SP: IFSC/USP; 2017.
9.
Lopez-Ramirez N, Volke-Sepulveda T, Gaime-Perraud I, Saucedo-Castañeda G, Favela-Torres E. Effect of stirring on growth and cellulolytic enzymes production by Trichoderma harzianum in a novel bench-scale solid-state fermentation bioreactor. Bioresour Technol. 2018;265:291–8.
OpenUrl
10.↵
Santos CA, Morais MAB, Terrett OM, Lyczakowski JJ, Zanphorlin LM, Ferreira-Filho JA, et al. An engineered GH1 beta-glucosidase displays enhanced glucose tolerance and increased sugar release from lignocellulosic materials. Sci Rep. 2019;9:4903.
OpenUrl
11.↵
Bischof RH, Ramoni J, Seiboth B. Cellulases and beyond: the first 70 years of the enzyme producer Trichoderma reesei. Microb Cell Fact. 2016;15:106.
OpenUrl CrossRef
12.↵
Martinez D, Berka RM, Henrissat B, Saloheimo M, Arvas M, Baker SE, et al. Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei(syn. Hypocrea jecorina). Nat Biotechnol. 2008;26:553–60.
OpenUrl CrossRef PubMed Web of Science
13.
Jun H, Kieselbach T, Jönsson LJ. Enzyme production by filamentous fungi: analysis of the secretome of Trichoderma reesei grown on unconventional carbon source. Microb Cell Fact. 2011;10:68.
OpenUrl CrossRef PubMed
14.↵
Häkkinen M, Arvas M, Oja M, Aro N, Penttilä M, Saloheimo M, et al. Re-annotation of the CAZy genes of Trichoderma reesei and transcription in the presence of lignocellulosic substrates. Microb Cell Fact. 2012;11:134.
OpenUrl CrossRef PubMed
15.
Alvira P, Gyalai-Korpos M, Barta Z, Oliva JM, Réczey K, Ballesteros M. Production and hydrolytic efficiency of enzymes from Trichoderma reese iRUTC30 using steam pretreated wheat straw as carbon source. J Chem Technol Biotechnol. 2013;88:1150–6.
OpenUrl
16.↵
Li Y-H, Zhang X-Y, Zhang F, Peng L-C, Zhang D-B, Kondo A, et al. Optimization of cellulolytic enzyme components through engineering Trichoderma reesei and on-site fermentation using the soluble inducer for cellulosic ethanol production from corn stover. Biotechnol Biofuels. 2018;11:49.
OpenUrl
17.↵
Bischof R, Fourtis L, Limbeck A, Gamauf C, Seiboth B, Kubicek CP. Comparative analysis of the Trichoderma reesei transcriptome during growth on the cellulase inducing substrates wheat straw and lactose. Biotechnol Biofuels. 2013;6:127.
OpenUrl CrossRef
18.
Gupta VK, Steindorff AS, de Paula RG, Silva-Rocha R, Mach-Aigner AR, Mach RL, et al. The post-genomic era of Trichoderma reesei: what’s next? Trends Biotechnol. 2016;34:970–82.
OpenUrl
19.
Santos CA, Ferreira-Filho JA, O’Donovan A, Gupta VK, Tuohy MG, Souza AP. Production of a recombinant swollenin from Trichoderma harzianum in Escherichia coli and its potential synergistic role in biomass degradation. Microb Cell Fact. 2017;16:83.
OpenUrl
20.↵
Villares A, Moreau C, Bennati-Granier C, Garajova S, Foucat L, Falourd X, et al. Lytic polysaccharide monooxygenases disrupt the cellulose fibers structure. Sci Rep. 2017;7:40262.
OpenUrl CrossRef
21.↵
Benocci T, Aguilar-Pontes MV, Zhou M, Seiboth B, de Vries RP. Regulators of plant biomass degradation in Ascomycetous fungi. Biotechnol Biofuels. 2017;10:152.
OpenUrl CrossRef
22.↵
Coradetti ST, Craig JP, Xiong Y, Shock T, Tian C, Glass NL. Conserved and essential transcription factors for cellulase gene expression in ascomycete Fungi. Proc Natl Acad Sci U S A. 2012;109:7397–402.
OpenUrl Abstract/FREE Full Text
23.↵
Liu F, Xue Y, Liu J, Gan L, Long M. ACE3 as a master transcriptional factor regulates cellulase and xylanase production in Trichoderma orientalis EU7-22. BioResources. 2018;13:6790–801.
OpenUrl
24.↵
Crucello A, Sforça DA, Horta MAC, dos Santos CA, Viana AJC, Beloti LL, et al. Analysis of genomic regions of Trichoderma harzianum IOC-3844 related to biomass degradation. PLoS One. 2015;10:e0122122.
OpenUrl
25.↵
Himmel ME, Ding S-Y, Johnson DK, Adney WS, Nimlos MR, Brady JW, et al. Biomass recalcitrance: engineering plants and enzymes for biofuels production. Science. 2007;315:804–7.
OpenUrl Abstract/FREE Full Text
26.
van den Brink J, de Vries RP. Fungal enzyme sets for plant polysaccharide degradation. Appl Microbiol Biotechnol. 2011;91:1477–92.
OpenUrl CrossRef PubMed Web of Science
27.↵
Van Dyk JS, Pletschke BI. A review of lignocellulose bioconversion using enzymatic hydrolysis and synergistic cooperation between enzymes—Factors affecting enzymes, conversion and synergy. Biotechnol Adv. 2012;30:1458–80.
OpenUrl CrossRef PubMed
28.↵
Nitta M, Furukawa T, Shida Y, Mori K, Kuhara S, Morikawa Y, et al. A new Zn (II) 2Cys6-type transcription factor BglR regulates-glucosidase expression in Trichoderma reesei. Fungal Genet Biol. 2012;49:388–97.
OpenUrl CrossRef PubMed
29.↵
Häkkinen M, Valkonen MJ, Westerholm-Parvinen A, Aro N, Arvas M, Vitikainen M, et al. Screening of candidate regulators for cellulase and hemicellulase production in Trichoderma reesei and identification of a factor essential for cellulase production. Biotechnol Biofuels. 2014;7:14.
OpenUrl CrossRef PubMed
30.↵
Westereng B, Loose JS, Vaaje-Kolstad G, Aachmann FL, Sørlie M, Eijsink VG. Analytical tools for characterizing cellulose-active lytic polysaccharide monooxygenases (LPMOs). Methods Mol Biol. 2018;1796:219–46.
OpenUrl
31.↵
Toyotome T, Hamada S, Yamaguchi S, Takahashi H, Kondoh D, Takino M, et al. Comparative genome analysis of Aspergillus flavus clinically isolated in Japan. DNA Res. 2018;26:95–103.
OpenUrl
32.↵
Baroncelli R, Piaggeschi G, Fiorini L, Bertolini E, Zapparata A, Pè ME, et al. Draft whole-genome sequence of the biocontrol agent Trichoderma harzianum T6776. Genome Announc. 2015;3:e00647–15.
OpenUrl
33.↵
Kubicek CP, Steindorff AS, Chenthamara K, Manganiello G, Henrissat B, Zhang J, et al. Evolution and comparative genomics of the most common Trichoderma species. BMC Genomics. 2019;20:485.
OpenUrl
34.↵
Li W-C, Huang C-H, Chen C-L, Chuang Y-C, Tung S-Y, Wang T-F. Trichoderma reesei complete genome sequence, repeat-induced point mutation, and partitioning of CAZyme gene clusters. Biotechnol Biofuels. 2017;10:170.
OpenUrl CrossRef
35.↵
Horta MAC, Filho JAF, Murad NF, Santos EO, dos Santos CA, Mendes JS, et al. Network of proteins, enzymes and genes linked to biomass degradation shared by Trichoderma species. Sci Rep. 2018;8:1341.
OpenUrl CrossRef
36.↵
Kuan CS, Yew SM, Toh YF, Chan CL, Ngeow YF, Lee KW, et al. Dissecting the fungal biology of Bipolaris papendorfii: from phylogenetic to comparative genomic analysis. DNA Res. 2015;22:219–32.
OpenUrl CrossRef PubMed
37.
Haitjema CH, Gilmore SP, Henske JK, Solomon KV, de Groot R, Kuo A, et al. A parts list for fungal cellulosomes revealed by comparative genomics. Nat Microbiol. 2017;2:17087.
OpenUrl
38.↵
Wang R, Dong L, He R, Wang Q, Chen Y, Qu L, et al. Comparative genomic analyses reveal the features for adaptation to nematodes in Fungi. DNA Res. 2018;25:245–56.
OpenUrl
39.↵
Xie B-B, Qin Q-L, Shi M, Chen L-L, Shu Y-L, Luo Y, et al. Comparative genomics provide insights into evolution of Trichoderma nutrition style. Genome Biol Evol. 2014;6:379–90.
OpenUrl CrossRef PubMed
40.↵
Hassan L, Lin L, Sorek H, Sperl LE, Goudoulas T, Hagn F, et al. Crosstalk of cellulose and mannan perception pathways leads to inhibition of cellulase production in several Filamentous fungi. MBio. 2019;10:e00277–19.
OpenUrl
41.↵
Ivanova C, Ramoni J, Aouam T, Frischmann A, Seiboth B, Baker SE, et al. Genome sequencing and transcriptome analysis of Trichoderma reesei QM9978 strain reveals a distal chromosome translocation to be responsible for loss of vib1 expression and loss of cellulase induction. Biotechnol Biofuels. 2017;10:209.
OpenUrl
42.↵
Lawler K, Hammond-Kosack K, Brazma A, Coulson RM. Genomic clustering and co-regulation of transcriptional networks in the pathogenic fungus Fusarium graminearum. BMC Syst Biol. 2013;7:52.
OpenUrl CrossRef
43.↵
Castro LDS, Pedersoli WR, Antoniêto ACC, Steindorff AS, Silva-Rocha R, Martinez-Rossi NM, et al. Comparative metabolism of cellulose, sophorose and glucose in Trichoderma reesei using high-throughput genomic and proteomic analyses. Biotechnol Biofuels. 2014;7:41.
OpenUrl CrossRef PubMed
44.↵
Berlin K, Koren S, Chin C-S, Drake JP, Landolin JM, Phillippy AM. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol. 2015;33:623–30.
OpenUrl CrossRef PubMed
45.↵
Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ, et al. A whole-genome assembly of Drosophila. Science. 2000;287:2196–204.
OpenUrl Abstract/FREE Full Text
46.↵
Chaisson MJ, Tesler G. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinformatics. 2012;13:238.
OpenUrl CrossRef PubMed
47.↵
Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10:563–9.
OpenUrl CrossRef PubMed Web of Science
48.↵
Salamov AA, Solovyev VV. Ab initio gene finding in Drosophila genomic DNA. Genome Res. 2000;10:516–22.
OpenUrl Abstract/FREE Full Text
49.↵
Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–6.
OpenUrl CrossRef PubMed Web of Science
50.↵
Mitchell A, Chang H-Y, Daugherty L, Fraser M, Hunter S, Lopez R, et al. The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res. 2015;43:D213–21.
OpenUrl CrossRef PubMed
51.↵
Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12.
OpenUrl CrossRef PubMed
52.↵
Veltri D, Wight MM, Crouch JA. Simple synteny: a web-based tool for visualization of microsynteny across multiple species. Nucleic Acids Res. 2016;44:W41–5.
OpenUrl CrossRef PubMed
53.↵
Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–80.
OpenUrl CrossRef PubMed Web of Science
54.↵
Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33:1870–4.
OpenUrl CrossRef PubMed
55.↵
Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8:275–82.
OpenUrl CrossRef PubMed
56.↵
Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–91.
OpenUrl CrossRef GeoRef PubMed Web of Science
57.↵
1. Kihara D
Källberg M, Margaryan G, Wang S, Ma J, Xu J. RaptorX server: a resource for template-based protein structure modeling. In: Kihara D, editor. Protein structure prediction. New York, NY: Springer; 2014. p. 17–27.
58.↵
CLC Genomics Workbench 9.0. Qiagen (Aarhus A/S). Manual for CLC genomics workbench 9.0 windows, Mac OS X and Linux Denmark. 2016.
59.↵
Wilczynski B, Dojer N. BNFinder: exact and efficient method for learning bayesian networks. Bioinformatics. 2009;25:286–7.
OpenUrl CrossRef PubMed Web of Science
60.↵
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–504.
OpenUrl Abstract/FREE Full Text

View the discussion thread.

Posted August 21, 2019.

Download PDF

Supplementary Material

Citation Tools

Subject Area

Genomics

Subject Areas

All Articles

Animal Behavior and Cognition (5217)
Biochemistry (11755)
Bioengineering (8757)
Bioinformatics (29209)
Biophysics (14980)
Cancer Biology (12103)
Cell Biology (17416)
Clinical Trials (138)
Developmental Biology (9425)
Ecology (14187)
Epidemiology (2067)
Evolutionary Biology (18314)
Genetics (12246)
Genomics (16807)
Immunology (11870)
Microbiology (28101)
Molecular Biology (11599)
Neuroscience (60995)
Paleontology (452)
Pathology (1872)
Pharmacology and Toxicology (3238)
Physiology (4961)
Plant Biology (10429)
Scientific Communication and Education (1683)
Synthetic Biology (2887)
Systems Biology (7341)
Zoology (1651)

[1] 1.↵
Elad Y, Chet I, Katan J. Trichoderma harzianum: a biocontrol agent effective against Sclerotium rolfsii and Rhizoctonia solani. Phytopathology. 1980;70:119–21.
OpenUrl

[2] 2.↵
Delabona PDS, Farinas CS, da Silva MR, Azzoni SF, Pradella JGC. Use of a new Trichoderma harzianum strain isolated from the Amazon rainforest with pretreated sugar cane bagasse for on-site cellulase production. Bioresour Technol. 2012;107:517–21.
OpenUrl CrossRef PubMed

[3] 3.↵
Horta MA, Vicentini R, Pda SD, Laborda P, Crucello A, Freitas S, et al. Transcriptome profile of Trichoderma harzianum IOC-3844 induced by sugarcane bagasse. PLoS One. 2014;9:e88689.

[4] 4.↵
Delabona PDS, Rodrigues GN, Zubieta MP, Ramoni J, Codima CA, Lima DJ, et al. The relation between xyr1 overexpression in Trichoderma harzianum and sugarcane bagasse saccharification performance. J Biotechnol. 2017;246:24–32.
OpenUrl

[5] 5.↵
de Castro AM, Pedro KCNR, da Cruz JC, Ferreira MC, Leite SGF, Pereira N. Trichoderma harzianum IOC-4038: a promising strain for the production of a cellulolytic complex with significant-glucosidase activity from sugarcane bagasse cellulignin. Appl Biochem Biotechnol. 2010;162:2111–22.
OpenUrl PubMed

[6] 6.
Santos CA, Zanphorlin LM, Crucello A, Tonoli CC, Ruller R, Horta MA, et al. Crystal structure and biochemical characterization of the recombinant ThBgl, a GH1 β-glucosidase overexpressed in Trichoderma harzianum under biomass degradation conditions. Biotechnol Biofuels. 2016;9:71.
OpenUrl

[7] 7.↵
Filho JAF, Horta MAC, Beloti LL, dos Santos CA, de Souza AP. Carbohydrate-active enzymes in Trichoderma harzianum: a bioinformatic analysis bioprospecting for key enzymes for the biofuels industry. BMC Genomics. 2017;18:779.
OpenUrl

[8] 8.
Polikarpov I. Structure and function of enzymes and auxiliary proteins from Trichoderma, active in cell-wall hydrolysis/center of biological and industrial process for biofuels – CeProBIO. São Carlos, SP: IFSC/USP; 2017.

[9] 9.
Lopez-Ramirez N, Volke-Sepulveda T, Gaime-Perraud I, Saucedo-Castañeda G, Favela-Torres E. Effect of stirring on growth and cellulolytic enzymes production by Trichoderma harzianum in a novel bench-scale solid-state fermentation bioreactor. Bioresour Technol. 2018;265:291–8.
OpenUrl

[10] 10.↵
Santos CA, Morais MAB, Terrett OM, Lyczakowski JJ, Zanphorlin LM, Ferreira-Filho JA, et al. An engineered GH1 beta-glucosidase displays enhanced glucose tolerance and increased sugar release from lignocellulosic materials. Sci Rep. 2019;9:4903.
OpenUrl

[11] 11.↵
Bischof RH, Ramoni J, Seiboth B. Cellulases and beyond: the first 70 years of the enzyme producer Trichoderma reesei. Microb Cell Fact. 2016;15:106.
OpenUrl CrossRef

[12] 12.↵
Martinez D, Berka RM, Henrissat B, Saloheimo M, Arvas M, Baker SE, et al. Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei(syn. Hypocrea jecorina). Nat Biotechnol. 2008;26:553–60.
OpenUrl CrossRef PubMed Web of Science

[13] 13.
Jun H, Kieselbach T, Jönsson LJ. Enzyme production by filamentous fungi: analysis of the secretome of Trichoderma reesei grown on unconventional carbon source. Microb Cell Fact. 2011;10:68.
OpenUrl CrossRef PubMed

[14] 14.↵
Häkkinen M, Arvas M, Oja M, Aro N, Penttilä M, Saloheimo M, et al. Re-annotation of the CAZy genes of Trichoderma reesei and transcription in the presence of lignocellulosic substrates. Microb Cell Fact. 2012;11:134.
OpenUrl CrossRef PubMed

[15] 15.
Alvira P, Gyalai-Korpos M, Barta Z, Oliva JM, Réczey K, Ballesteros M. Production and hydrolytic efficiency of enzymes from Trichoderma reese iRUTC30 using steam pretreated wheat straw as carbon source. J Chem Technol Biotechnol. 2013;88:1150–6.
OpenUrl

[16] 16.↵
Li Y-H, Zhang X-Y, Zhang F, Peng L-C, Zhang D-B, Kondo A, et al. Optimization of cellulolytic enzyme components through engineering Trichoderma reesei and on-site fermentation using the soluble inducer for cellulosic ethanol production from corn stover. Biotechnol Biofuels. 2018;11:49.
OpenUrl

[17] 17.↵
Bischof R, Fourtis L, Limbeck A, Gamauf C, Seiboth B, Kubicek CP. Comparative analysis of the Trichoderma reesei transcriptome during growth on the cellulase inducing substrates wheat straw and lactose. Biotechnol Biofuels. 2013;6:127.
OpenUrl CrossRef

[18] 18.
Gupta VK, Steindorff AS, de Paula RG, Silva-Rocha R, Mach-Aigner AR, Mach RL, et al. The post-genomic era of Trichoderma reesei: what’s next? Trends Biotechnol. 2016;34:970–82.
OpenUrl

[19] 19.
Santos CA, Ferreira-Filho JA, O’Donovan A, Gupta VK, Tuohy MG, Souza AP. Production of a recombinant swollenin from Trichoderma harzianum in Escherichia coli and its potential synergistic role in biomass degradation. Microb Cell Fact. 2017;16:83.
OpenUrl

[20] 20.↵
Villares A, Moreau C, Bennati-Granier C, Garajova S, Foucat L, Falourd X, et al. Lytic polysaccharide monooxygenases disrupt the cellulose fibers structure. Sci Rep. 2017;7:40262.
OpenUrl CrossRef

[21] 21.↵
Benocci T, Aguilar-Pontes MV, Zhou M, Seiboth B, de Vries RP. Regulators of plant biomass degradation in Ascomycetous fungi. Biotechnol Biofuels. 2017;10:152.
OpenUrl CrossRef

[22] 22.↵
Coradetti ST, Craig JP, Xiong Y, Shock T, Tian C, Glass NL. Conserved and essential transcription factors for cellulase gene expression in ascomycete Fungi. Proc Natl Acad Sci U S A. 2012;109:7397–402.
OpenUrl Abstract/FREE Full Text

[23] 23.↵
Liu F, Xue Y, Liu J, Gan L, Long M. ACE3 as a master transcriptional factor regulates cellulase and xylanase production in Trichoderma orientalis EU7-22. BioResources. 2018;13:6790–801.
OpenUrl

[24] 24.↵
Crucello A, Sforça DA, Horta MAC, dos Santos CA, Viana AJC, Beloti LL, et al. Analysis of genomic regions of Trichoderma harzianum IOC-3844 related to biomass degradation. PLoS One. 2015;10:e0122122.
OpenUrl

[25] 25.↵
Himmel ME, Ding S-Y, Johnson DK, Adney WS, Nimlos MR, Brady JW, et al. Biomass recalcitrance: engineering plants and enzymes for biofuels production. Science. 2007;315:804–7.
OpenUrl Abstract/FREE Full Text

[26] 26.
van den Brink J, de Vries RP. Fungal enzyme sets for plant polysaccharide degradation. Appl Microbiol Biotechnol. 2011;91:1477–92.
OpenUrl CrossRef PubMed Web of Science

[27] 27.↵
Van Dyk JS, Pletschke BI. A review of lignocellulose bioconversion using enzymatic hydrolysis and synergistic cooperation between enzymes—Factors affecting enzymes, conversion and synergy. Biotechnol Adv. 2012;30:1458–80.
OpenUrl CrossRef PubMed

[28] 28.↵
Nitta M, Furukawa T, Shida Y, Mori K, Kuhara S, Morikawa Y, et al. A new Zn (II) 2Cys6-type transcription factor BglR regulates-glucosidase expression in Trichoderma reesei. Fungal Genet Biol. 2012;49:388–97.
OpenUrl CrossRef PubMed

[29] 29.↵
Häkkinen M, Valkonen MJ, Westerholm-Parvinen A, Aro N, Arvas M, Vitikainen M, et al. Screening of candidate regulators for cellulase and hemicellulase production in Trichoderma reesei and identification of a factor essential for cellulase production. Biotechnol Biofuels. 2014;7:14.
OpenUrl CrossRef PubMed

[30] 30.↵
Westereng B, Loose JS, Vaaje-Kolstad G, Aachmann FL, Sørlie M, Eijsink VG. Analytical tools for characterizing cellulose-active lytic polysaccharide monooxygenases (LPMOs). Methods Mol Biol. 2018;1796:219–46.
OpenUrl

[31] 31.↵
Toyotome T, Hamada S, Yamaguchi S, Takahashi H, Kondoh D, Takino M, et al. Comparative genome analysis of Aspergillus flavus clinically isolated in Japan. DNA Res. 2018;26:95–103.
OpenUrl

[32] 32.↵
Baroncelli R, Piaggeschi G, Fiorini L, Bertolini E, Zapparata A, Pè ME, et al. Draft whole-genome sequence of the biocontrol agent Trichoderma harzianum T6776. Genome Announc. 2015;3:e00647–15.
OpenUrl

[33] 33.↵
Kubicek CP, Steindorff AS, Chenthamara K, Manganiello G, Henrissat B, Zhang J, et al. Evolution and comparative genomics of the most common Trichoderma species. BMC Genomics. 2019;20:485.
OpenUrl

[34] 34.↵
Li W-C, Huang C-H, Chen C-L, Chuang Y-C, Tung S-Y, Wang T-F. Trichoderma reesei complete genome sequence, repeat-induced point mutation, and partitioning of CAZyme gene clusters. Biotechnol Biofuels. 2017;10:170.
OpenUrl CrossRef

[35] 35.↵
Horta MAC, Filho JAF, Murad NF, Santos EO, dos Santos CA, Mendes JS, et al. Network of proteins, enzymes and genes linked to biomass degradation shared by Trichoderma species. Sci Rep. 2018;8:1341.
OpenUrl CrossRef

[36] 36.↵
Kuan CS, Yew SM, Toh YF, Chan CL, Ngeow YF, Lee KW, et al. Dissecting the fungal biology of Bipolaris papendorfii: from phylogenetic to comparative genomic analysis. DNA Res. 2015;22:219–32.
OpenUrl CrossRef PubMed

[37] 37.
Haitjema CH, Gilmore SP, Henske JK, Solomon KV, de Groot R, Kuo A, et al. A parts list for fungal cellulosomes revealed by comparative genomics. Nat Microbiol. 2017;2:17087.
OpenUrl

[38] 38.↵
Wang R, Dong L, He R, Wang Q, Chen Y, Qu L, et al. Comparative genomic analyses reveal the features for adaptation to nematodes in Fungi. DNA Res. 2018;25:245–56.
OpenUrl

[39] 39.↵
Xie B-B, Qin Q-L, Shi M, Chen L-L, Shu Y-L, Luo Y, et al. Comparative genomics provide insights into evolution of Trichoderma nutrition style. Genome Biol Evol. 2014;6:379–90.
OpenUrl CrossRef PubMed

[40] 40.↵
Hassan L, Lin L, Sorek H, Sperl LE, Goudoulas T, Hagn F, et al. Crosstalk of cellulose and mannan perception pathways leads to inhibition of cellulase production in several Filamentous fungi. MBio. 2019;10:e00277–19.
OpenUrl

[41] 41.↵
Ivanova C, Ramoni J, Aouam T, Frischmann A, Seiboth B, Baker SE, et al. Genome sequencing and transcriptome analysis of Trichoderma reesei QM9978 strain reveals a distal chromosome translocation to be responsible for loss of vib1 expression and loss of cellulase induction. Biotechnol Biofuels. 2017;10:209.
OpenUrl

[42] 42.↵
Lawler K, Hammond-Kosack K, Brazma A, Coulson RM. Genomic clustering and co-regulation of transcriptional networks in the pathogenic fungus Fusarium graminearum. BMC Syst Biol. 2013;7:52.
OpenUrl CrossRef

[43] 43.↵
Castro LDS, Pedersoli WR, Antoniêto ACC, Steindorff AS, Silva-Rocha R, Martinez-Rossi NM, et al. Comparative metabolism of cellulose, sophorose and glucose in Trichoderma reesei using high-throughput genomic and proteomic analyses. Biotechnol Biofuels. 2014;7:41.
OpenUrl CrossRef PubMed

[44] 44.↵
Berlin K, Koren S, Chin C-S, Drake JP, Landolin JM, Phillippy AM. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat Biotechnol. 2015;33:623–30.
OpenUrl CrossRef PubMed

[45] 45.↵
Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ, et al. A whole-genome assembly of Drosophila. Science. 2000;287:2196–204.
OpenUrl Abstract/FREE Full Text

[46] 46.↵
Chaisson MJ, Tesler G. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinformatics. 2012;13:238.
OpenUrl CrossRef PubMed

[47] 47.↵
Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10:563–9.
OpenUrl CrossRef PubMed Web of Science

[48] 48.↵
Salamov AA, Solovyev VV. Ab initio gene finding in Drosophila genomic DNA. Genome Res. 2000;10:516–22.
OpenUrl Abstract/FREE Full Text

[49] 49.↵
Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–6.
OpenUrl CrossRef PubMed Web of Science

[50] 50.↵
Mitchell A, Chang H-Y, Daugherty L, Fraser M, Hunter S, Lopez R, et al. The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res. 2015;43:D213–21.
OpenUrl CrossRef PubMed

[51] 51.↵
Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:R12.
OpenUrl CrossRef PubMed

[52] 52.↵
Veltri D, Wight MM, Crouch JA. Simple synteny: a web-based tool for visualization of microsynteny across multiple species. Nucleic Acids Res. 2016;44:W41–5.
OpenUrl CrossRef PubMed

[53] 53.↵
Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–80.
OpenUrl CrossRef PubMed Web of Science

[54] 54.↵
Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33:1870–4.
OpenUrl CrossRef PubMed

[55] 55.↵
Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8:275–82.
OpenUrl CrossRef PubMed

[56] 56.↵
Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–91.
OpenUrl CrossRef GeoRef PubMed Web of Science

[57] 57.↵
Kihara D
Källberg M, Margaryan G, Wang S, Ma J, Xu J. RaptorX server: a resource for template-based protein structure modeling. In: Kihara D, editor. Protein structure prediction. New York, NY: Springer; 2014. p. 17–27.

[58] Kihara D

[59] 58.↵
CLC Genomics Workbench 9.0. Qiagen (Aarhus A/S). Manual for CLC genomics workbench 9.0 windows, Mac OS X and Linux Denmark. 2016.

[60] 59.↵
Wilczynski B, Dojer N. BNFinder: exact and efficient method for learning bayesian networks. Bioinformatics. 2009;25:286–7.
OpenUrl CrossRef PubMed Web of Science

[61] 60.↵
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–504.
OpenUrl Abstract/FREE Full Text