SUMMARY
Bud dormancy is a crucial stage in perennial trees and allows survival over winter to ensure optimal flowering and fruit production. Recent work highlighted physiological and molecular events occurring during bud dormancy in trees and we aimed to further explore the global transcriptional changes happening throughout dormancy progression.
Using next-generation sequencing and modelling, we conducted an in-depth transcriptomic analysis for all stages of flower buds in sweet cherry (Prunus avium L.) cultivars displaying contrasted stages of bud dormancy.
We observed that buds in organogenesis, paradormancy, endodormancy and ecodormancy stages are characterised by specific transcriptional states, associated with different pathways. We further identified that endodormancy can be separated into several phases based on the transcriptomic state. We also found that transcriptional profiles of just seven genes are enough to predict the main cherry tree flower bud dormancy stages.
Our results indicate that transcriptional changes happening during dormancy are robust and conserved between different sweet cherry cultivars. Our work also sets the stage for the development of a fast and cost-effective diagnostic tool to molecularly define the flower bud stages in cherry trees.
INTRODUCTION
Temperate trees face a wide range of environmental conditions including highly contrasted seasonal changes. Among the strategies to enhance survival under unfavourable climatic conditions, bud dormancy is crucial for perennial plants since its progression over winter is critical for optimal growth, flowering and fruit production during the subsequent season. Bud dormancy has long been compared to an unresponsive physiological phase, in which metabolic processes within the buds are halted by cold temperature. However, several studies have shown that bud dormancy progression can be affected in a complex way by temperature and photoperiod (Heide & Prestrud, 2005; Allona et al., 2008; Olsen, 2010; Cooke et al., 2012; Maurya et al., 2018). Bud dormancy has traditionally been separated into three main phases: (i) paradormancy, also named “summer dormancy” (Cline & Deppong, 1999); (ii) endodormancy, mostly triggered by internal factors; and (iii) ecodormancy, controlled by external factors (Lang et al., 1987; Considine & Considine, 2016). Progression through endodormancy requires cold accumulation whereas warmer temperatures, i.e. heat accumulation, drive the competence to resume growth over the ecodormancy phase. Dormancy is thus highly dependent on external temperatures, and changes in seasonal timing of bud break and blooming have been reported in relation to global warming. Notably, advances in bud break and blooming dates in spring have been observed in the northern hemisphere, thus increasing the risk of late frost damage (Badeck et al., 2004; Menzel et al., 2006; Vitasse et al., 2014; Fu et al., 2015; Bigler & Bugmann, 2018), while insufficient cold accumulation during winter may lead to incomplete dormancy release associated with bud break delay and low bud break rate (Erez, 2000; Atkinson et al., 2013). These phenological changes directly impact the production of fruit crops, leading to large potential economic losses (Snyder & de Melo-Abreu, 2005).
Consequently, it has become urgent to acquire a better understanding of bud responses to temperature stimuli in the context of climate change in order to tackle fruit losses and anticipate future production changes.
In recent years, an increasing number of studies have investigated the physiological and molecular mechanisms of bud dormancy transitions in perennials using RNA sequencing technology, thereby giving new insight into potential pathways involved in dormancy. The results suggest that the transitions between the three main bud dormancy phases (para-, endo- and ecodormancy) are mediated by pathways related to phytohormones (Zhong et al., 2013; Chao et al., 2017; Khalil-Ur-Rehman et al., 2017; Zhang et al., 2018), carbohydrates (Min et al., 2017; Zhang et al., 2018), temperature (Ueno et al., 2013; Paul et al., 2014), photoperiod (Lesur et al., 2015), reactive oxygen species (Takemura et al., 2015; Zhu et al., 2015), water deprivation (Lesur et al., 2015), cold acclimation and epigenetic regulation (Kumar et al., 2016). Owing to these studies, a better understanding of bud dormancy has been established in different perennial species (see, for example, the recent reviews by Beauvieux et al., 2018; Lloret et al., 2018; Falavigna et al., 2019). However, we are still missing a fine-resolution temporal understanding of transcriptomic changes happening over the entire bud development, from bud organogenesis to bud break.
Indeed, the small number of sampling dates in existing studies seems to be insufficient to capture all the information about changes occurring throughout the dormancy cycle, as it most likely corresponds to a chain of biological events rather than an on/off mechanism. Many unresolved questions remain: What are the fine-resolution dynamics of gene expression related to dormancy? Are specific sets of genes associated with dormancy stages? Since the timing of the response to environmental cues is cultivar-dependent (Campoy et al., 2011; Wenden et al., 2017), are transcriptomic profiles during dormancy different in cultivars with contrasted flowering dates?
To explore these mechanisms, we conducted a transcriptomic analysis of sweet cherry (Prunus avium L.) flower buds from bud organogenesis until the end of bud dormancy using next-generation sequencing. Sweet cherry is a perennial species highly sensitive to temperature (Heide, 2008) and we focused on three sweet cherry cultivars displaying contrasted flowering dates and responses to environmental conditions. We carried out a fine-resolution time course spanning the entire bud development, from flower organogenesis in July to spring in the following year when flowering occurs, encompassing para-, endo- and ecodormancy phases. Our results indicate that transcriptional changes happening during dormancy are conserved between different sweet cherry cultivars, opening the way to the identification of key factors involved in the progression through bud dormancy.
MATERIAL AND METHODS
Plant material
Branches and flower buds were collected from four different sweet cherry cultivars with contrasted flowering dates: ‘Cristobalina’, ‘Garnet’, ‘Regina’ and ‘Fertard’, which display extra-early, early, late and very late flowering dates, respectively. ‘Cristobalina’, ‘Garnet’ and ‘Regina’ trees were grown in an orchard located at the Fruit Experimental Unit of INRA in Bourran (South West of France, 44° 19′ 56′′ N, 0° 24′ 47′′ E), under the same agricultural practices. ‘Fertard’ trees were grown in a nearby orchard at the Fruit Experimental Unit of INRA in Toulenne, near Bordeaux (48° 51′ 46′′ N, 2° 17′ 15′′ E). During the first sampling season (2015/2016), ten or eleven dates spanning the entire period from flower bud organogenesis (July 2015) to bud break (March 2016) were chosen for RNA sequencing (Table S1; Fig. 1a), while bud tissues from ‘Fertard’ were sampled in 2015/2016 (12 dates) and 2017/2018 (7 dates) for validation by qRT-PCR (Table S1). For each date, flower buds were sampled from different trees, each tree corresponding to a biological replicate. Upon harvesting, buds were flash frozen in liquid nitrogen and stored at −80°C prior to performing RNA-seq.
Measurements of bud break and estimation of the dormancy release date
For the two sampling seasons, 2015/2016 and 2017/2018, three branches bearing floral buds were randomly chosen fortnightly from ‘Cristobalina’, ‘Garnet’, ‘Regina’ and ‘Fertard’ trees, between November and flowering time (March-April). Branches were incubated in water pots placed under forcing conditions in a growth chamber (25°C, 16 h light/8 h dark, 60-70% humidity). The water was replaced every 3-4 days. After ten days under forcing conditions, the total number of flower buds that reached the BBCH stage 53 (Meier, 2001; Fadón et al., 2015) was recorded. The date of dormancy release was estimated as the date when the percentage of buds at BBCH stage 53 was above 50% after ten days under forcing conditions (Fig. 1a).
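The 50% criterion described above can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical and are given for illustration only; they are not data or code from this study.

```python
from datetime import date


def bud_break_percentage(n_stage53, n_total):
    """Percentage of flower buds at BBCH stage 53 after ten days of forcing."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_stage53 / n_total


def estimate_dormancy_release(observations, threshold=50.0):
    """Return the first sampling date whose bud break percentage is above threshold.

    observations: iterable of (sampling_date, n_stage53, n_total) tuples.
    Returns None if the threshold is never exceeded.
    """
    for sampling_date, n_stage53, n_total in sorted(observations):
        if bud_break_percentage(n_stage53, n_total) > threshold:
            return sampling_date
    return None


# Hypothetical counts, for illustration only (not data from this study).
example = [
    (date(2016, 1, 4), 2, 40),
    (date(2016, 1, 18), 15, 40),
    (date(2016, 2, 1), 26, 40),
    (date(2016, 2, 15), 38, 40),
]
release = estimate_dormancy_release(example)
```

With fortnightly sampling, such an estimate is only resolved to the nearest sampling date.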
RNA extraction and library preparation
Total RNA was extracted from 50-60 mg of frozen and pulverised flower buds using the RNeasy Plant Mini kit (Qiagen) with a minor modification: 1.5% PVP-40 was added to the extraction buffer RLT. RNA quality was evaluated using a Tapestation 4200 (Agilent Genomics). Library preparation was performed on 1 μg of high quality RNA (RNA integrity number equivalent greater than or equal to 8.5) using the TruSeq Stranded mRNA Library Prep Kit High Throughput (Illumina cat. no. RS-122-2103) for ‘Cristobalina’, ‘Garnet’ and ‘Regina’ cultivars. DNA quality from libraries was evaluated using the Tapestation 4200. The libraries were sequenced on a NextSeq500 (Illumina), at the Sainsbury Laboratory Cambridge University (SLCU), using paired-end sequencing of 75 bp in length.
Mapping and differential expression analysis
The raw reads obtained from the sequencing were analysed using several publicly available software tools and in-house scripts. The quality of reads was assessed using FastQC (www.bioinformatics.babraham.ac.uk/projects/fastqc/) and possible adaptor contaminations and low quality trailing sequences were removed using Trimmomatic (Bolger et al., 2014). Trimmed reads were mapped to the peach (Prunus persica (L) Batsch) reference genome v.2 (Verde et al., 2017) using Tophat (Trapnell et al., 2009). Possible optical duplicates were removed using Picard tools (https://github.com/broadinstitute/picard). The total number of mapped reads for each sample is given in Table S2. For each gene, raw read counts and TPM (Transcripts Per Million) numbers were calculated (Wagner, 2003).
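As an illustration of the TPM normalisation, the following Python sketch converts raw read counts into TPM values for one sample. It assumes the usual definition (counts divided by gene length in kilobases, then rescaled so that the sample sums to one million); the function name, the use of plain gene length rather than an effective length, and the example values are assumptions and do not reproduce the in-house scripts.

```python
def tpm(read_counts, gene_lengths_bp):
    """Convert raw read counts to TPM for a single sample.

    read_counts, gene_lengths_bp: dicts keyed by gene identifier.
    Assumption: plain gene length is used, not an effective length.
    """
    # Reads per kilobase of transcript for each gene.
    rpk = {g: read_counts[g] / (gene_lengths_bp[g] / 1000.0) for g in read_counts}
    scale = sum(rpk.values())
    if scale == 0:
        return {g: 0.0 for g in rpk}
    # Rescale so that all values in the sample sum to one million.
    return {g: 1e6 * value / scale for g, value in rpk.items()}


# Hypothetical genes and values, for illustration only.
counts = {"geneA": 10, "geneB": 20, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 1500}
tpm_values = tpm(counts, lengths)
```

In this toy example, geneA and geneB receive the same TPM value because geneB has twice the reads but is twice as long.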
We performed a differential expression analysis on data obtained from the ‘Garnet’ samples. First, data were filtered by removing lowly expressed genes (average read count < 3), genes not expressed in most samples (read counts = 0 in more than 75% of the samples) and genes presenting little change in expression (coefficient of variation < 0.3). Then, differentially expressed genes (DEGs) between bud stages (organogenesis, paradormancy, endodormancy, dormancy breaking, ecodormancy, see Table S1) were assessed on the filtered data using the DESeq2 R Bioconductor package (Love et al., 2014), in the statistical software R (R Core Team 2018). Genes with an adjusted p-value (padj) < 0.05 were assigned as DEGs (Table S3). To enable researchers to access this resource, we have created a graphical web interface to allow easy visualisation of transcriptional profiles throughout flower bud dormancy in the three cultivars for genes of interest (bwenden.shinyapps.io/DorPatterns).
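The three filtering criteria can be sketched in Python as follows. This is an illustrative sketch rather than the script used here: the function names are hypothetical, and we assume that the coefficient of variation is the sample standard deviation divided by the mean, computed on the same counts as the other two criteria.

```python
from statistics import mean, stdev


def keep_gene(counts, min_mean=3.0, max_zero_fraction=0.75, min_cv=0.3):
    """Return True if a gene passes the three filters.

    counts: read counts of one gene across all samples (at least two samples).
    """
    average = mean(counts)
    # Filter 1: lowly expressed genes (average read count < 3).
    if average < min_mean:
        return False
    # Filter 2: genes with zero counts in more than 75% of the samples.
    zero_fraction = sum(1 for c in counts if c == 0) / len(counts)
    if zero_fraction > max_zero_fraction:
        return False
    # Filter 3: genes with little change (coefficient of variation < 0.3).
    # Assumption: sample standard deviation divided by the mean.
    return stdev(counts) / average >= min_cv


def filter_genes(counts_by_gene):
    """Keep only the genes that pass all three filters."""
    return {g: c for g, c in counts_by_gene.items() if keep_gene(c)}


# Hypothetical counts, for illustration only.
toy = {
    "variable": [10, 20, 30, 40],
    "low": [1, 2, 1, 2],
    "flat": [100, 101, 99, 100],
    "mostly_zero": [0, 0, 0, 0, 0, 0, 0, 0, 30],
}
kept = filter_genes(toy)
```

Only the gene named "variable" passes all three filters in this toy example.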
Principal component analyses and hierarchical clustering
Distances between the DEG expression patterns over the time course were calculated based on Pearson’s correlation on ‘Garnet’ TPM values. We applied a hierarchical clustering analysis on the distance matrix to define ten clusters (Table S3). To represent expression patterns, we normalized the data using a z-score for each gene:
$$ z\text{-score}_{ij} = \frac{\mathrm{TPM}_{ij} - \mathrm{mean}_{i}}{\mathrm{standard\ deviation}_{i}} $$
where TPMij is the TPM value of the gene i in the sample j, and meani and standard deviationi are the mean and standard deviation of the TPM values for the gene i over all samples.
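The z-score normalisation and a correlation-based distance can be sketched in Python as below. Two points are assumptions on our part, since they are not specified in the text: the distance is taken as one minus Pearson’s correlation, and the sample standard deviation is used for the z-score. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(tpm_values):
    """z-score of one gene across samples: (TPM_ij - mean_i) / sd_i."""
    m = mean(tpm_values)
    s = stdev(tpm_values)  # assumption: sample standard deviation
    if s == 0:
        return [0.0] * len(tpm_values)
    return [(v - m) / s for v in tpm_values]


def pearson(x, y):
    """Pearson's correlation coefficient between two expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Assumed distance: 1 - r, so that similar profiles are close to 0."""
    return 1.0 - pearson(x, y)


def distance_matrix(profiles):
    """Pairwise distances between gene profiles (list of lists)."""
    return [[correlation_distance(p, q) for q in profiles] for p in profiles]


# Hypothetical TPM profiles over three dates, for illustration only.
rising, rising_scaled, falling = [1, 2, 3], [2, 4, 6], [3, 2, 1]
d = distance_matrix([rising, rising_scaled, falling])
```

A matrix of this kind can then be passed to any hierarchical clustering routine and the resulting tree cut into the desired number of clusters.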
Principal component analyses (PCA) were performed on TPM values from different datasets using the prcomp function from R.
For each cluster, using data for ‘Garnet’, ‘Regina’ and ‘Cristobalina’, mean expression pattern was calculated as the mean z-score value for all genes belonging to the cluster. We then calculated the Pearson’s correlation between the z-score values for each gene and the mean z-score for each cluster. We defined the marker genes as genes with the highest correlation values, i.e. genes that represent the best the average pattern of the clusters. Keeping in mind that the marker genes should be easy to handle, we then selected the optimal marker genes displaying high expression levels while not belonging to extended protein families.
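The correlation step of this marker gene selection can be sketched in Python as follows. The sketch only covers the ranking of genes by correlation with the mean cluster profile; the subsequent choice based on expression level and protein family membership is not modelled. Function names and values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson's correlation coefficient between two profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def cluster_mean_profile(z_by_gene):
    """Mean z-score over all genes of a cluster, sample by sample."""
    profiles = list(z_by_gene.values())
    n_samples = len(profiles[0])
    return [mean(p[j] for p in profiles) for j in range(n_samples)]


def rank_marker_candidates(z_by_gene):
    """Rank genes by decreasing correlation with the cluster mean profile."""
    profile = cluster_mean_profile(z_by_gene)
    scored = [(pearson(z, profile), gene) for gene, z in z_by_gene.items()]
    return [gene for _, gene in sorted(scored, reverse=True)]


# Hypothetical z-scores for a three-gene cluster, for illustration only.
cluster = {
    "geneA": [-1.0, 0.0, 1.0],
    "geneB": [-1.0, 1.0, 0.0],
    "geneC": [-1.0, 0.5, 0.5],
}
ranking = rank_marker_candidates(cluster)
```

In this toy cluster, geneC matches the mean profile exactly and is therefore ranked first.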
Motif and transcription factor targets enrichment analysis
We performed enrichment analyses on the DEGs in the different clusters for transcription factor target genes and target motifs.
Motif discovery on the DEG set was performed using Find Individual Motif Occurrences (FIMO) (Grant et al., 2011). The motif list available for peach was obtained from PlantTFDB 4.0 (Jin et al., 2017). To calculate the overrepresentation of motifs, DEGs were grouped by motif (grouping several genes and transcripts in which the motif was found). Overrepresentation of motifs was assessed with hypergeometric tests using Hypergeometric {stats} available in R. The comparison was performed for the number of appearances of a motif in one cluster against the number of appearances in the overall set of DEGs. Because multiple testing inflates the number of false positives, the p-values obtained were corrected using the False Discovery Rate (Benjamini & Hochberg, 1995) correction method with the p.adjust {stats} function available in R.
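For illustration, an upper-tail hypergeometric test followed by a Benjamini-Hochberg correction can be sketched in Python as below. This is not the R code used in the analysis; in particular, the way the four counts are defined for a given motif and cluster is an assumption, and the function names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement from a
    population of size `population` containing `successes` items of interest.

    Assumed mapping: population = motif appearances in all DEGs,
    successes = appearances of one motif in all DEGs,
    draws = motif appearances in the cluster,
    k = appearances of that motif in the cluster.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical values, for illustration only.
p_single = hypergeom_upper_tail(5, 10, 5, 5)
p_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The adjusted values are made monotone by carrying the running minimum down from the largest p-value.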
A list of predicted regulations between transcription factors and target genes is available for peach in PlantTFDB (Jin et al., 2017). We collected the list and used it to analyse the overrepresentation of genes targeted by TFs, using Hypergeometric {stats} available in R, comparing the number of appearances of a gene controlled by one TF in one cluster against the number of appearances in the overall set of DEGs. The p-values obtained were corrected using a false discovery rate as described above. Predicted gene homology to Arabidopsis thaliana and functions were retrieved from the data files available for Prunus persica (GDR, https://www.rosaceae.org/species/prunus_persica/genome_v2.0.a1).
GO enrichment analysis
The list for the gene ontology (GO) terms was retrieved from the database resource PlantRegMap (Jin et al., 2017). Using the topGO package (Alexa & Rahnenführer, 2018), we performed an enrichment analysis on GO terms for biological processes, cellular components and molecular functions based on a classic Fisher algorithm. Enriched GO terms were filtered with a p-value < 0.005 and the ten GO terms with the lowest p-value were selected for representation.
Marker genes qRT-PCR analyses
cDNA was synthesised from 1 µg of total RNA using the iScript Reverse Transcriptase Kit (Bio-Rad Cat no 1708891) in a final volume of 20 µL. 2 µL of cDNA diluted to one third was used to perform the qPCR in a 20 µL total reaction volume. qPCRs were performed using a Roche LightCycler 480. Three biological replicates for each sample were performed. Primers used in this study for qPCR are: PavCSLG3: F:CCAACCAACAAAGTTGACGA, R:CAACTCCCCCAAAAAGATGA; PavMEE9: F:CTGCAGCTGAACTGGAACAG, R:ACTCATCCATGGCACTCTCC; PavSRP: F:ACAGGATCTGGAAAGCCAAG, R:AGGGTGGCTCTGAAACACAG; PavTCX2: F:CTTCCCACAACGCCTTTACG, R:GGCTATGTCTCTCAAACTTGGA; PavGH127: F:GCCATTGGTTGTAGGGTTTG, R:ATCCCATTCAGCATTCGTTC; PavUDP-GALT1: F:CAATGTTGCTGGAAACCTCA, R:GTTATTCCACATCCGACAGC; PavPP2C: F:CTGTGCCTGAAGTGACACAGA, R:CTGCACTGCTTCTTGATTTG; PavRPII: F:TGAAGCATACACCTATGATGATGAAG, R:CTTTGACAGCACCAGTAGATTCC; PavEF1: F:CCCTTCGACTTCCACTTCAG, R:CACAAGCATACCAGGCTTCA. Primers were previously tested for non-specific products by separating amplicons on 1.5% agarose gels and by sequencing each amplicon. Real-time data were analysed using custom R scripts.
Bud stage predictive modelling
In order to predict the bud stage based on the marker genes transcriptomic data, we used TPM values for the marker genes to train a multinomial logistic regression. First, all samples were projected onto a two-dimensional plane using PCA. The new coordinates were used to train and test the model to predict the five bud stage categories (function multinom from the nnet R package; Ripley & Venables, 2016). Model accuracy was calculated as the percentage of correctly predicted stages in the testing set. In addition, we tested the model on qRT-PCR data for ‘Fertard’ samples. Relative expression was estimated for each gene in each sample using a cDNA standard curve and normalized by the expression corresponding to the October sample. We chose the date of October as the reference because it corresponds to the beginning of dormancy and it was available for all cultivars. For each date, the mean expression values of the seven marker genes were projected onto the two-dimensional PCA plane calculated for the RNA-seq data and they were tested against the model trained on ‘Cristobalina’, ‘Garnet’ and ‘Regina’ RNA-seq data.
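The modelling step can be illustrated with a minimal Python sketch of a multinomial logistic regression fitted on two-dimensional coordinates. It is not the nnet implementation: we substitute plain batch gradient descent for the optimiser used by multinom, the PCA projection is assumed to have been computed beforehand, and the coordinates, the reduced set of three stages and all function names are hypothetical.

```python
from math import exp


def softmax(scores):
    """Convert class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, classes, point):
    """Linear score w1*pc1 + w2*pc2 + bias for each class."""
    x1, x2 = point
    return [weights[c][0] * x1 + weights[c][1] * x2 + weights[c][2] for c in classes]


def train_multinomial(points, labels, classes, lr=0.1, epochs=500):
    """Fit the model by batch gradient descent on the cross-entropy loss."""
    weights = {c: [0.0, 0.0, 0.0] for c in classes}
    n = len(points)
    for _ in range(epochs):
        grads = {c: [0.0, 0.0, 0.0] for c in classes}
        for (x1, x2), label in zip(points, labels):
            probs = softmax(class_scores(weights, classes, (x1, x2)))
            for c, p in zip(classes, probs):
                err = p - (1.0 if c == label else 0.0)
                grads[c][0] += err * x1
                grads[c][1] += err * x2
                grads[c][2] += err
        for c in classes:
            for j in range(3):
                weights[c][j] -= lr * grads[c][j] / n
    return weights


def predict(weights, classes, point):
    """Return the class with the highest score."""
    scores = class_scores(weights, classes, point)
    return classes[scores.index(max(scores))]


def accuracy(weights, classes, points, labels):
    """Percentage of correctly predicted stages."""
    correct = sum(predict(weights, classes, p) == l for p, l in zip(points, labels))
    return 100.0 * correct / len(points)


# Hypothetical PCA coordinates and stages, for illustration only.
classes = ["endodormancy", "ecodormancy", "organogenesis"]
train_points = [(-2.0, 0.1), (-2.2, -0.2), (-1.8, 0.0),
                (2.0, 0.2), (2.1, -0.1), (1.9, 0.0),
                (0.0, 3.0), (0.2, 2.8), (-0.1, 3.1)]
train_labels = [c for c in classes for _ in range(3)]
test_points = [(-1.9, 0.3), (2.2, 0.1), (0.1, 2.9)]
test_labels = list(classes)
model = train_multinomial(train_points, train_labels, classes)
test_accuracy = accuracy(model, classes, test_points, test_labels)
```

Holding out all samples from one cultivar as the testing set, rather than a random subset, gives the cross-cultivar evaluation.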
RESULTS
Transcriptome accurately captures the dormancy state
In order to define transcriptional changes happening over the sweet cherry flower bud development, we performed a transcriptome-wide analysis using next-generation sequencing from bud organogenesis to flowering. According to bud break percentage (Fig. 1a), morphological observations (Fig. 1b), average temperatures (Fig. S1) and descriptions from Lang et al. (1987), we assigned five main stages to the flower bud samples of the early flowering cultivar ‘Garnet’ (Fig. 1b): i) flower bud organogenesis, which occurs in July and August; ii) paradormancy, which corresponds to the period of growth cessation in September; iii) endodormancy, initiated in October, during which buds are unresponsive to forcing conditions; iv) dormancy breaking, since the increasing bud break percentage under forcing conditions suggests that endodormancy was released on 29th January 2016; and v) ecodormancy, from the estimated dormancy release date until flowering.
We identified 6,683 genes that are differentially expressed (DEGs) between the defined bud stages for the sweet cherry cultivar ‘Garnet’ (Table S3). When projected into a two-dimensional space (Principal Component Analysis, PCA), data for these DEGs show that transcriptomes of samples harvested at a given date are projected together (Fig. 2), showing the high quality of the biological replicates and that different trees are in a very similar transcriptional state at the same date. Very interestingly, we also observe that flower bud states are clearly separated on the PCA, with the exception of organogenesis and paradormancy, which are projected together (Fig. 2). The first dimension of the analysis (PC1) explains 41.63% of the variance and clearly represents the strength of bud dormancy, where samples on the right of the axis are in endodormancy or dormancy breaking stages. The second dimension of the analysis (PC2) explains 20.24% of the variance and distinguishes two main phases of the bud development: before and after dormancy breaking. We obtain very similar results when performing the PCA on all genes (Fig. S2). These results indicate that the transcriptional state of DEGs accurately captures the dormancy state of flower buds.
Bud stage-dependent transcriptional activation and repression are associated with different pathways
We further investigated whether specific genes or signalling pathways could be associated with the different flower bud stages. Indeed, the expression of genes grouped in ten clusters clearly shows distinct expression profiles throughout the bud development (Fig. 3). Overall, three main types of clusters can be discriminated: the ones with a maximum expression level during organogenesis and paradormancy (cluster 1: 1,549 genes; cluster 2: 70 genes; cluster 3: 113 genes; cluster 4: 884 genes and cluster 10: 739 genes, Fig. 3), the clusters with a maximum expression level during endodormancy and around the time of dormancy breaking (cluster 5: 156 genes; cluster 6: 989 genes; cluster 7: 648 genes and cluster 8: 612 genes, Fig. 3), and finally the clusters with a maximum expression level during ecodormancy (cluster 9: 924 genes and cluster 10, Fig. 3). This result shows that different groups of genes are associated with these three main flower bud phases. Interestingly, we also observed that, during the endodormancy phase, some genes are expressed in October and November then repressed in December (cluster 4, Fig. 3), whereas another group of genes is expressed in December (clusters 8, 5, 6 and 7, Fig. 3) therefore separating endodormancy in two distinct phases.
In order to explore the functions and pathways associated with the gene clusters, we performed a GO enrichment analysis (Fig. 4, Fig. S3). GO terms associated with the response to stress as well as biotic and abiotic stimuli were enriched in clusters 2, 3 and 4, with genes mainly expressed during organogenesis and paradormancy. During endodormancy (cluster 5), an enrichment for genes involved in the response to nitrate and nitrogen compounds was detected. In contrast, at the end of the endodormancy phase (clusters 6, 7 and 8), we highlighted different enrichments in GO terms linked to basic metabolism, such as nucleic acid metabolic processes or DNA replication, but also to the response to alcohol and abscisic acid. Finally, during ecodormancy, genes in clusters 9 and 10 are enriched in functions associated with transport, cell wall biogenesis as well as oxidation-reduction processes (Fig. 4, Fig. S3). These results show that different functions and pathways are specific to flower bud development stages.
Specific transcription factor target genes are expressed during the main flower bud stages
To better understand the regulation of genes that are expressed at different flower bud stages, we investigated the TFs with enriched targets (Table 1) as well as the enriched target promoter motifs (Table S4) in the different gene clusters. Among the genes expressed during the organogenesis and paradormancy phases (clusters 1, 2, 3 and 4), we observed an enrichment for motifs of several MADS-box TFs such as AGAMOUS (AG), APETALA3 (AP3) and SEPALLATA3/AGAMOUS-like 9 (SEP3/AGL9) (Table S4), several of them potentially involved in flower organogenesis (Causier et al., 2010). On the other hand, for the same clusters, results show an enrichment in MYB-related targets, WRKY and ethylene-responsive element (ERF) binding TFs (Table 1, Table S4). Several members of these TF families have been shown to participate in the response to abiotic factors. Similarly, we found in the cluster 4 target motifs enriched for PavDREB2C (Table S4), potentially involved in the response to cold (Lee et al., 2010). Interestingly, we identified an enrichment in the cluster 5 of targets for CBF4, and of genes with motifs for several ethylene-responsive element binding TFs such as PavDREB2C. We also observed an enrichment in the same cluster for genes with motifs for ABI5 (Table S4). All these TFs are involved in the response to cold, in agreement with the fact that genes in the cluster 5 are expressed during endodormancy.
Genes belonging to clusters 6, 7 and 8 are highly expressed during deep dormancy and we found targets and target motifs for many TFs involved in the response to abiotic stresses. For example, we found motifs enriched in cluster 7 for many TFs of the C2H2 family, which is involved in the response to a wide spectrum of stress conditions, such as extreme temperatures, salinity, drought or oxidative stress (Table S4; Kiełbowicz-Matuk, 2012; Liu et al., 2015). Similarly, in cluster 8, we also identified an enrichment in targets and motifs of many genes involved in the response to ABA and to abiotic stimulus, such as PavABF2, PavAREB3, PavABI5 and PavDREB2C (Koornneef et al., 1998; Lee et al., 2010). We also observe in this same cluster an enrichment for targets of TFs involved in the response to light and temperature, such as PavPIL5, PavSPT, PavRVE1 and PavPIF4 (Table 1; Penfield et al., 2005; Olsen, 2010; Franklin et al., 2011; Doğramacı et al., 2014). Interestingly, we found that among the TFs with enriched targets in the clusters, only ten display changes in expression during flower bud development (Table 1, Table S4, Fig. S4), including PavABF2, PavABI5 and PavRVE1. Expression profiles for these three genes are very similar, and are also similar to those of their target genes, with a peak of expression around the estimated dormancy release date, suggesting that these TFs positively regulate their targets (Fig. S4).
Finally, genes belonging to cluster 10 are expressed during ecodormancy and we find an enrichment for targets of PavMYB14 (Table 1). Expression profiles suggest that PavMYB14 represses expression of its target genes during endodormancy (Fig. S4), consistent with the function of Arabidopsis thaliana MYB14, which negatively regulates the response to cold (Chen et al., 2013). Overall, these results show that a small number of TFs specifically regulate target genes during the different flower bud stages.
Expression patterns highlight bud dormancy similarities and disparities between three cherry tree cultivars
Since temperature changes and progression through the flower bud stages are happening synchronously, it is challenging to discriminate transcriptional changes that are mainly associated with one or the other. In this context, we also analysed the transcriptome of two other sweet cherry cultivars: ‘Cristobalina’, characterized by very early flowering dates, and ‘Regina’, with a late flowering time. The span between flowering periods for the three cultivars is also found in the transition between endodormancy and ecodormancy since ten weeks separated the estimated dates of dormancy release between the cultivars: 9th December 2015 for ‘Cristobalina’, 29th January 2016 for ‘Garnet’ and 26th February 2016 for ‘Regina’ (Fig. 1a). The transition from organogenesis to paradormancy is not well documented and many studies suggest that endodormancy onset is under the strict control of environment. Therefore, we considered that these two transitions occurred at the same time in all three cultivars. However, the two and a half months difference in the date of transition from endodormancy to ecodormancy between the cultivars allows us to look for transcriptional changes associated with this transition independently of environmental conditions. To do so, we compared the expression patterns of the previously identified DEGs between the three contrasted cultivars throughout flower bud stages (Fig. 1b). When projected onto a two-component PCA plane, all samples harvested from buds at the same stage cluster together, whatever the cultivar (Fig. 5), suggesting that the stage of the bud has more impact on the transcriptional state than time or external conditions.
To go further, we compared transcriptional profiles throughout the time course in all cultivars. For this, we analysed the expression profiles in each cultivar for the clusters previously identified for the cultivar ‘Garnet’ (Fig. 6). Due to their low number of genes, clusters 2 and 3 were not further studied in the three cultivars, and we considered that the expression patterns for the genes in cluster 6 were redundant with clusters 5 and 7; we therefore restricted the analysis to seven clusters. In general, averaged expression profiles for all clusters are very similar in all three varieties, with the peak of expression happening at a similar period of the year. However, we can distinguish two main phases according to similarities or disparities between cultivars. First, averaged expression profiles are nearly identical in all cultivars between July and November. This is especially the case for clusters 1, 4, 7, 8 and 9. On the other hand, we can observe a temporal shift in the peak of expression between varieties from December onward for genes in clusters 1, 5, 8 and 10. Indeed, in these clusters, the peak or drop in expression happens earlier in ‘Cristobalina’, and slightly later in ‘Regina’ compared to ‘Garnet’ (Fig. 6), in correlation with their dormancy release dates. These results seem to confirm that the organogenesis and paradormancy phases occur concomitantly in the three cultivars while temporal shifts between cultivars are observed after endodormancy onset. Therefore, similarly to the PCA results (Fig. 5), the expression profile of these genes is more associated with the flower bud stage than with external environmental conditions.
Flower bud stage can be predicted using a small set of marker genes
We have shown that flower buds in organogenesis, paradormancy, endodormancy and ecodormancy are characterised by specific transcriptional states. In theory, we could therefore use transcriptional data to infer the flower bud stage. For this, we selected seven marker genes, one for each of clusters 1, 4, 5, 7, 8, 9 and 10, that best represent the average expression profiles of their cluster (Fig. 6). Expression of these marker genes not only recapitulates the average profile of the cluster they originate from, but also the temporal shifts in the profiles between the three cultivars (Fig. 6b). In order to determine whether these genes carry as much information as the full transcriptome, or all DEGs, we performed a PCA of all samples harvested for all three cultivars using expression levels of these seven markers (Fig. S7). The clustering of samples along the two main axes of the PCA using these seven markers is very similar, if not almost identical, to the PCA results obtained using expression for all DEGs (Fig. 5). This indicates that the transcriptomic data can be reduced to only seven genes and still provide accurate information about the flower bud stages.
To test if these seven markers can be used to define the flower bud stage, we used a multinomial logistic regression modelling approach to predict the flower bud stage in our dataset based on the expression levels for these seven genes (Fig. 7 and Table S5). We obtain a very high model accuracy (90%) when the training and testing sets are randomly picked. The model also shows a high accuracy (82 to 87%) when predicting the bud stage of samples from the ‘Garnet’ or ‘Regina’ cultivars after training on the two other cultivars (Table S5). These results indicate that the bud stage can be accurately predicted based on expression data by just using seven genes. In order to go further and test our model in an independent experiment, we analysed expression for the seven marker genes by qRT-PCR on buds sampled from another sweet cherry tree cultivar, ‘Fertard’, over two sampling seasons (Fig. 7a). We find an accuracy of 71% for our model, trained on our data for all three cultivars ‘Regina’, ‘Garnet’ and ‘Cristobalina’, when predicting the flower bud stage for the ‘Fertard’ cultivar (Fig. 7c). In particular, the chronology of bud stages was very well predicted. This result indicates that these seven genes can be used as a diagnostic tool in order to infer the flower bud stage in sweet cherry trees.
DISCUSSION
In this work, we have characterised transcriptional changes at a genome-wide scale happening throughout cherry tree flower bud dormancy, from organogenesis to the end of dormancy. To do this, we have analysed expression in flower buds at 11 dates from July 2015 to March 2016 for three cultivars displaying different dates of dormancy release, generating 82 transcriptomes in total. This resource, with a fine time resolution, reveals key aspects of the regulation of cherry tree flower buds during dormancy (Fig. 8). We have shown that buds in organogenesis, paradormancy, endodormancy and ecodormancy are characterised by distinct transcriptional states (Fig. 2, 3) and we highlighted the different pathways activated during the main cherry tree flower bud dormancy stages (Fig. 4 and Table 1). Finally, we found that just seven genes are enough to accurately predict the main cherry tree flower bud dormancy stages (Fig. 6, 7).
Global lessons from transcriptomic data on the definition of flower bud dormancy stages
Our results show that buds in organogenesis, paradormancy, endodormancy and ecodormancy are characterised by distinct transcriptional states. This result is further supported by the fact that we detected different groups of genes that are specifically expressed at these bud stages (Fig. 3). Specifically, we found that the transcriptional states of flower buds during endodormancy and ecodormancy are very different, indicating that different pathways are involved in these two types of dormancy. This is further supporting previous observations that buds remain in endodormancy and ecodormancy states under the control of different regulation pathways. Indeed, ecodormancy is under the control of external signals and can therefore be reversed by exposure to growth-promotive signals (Lang et al., 1987). On the opposite, endogenous signals control endodormancy onset and maintenance and a complex array of signalling pathways seem to be involved in the response to cold temperatures that subsequently leads to dormancy breaking (see for example (Ophir et al., 2009; Horvath, 2009; Considine & Considine, 2016; Singh et al., 2016; Lloret et al., 2018; Falavigna et al., 2019).
Another interesting observation is that samples harvested during endodormancy can be separated into two groups based on their transcriptional state: early endodormancy (October and November), and late endodormancy (from December to dormancy breaking). These two groups of samples form two distinct clusters in the PCA (Fig. 5) and are associated with different groups of expressed genes. These results indicate that endodormancy could potentially be separated into two periods: early and late endodormancy. However, we have to keep in mind that cold temperatures, below 10°C, only started at the end of November. It is thus difficult to discriminate between transcriptional changes associated with a difference in the bud stage during endodormancy, an effect of the pronounced change in temperatures, or a combination of both. Alternative experiments under controlled environments, similar to studies conducted on hybrid aspen for example (Ruttink et al., 2007), could improve our knowledge of the different levels of endodormancy.
We also show that we can accurately predict the different bud stages using expression levels for only seven marker genes (Fig. 7). This suggests that the definition of the different bud stages based on physiological observation is consistent with transcriptomic profiles. However, we could detect substantial discrepancies suggesting that the definition of the bud stages can be improved. Indeed, we observe that samples harvested from buds during phases that we defined as organogenesis and paradormancy cluster together in the PCA, but away from samples harvested during endodormancy. Moreover, most of the genes highly expressed during paradormancy are also highly expressed during organogenesis. This is further supported by the fact that paradormancy is the flower bud stage predicted with the least accuracy based on the expression levels of the seven marker genes. In detail, paradormancy is defined as a stage of growth inhibition originating from surrounding organs (Lang et al., 1987); it is therefore strongly dependent on the position of the buds within the tree and the branch. Our results suggest that defining paradormancy for multiple cherry flower buds based on transcriptomic data is difficult and even raise the question of whether paradormancy can be considered as a specific flower bud stage. Alternatively, we propose that the pre-dormancy period should rather be defined as a continuum between organogenesis, growth and/or growth cessation phases. Further physiological observations, including flower primordia developmental context (Fadón et al., 2015), could provide crucial information to precisely link the transcriptomic environment to these bud stages.
Highlight on main functions enriched during dormancy: organogenesis, response to cold, to ABA and to the circadian clock
We determined different functions and pathways enriched during flower bud organogenesis, paradormancy, endodormancy and ecodormancy. We notably observe an enrichment for GO terms involved in responses to abiotic and biotic stimuli, as well as an enrichment for targets of many TFs involved in the response to environmental factors. In particular, our results suggest that PavMYB14, which has a peak of expression in November just before the cold period starts, represses genes that are expressed during ecodormancy. This is in agreement with the fact that AtMYB14, the PavMYB14 homolog in Arabidopsis thaliana, is involved in cold stress response regulation (Chen et al., 2013). Although these results were not confirmed in Populus (Howe et al., 2015), two MYB DOMAIN PROTEIN genes (MYB4 and MYB14) were up-regulated during the induction phase of dormancy in grapevine (Fennell et al., 2015). Similarly, we identified an enrichment in target motifs for a transcription factor belonging to the C-REPEAT/DRE BINDING FACTOR 2/DEHYDRATION RESPONSE ELEMENT-BINDING PROTEIN (CBF/DREB) family in genes highly expressed during endodormancy. These TFs have previously been implicated in cold acclimation and endodormancy in several perennial species (Doǧramaci et al., 2010; Leida et al., 2012). These results are in agreement with the previous observation that genes responding to cold are differentially expressed during dormancy in other tree species (Ueno et al., 2013). Interestingly, we also identified an enrichment in targets for four TFs involved in ABA-dependent signalling. First, PavWRKY40 is mostly expressed during organogenesis, and its expression profile is very similar to that of its target genes. Several studies have highlighted a role for the Arabidopsis homolog of PavWRKY40 in ABA signalling, in relation to light transduction (Liu et al., 2013; Geilen & Böhmer, 2015) and biotic stresses (Pandey et al., 2010).
On the other hand, PavABI5 and PavABF2 are mainly expressed around the time of dormancy release, like their targets, and their homologs in Arabidopsis are involved in key ABA processes, especially during seed dormancy (Lopez-Molina et al., 2002). These results are further confirmed by the enrichment of GO terms related to the ABA pathway found in the genes highly expressed during endodormancy. Our observations suggest that genes potentially linked to ABA signalling are expressed either during organogenesis or during dormancy release. These results are supported by previous reports in which genes involved in ABA signalling are differentially expressed during dormancy in other tree species (Ruttink et al., 2007; Ueno et al., 2013; Zhong et al., 2013; Khalil-Ur-Rehman et al., 2017; Zhang et al., 2018). It has also been shown that genes involved in other phytohormone pathways, including auxin, ethylene, gibberellin and jasmonic acid, are differentially expressed between bud stages in other perennial species (Zhong et al., 2013; Khalil-Ur-Rehman et al., 2017). This is in agreement with our observation of an enrichment for GO terms for the response to jasmonic acid, and of targets of TFs involved in the response to ethylene, in genes specifically expressed at different flower bud stages.
In addition, we identified an enrichment of targets for PavRVE8 and PavRVE1 among the genes expressed around the time of dormancy release. These TFs are homologs of Arabidopsis MYB transcription factors involved in the circadian clock. In particular, AtRVE1 seems to integrate several signalling pathways including cold acclimation and auxin (Rawat et al., 2009; Meissner et al., 2013; Jiang et al., 2016) while AtRVE8 is involved in the regulation of the circadian clock by modulating the pattern of H3 acetylation (Farinas & Mas, 2011). Our findings that genes involved in the circadian clock are expressed and potentially regulate genes at the time of dormancy release are in agreement with previous work indicating a role of the circadian clock in dormancy in poplar (Ibáñez et al., 2010). To our knowledge, this is the first report on the transcriptional regulation of early stages of flower bud development. We highlighted the upregulation of several pathways linked to organogenesis during the summer months, including PavMYB63 and PavMYB93, expressed during early organogenesis, along with their targets, with potential roles in secondary wall formation (Zhou et al., 2009) and root development (Gibbs et al., 2014).
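As an illustration of what an enrichment of TF targets in a group of genes means, the sketch below computes a one-sided hypergeometric p-value. This is only an assumed, commonly used over-representation test; the text above does not state which test was applied, and the function name, parameters and gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, targets_in_population,
                                cluster_size, targets_in_cluster):
    """One-sided hypergeometric test (assumed here for illustration).

    Returns P(X >= targets_in_cluster) when cluster_size genes are drawn
    without replacement from a population containing
    targets_in_population predicted TF targets.
    """
    non_targets = population - targets_in_population
    upper = min(cluster_size, targets_in_population)
    total = comb(population, cluster_size)
    # Sum the probability mass of observing k or more targets in the cluster.
    tail = sum(
        comb(targets_in_population, k) * comb(non_targets, cluster_size - k)
        for k in range(targets_in_cluster, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 1000 expressed genes, 100 of them predicted targets
# of a TF, and a cluster of 50 genes containing 15 of those targets
# (about 5 would be expected by chance).
p_value = hypergeom_enrichment_pvalue(1000, 100, 50, 15)
print(f"enrichment p-value: {p_value:.2e}")
```

In practice such p-values would be corrected for multiple testing across all TFs and clusters tested, a step this sketch leaves out.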
Development of a diagnostic tool to define the flower bud dormancy stage using seven genes
We find that the sweet cherry flower bud stage can be accurately predicted from the expression of just seven genes. This indicates that combining expression profiles of just seven genes is enough to recapitulate all transcriptional states in our study. This is in agreement with previous work showing that transcriptomic states can be accurately predicted using a relatively low number of markers (Biswas et al., 2017). Interestingly, when there are discrepancies between the predicted bud stages and the ones defined by physiological observations, the model always predicts that stages happen earlier than the actual observations. For example, the model predicts that dormancy breaking occurs instead of endodormancy, or ecodormancy instead of dormancy breaking. This could suggest that transcriptional changes happen before we can observe physiological changes. This is indeed consistent with the indirect phenotyping method currently used, which is based on the observation of the response to growth-inducible conditions after ten days. Using these seven genes to predict the flower bud stage would thus potentially allow these important transitions to be identified when they actually happen.
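The idea of predicting a bud stage from a seven-gene expression profile, and of checking whether a mismatch places the bud ahead of the observed stage, can be sketched as follows. This is a minimal illustration using a nearest-centroid classifier, which is a stand-in and not the model developed in this study; all expression values are invented toy data, and the function names are hypothetical.

```python
from math import dist

# Bud stages in their order of progression through the season.
STAGES = ["organogenesis", "paradormancy", "endodormancy",
          "dormancy breaking", "ecodormancy"]


def fit_centroids(profiles, labels):
    """Average the expression profiles of each stage into one centroid."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for i, value in enumerate(profile):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}


def predict_stage(centroids, profile):
    """Return the stage whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))


def stage_shift(predicted, observed):
    """Positive when the predicted stage is further along than the observed one."""
    return STAGES.index(predicted) - STAGES.index(observed)


# Invented toy profiles: seven values per sample, one per marker gene.
BASE = {
    "organogenesis":     [5, 1, 1, 1, 1, 1, 1],
    "paradormancy":      [4, 3, 1, 1, 1, 1, 1],
    "endodormancy":      [1, 1, 5, 4, 1, 1, 1],
    "dormancy breaking": [1, 1, 2, 2, 5, 1, 1],
    "ecodormancy":       [1, 1, 1, 1, 2, 5, 4],
}
profiles, labels = [], []
for stage, base in BASE.items():
    for offset in (-0.2, 0.2):  # two noisy replicates per stage
        profiles.append([v + offset for v in base])
        labels.append(stage)

centroids = fit_centroids(profiles, labels)

# A bud scored as endodormant by forcing tests, but whose toy profile
# already resembles dormancy breaking.
predicted = predict_stage(centroids, [1, 1, 3, 3, 4, 1, 1])
shift = stage_shift(predicted, "endodormancy")
print(predicted, shift)
```

A positive shift corresponds to the situation described above, where the expression profile places the bud in the next stage before the physiological test does.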
We also show that the expression level of these seven genes, measured by RT-qPCR, can be used to predict the flower bud stage in other conditions. This independent experiment was also carried out over two consecutive years and shows that RT-qPCR for these seven marker genes as well as two control genes is enough to predict the flower bud stage in cherry trees. This shows that performing a full transcriptomic analysis is not necessary if the only aim is to define the dormancy stage of flower buds. This would offer an alternative to methods currently used, such as assessing the date of dormancy release by using forcing conditions. In addition, this result sets the stage for the development of a fast and cost-effective diagnostic tool to molecularly define the flower bud state in cherry trees. Such a diagnostic tool would be very valuable for researchers working on cherry trees as well as for plant growers, notably to define the best time for the application of dormancy breaking agents, whose efficiency highly depends on the state of dormancy progression.
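To make the role of the two control genes concrete, the sketch below normalises marker-gene Ct values against the mean Ct of the reference genes using the standard delta-Ct calculation. It assumes roughly 100% amplification efficiency, and it is not necessarily the exact normalisation used in this study; the gene labels and Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_references):
    """Delta-Ct relative expression: 2 ** -(Ct_target - mean reference Ct).

    Averaging the reference Ct values is equivalent to using the geometric
    mean of the reference gene quantities. Assumes ~100% PCR efficiency.
    """
    if not ct_references:
        raise ValueError("at least one reference gene Ct is required")
    mean_reference_ct = sum(ct_references) / len(ct_references)
    return 2.0 ** -(ct_target - mean_reference_ct)


def normalise_sample(ct_markers, ct_references):
    """Normalise every marker gene of one sample against the references."""
    return {gene: relative_expression(ct, ct_references)
            for gene, ct in ct_markers.items()}


# Hypothetical Ct values for one bud sample: three of the marker genes
# (placeholder labels) and the two control genes.
marker_cts = {"marker_1": 23.0, "marker_2": 21.0, "marker_3": 19.0}
control_cts = [20.0, 22.0]

normalised = normalise_sample(marker_cts, control_cts)
print(normalised)
```

The resulting normalised values for the seven markers would form the expression profile passed to a stage-prediction model.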
Author contributions
SC, BW, ED and PAW designed the original research. MA and JCY participated in the project design. NV performed the RNA-seq and analysed the RNA-seq with CS and BW. MF performed the RT-qPCR. JAC performed the TF and motif enrichment analysis. MT developed the model. NV, SC and BW wrote the article with the assistance of all the authors.
Supporting information
Graphical web interface DorPatterns: http://bwenden.shinyapps.io/DorPatterns
Fig. S1 Field temperature during the sampling season
Fig. S2 Separation of samples by dormancy stage using read counts for all genes
Fig. S3 Enrichments in gene ontology terms in the ten clusters
Fig. S4 Expression patterns for the transcription factors and their targets
Fig. S5 Separation of samples by dormancy stage and cultivar using all genes
Fig. S6 Clusters of expression patterns for differentially expressed genes in the sweet cherry cultivars ‘Regina’, ‘Cristobalina’ and ‘Garnet’
Fig. S7 Separation of samples by dormancy stage and cultivar using the seven marker genes
Table S1 Description of the flower bud samples used for RNA-seq and RT-qPCR
Table S2 RNA-seq mapped reads and gene count information
Table S3 ‘Garnet’ differentially expressed genes and their assigned clusters
Table S4 Transcription factors with motif enrichment in the clusters
Table S5 Model information for the different modelling assays corresponding to different training and testing sets
Acknowledgments
We thank the Fruit Experimental Unit of INRA (Bordeaux-France) for growing and managing the trees, and Teresa Barreneche, Lydie Fouilhaux, Jacques Joly, Hélène Christman and Rémi Beauvieux for the help during the harvest and for the pictures. Many thanks to Dr Varodom Charoensawan (Mahidol University, Thailand) for providing scripts for mapping and gene expression count extraction. The PhD of Noemie Vimont was supported by a CIFRE grant funded by the Roullier Group (St Malo-France) and ANRT (France).