Introduction

Almost all organisms across kingdoms have internal biological clocks mediating annual, seasonal and daily changes in the environment [21]. The circadian clock controls daily biological rhythms and confers fitness to an organism by synchronizing internal biology with that of the rhythmic external environment [6, 27]. The circadian clock is an endogenous self-sustaining timing mechanism with a period of approximately 24 h that can be entrained to the exact timing of daily environmental cycles over a range of physiologically relevant temperatures. In addition, in plants there are at least two circadian clocks that can be distinguished based on their ability to synchronize with light or temperature [25].

The circadian clocks of bacteria, fungi, plants, and animals are thought to have evolved independently, but all are comprised of negative feedback loops of transcription and regulated protein turnover [21]. The current model of the plant circadian clock has been worked out in Arabidopsis thaliana and consists of three interlocking feedback loops [19, 40]. Three protein families define the plant circadian clock with unique combinations of domains conserved across multiple species: single MYB transcription factors (sMYB); pseudo-response regulators with a CONSTANS domain (PRR/CCT); and PAS/FBOX/KELCH (PFK). In addition there are multiple proteins that play a role in the circadian clock, light signaling and flowering time: GIGANTEA (GI); EARLY FLOWERING 3 (ELF3); ELF4; TIME FOR COFFEE (TIC); TEJ; and casein kinase beta subunit 3 (CKB3) [21].

The two homologous morning-expressed single MYB transcription factors CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) [39] and LATE ELONGATED HYPOCOTYL (LHY) [35] repress the expression of the pseudo-response regulator TIMING OF CAB2 EXPRESSION 1 (TOC1) [37] and LUX ARRHYTHMO (LUX) by binding to evening elements (EE) in their promoters [13]. TOC1 and LUX levels increase toward the end of the day and directly or indirectly up-regulate expression of CCA1 and LHY [1, 13]. Again through binding of the EE, CCA1 and LHY activate the expression of their repressors, the pseudo-response regulators PRR7 and PRR9 [9]. In the third interlocked loop, CCA1, LHY, and TOC1 act to repress the likely activator of TOC1, GI [19].

Based on microarrays, 89% of the Arabidopsis transcriptome is expressed at different levels over the day depending on the environmental conditions and for any specific condition 15–30% of transcripts cycle under circadian conditions while 30–50% cycle under diurnal conditions [2, 5, 7, 22]. These numbers are consistent with an estimate that 35% of the transcriptome is circadian regulated based on enhancer trapping [24, 30]. Processes such as growth are controlled by time-of-day coordination of phytohormone expression pathways by the circadian clock and light signaling [26]. Furthermore, three time-of-day specific transcriptional modules were identified that are conserved across Arabidopsis, poplar and rice [22]. This latter finding suggests that daily environmental cycles have contributed significantly to shaping the fabric of the plant genome.

One of the driving questions of the current study was whether circadian timing would be conserved between Arabidopsis and papaya because, as a tropical plant, papaya primarily grows at latitudes with less seasonal variation in day length and temperature than Arabidopsis. Papaya has about half as many circadian clock and light signaling genes as Arabidopsis, yet shows an expansion of the COP1 gene family, which mediates degradation of light signaling proteins. This led to the hypothesis that papaya may spend less energy measuring time and more energy degrading proteins in direct response to changes in light [28]. This latter hypothesis would be consistent with the idea that synchronous flowering near the equator is governed by the perception of variation in sunrise and sunset [3]. In this study we address the question of whether the circadian clock is conserved in papaya, a predominantly tropical plant. We find that the transcriptional networks and expression are conserved in papaya. The results presented here suggest that circadian timing has played a major role in the evolution of plant genomes.

Results

Carica papaya Circadian Clock and Light Signaling Orthologs

The draft Carica papaya genome sequence provided an opportunity to investigate a tropical circadian clock at the molecular level [28]. As a first approximation of the gene content of papaya we used a protein mutual best-blast match strategy to identify putative orthologs with Arabidopsis (Material and Methods). We focused on this comparison as Arabidopsis circadian clock research provides the most extensive information at the molecular and genetic level. Putative papaya-rice, papaya-poplar and papaya-sorghum orthologs were identified and included in a searchable database called ORTHOMAP (http://orthomap.cgrb.oregonstate.edu/).

Many circadian clock and light signaling gene families are smaller in papaya compared to Arabidopsis, rice and poplar, which is consistent with the lack of genome duplication in papaya [28]. For instance, there is only one homolog in papaya of the PAS-PAC/FBOX/KELCH (PFK) family gene ZTL compared to three in Arabidopsis (ZTL, FKF1 and LKP2; Table 1). In addition, there is only one homolog of the single MYB transcription factor LHY compared to two paralogs in Arabidopsis (LHY and CCA1; Table 1).

Table 1 Carica papaya light and circadian orthologs

In contrast, papaya has five pseudo-response regulators (PRRs), just like Arabidopsis [28]. Since most circadian gene families are reduced in papaya compared to Arabidopsis and the PRRs were not, we took a closer look at the PRR gene family in papaya. Similar to the situation in rice, which also has five PRR proteins [33], the papaya PRRs can be separated into three groups: PRR1/TOC1, PRR5/9 and PRR7/3 (Fig. 1). As in Arabidopsis and rice, there is only one PRR1/TOC1 gene in papaya, which we designated CpPRR1. To date, fully sequenced plant genomes contain only one copy of the PRR1/TOC1 gene, if any (T.P. Michael, unpublished observations). In papaya, neither PRR9 nor PRR3 had mutual best-blast orthologs based on our criteria (Material and Methods). In contrast, both PRR5 and PRR7 had mutual best-blast matches in addition to closely related homologues (one-way blast), which also clustered with PRR9 and PRR3 based on multiple alignment (Fig. 1). In rice, due to this close relationship and the inability to separate the two PRR5/PRR9 and PRR7/PRR3 paralogs, they were named OsPRR5/9, OsPRR9/5, OsPRR7/3 and OsPRR3/7 [32]. Based on our mutual best-blast criteria we chose to name the papaya PRR genes CpPRR5A, CpPRR5B, CpPRR7A and CpPRR7B, where A and B represent paralogous proteins. Regardless of our naming strategy, our results are consistent with a trend in the expansion of the PRR gene family across species where three clades emerged and specific members in at least two clades, PRR5/9 and PRR7/3, expanded.

Fig. 1

Three PRR gene branches in the papaya PRR family: PRR1, PRR5 (A and B) and PRR7 (A and B). Arabidopsis thaliana (At), Populus trichocarpa (Pt), Oryza sativa (Os) and Carica papaya (Cp) PRR proteins were aligned with ClustalX and the tree was constructed with TreeView

Conserved Time-of-day Cis-Acting Elements

In Arabidopsis there are three cis-acting modules controlling time of day expression: the morning module, morning element (ME, CCACAC)/Gbox (CACGTG); the evening module, evening element (EE, AAATATCT)/GATA (GATA); and the midnight module, telobox (TBX, AAACCCT)/starch synthesis box (SBX, AAGCCC)/ protein box (PBX, ATGCCC) [22]. These three modules are also conserved across divergent species such as poplar and rice, suggesting that time-of-day signaling has specifically shaped the evolution of transcriptional networks in higher plants, and possibly photosynthetic organisms in general [22].

To address whether these time-of-day cis-acting modules are conserved in papaya, we assigned the phase of expression from the Arabidopsis orthologs to papaya and searched for time-of-day specific overrepresented elements in the promoters of the papaya orthologs [22]. We utilized the phase of Arabidopsis genes from eight diurnal and circadian conditions. Every 3–8 bp word from 500 bp of papaya promoters was queried for overrepresentation in each of the phase-specific gene lists using the ELEMENT promoter-searching tool [30]. For each word, plotting the Z-score at each phase over the day generated a Z-score profile. Only words with Z-scores that were significant at more than two consecutive phases over the day were retained in the analysis (Material and Methods). Across eight diurnal and circadian conditions there were between 250 and 578 significant words (Fig. 2a). We clustered significant 3–8mer words based on the time-of-day (hrs from lights-on/subjective dawn) at which the Z-score was most significant (highest peak), i.e. the words were grouped by the time-of-day that they were most overrepresented and presumably active. We found a similar number of words at each time of the day (phase), except in two conditions where we found more words later in the night (Fig. 2b).
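The word-scoring step can be illustrated with a minimal Python sketch. The text does not specify the exact statistic ELEMENT computes, so the binomial approximation below (comparing the number of phase-list promoters containing a word with the genome-wide rate) is an assumption made for illustration, as are the function names `kmers` and `word_z_scores`.

```python
import math

def kmers(seq, k_min=3, k_max=8):
    """Return the set of distinct words of length k_min..k_max found in seq."""
    seq = seq.upper()
    words = set()
    for k in range(k_min, k_max + 1):
        for i in range(len(seq) - k + 1):
            words.add(seq[i:i + k])
    return words

def word_z_scores(phase_promoters, all_promoters):
    """Z-score per word for one phase list (assumed binomial approximation).

    A word is counted once per promoter. The expected count is the
    genome-wide fraction of promoters containing the word multiplied by
    the size of the phase list.
    """
    n = len(phase_promoters)
    total = len(all_promoters)
    background = {}
    for seq in all_promoters:
        for word in kmers(seq):
            background[word] = background.get(word, 0) + 1
    observed = {}
    for seq in phase_promoters:
        for word in kmers(seq):
            observed[word] = observed.get(word, 0) + 1
    scores = {}
    for word, obs in observed.items():
        p = background.get(word, 0) / total
        sd = math.sqrt(n * p * (1 - p))
        if sd > 0:  # words present in every promoter carry no information
            scores[word] = (obs - n * p) / sd
    return scores
```

Calling `word_z_scores` once per phase list (0 to 23 h) and collecting the scores for a given word in phase order would give that word's Z-score profile.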

Fig. 2

Time-of-day words (3–8mers) identified in papaya. a Number of time-of-day specific words identified per condition in papaya promoters. Best-blast orthologs were identified between Arabidopsis-papaya, the phase of Arabidopsis genes was assigned to the papaya ortholog, Z-scores were calculated for every 3–8mer from 500 bp of the papaya promoter by looking for enrichment over observed in a similar sample size from the genome, Z-scores were plotted over the day and only words with two consecutive Z-scores above the threshold (~3, P < 0.05) were retained. Between 200 and 600 words were retained across the eight conditions tested. Conditions are described [22]; LDHH: 12 h Light (L)-12 h dark (D) and continuous temperature (HH); LDHC: 12 h Light (L)-12 h dark (D) and 12 h hot (H)- 12 h cold (C); LLHC: continuous light (LL) and 12 h H-12 h C; long day: 16 h light-8 h dark and continuous temperature; short day: 8 h light-16 h dark and continuous temperature; LL_LDHH, LL_LDHC and LL_LLHC are sampled under continuous light (LL) and grown under the specified condition. b Words arranged by their time-of-day of overrepresentation (phase). Words were arranged by the time-of-day for two or more consecutive Z-scores that were greater than the Z-score threshold (~3, P < 0.05). Two conditions LDHC (white) and LLHC (black) are plotted as representative of the eight conditions tested

We took a closer look at the words that were overrepresented around midnight to early morning. We found that many (30–55%) of the words could be summarized into two elements, the ME and TBX (Fig. 3), which we have previously identified in Arabidopsis, poplar and rice [22]. Some of the remaining words are similar to words that we have found previously in Arabidopsis (discussed below), while others are specific to papaya and may represent novel papaya-specific elements. In particular, we noted “AT”-rich words that seem to be specific to papaya. These words could represent a new class of time-of-day specific elements, or could be an artifact of our in silico analysis. More experimentation in papaya will be required to resolve these possibilities.

Fig. 3

The morning element (ME) and telobox (TBX) are predicted to be active at dawn and midnight, respectively. Words that were overrepresented under the condition LDHC with the same phase were grouped by sequence similarity. a The morning element (ME: CCACAC) was overrepresented at dawn. Multiple words CAC, CCAC, GCCAC, and CGCCAC were summarized as having dawn-specific overrepresentation. b The telobox (TBX: AAACCCT) was overrepresented around midnight. Multiple words AACCC, ACCCT, AAACCC, CCCTA, AACCCT, AAACCCT and ACCCTA were summarized as having midnight overrepresentation. Z-score threshold ~3, P < 0.05

The words that make up the ME were overrepresented late at night and at the beginning of the day: phases 22, 23, 24/0, 1 and 2. In contrast, the words that make up the TBX were overrepresented around midnight, phases 17, 18, 19 and 20. When we grouped overrepresented words over the day by phase, and then aligned the words based on sequence similarity, we were able to summarize them into consensus words or elements. We think of the summary element as related to a transcription factor binding site, or cis-element. We then plotted out the summarized elements, grouped them by time-of-day of peak Z-score significance and compared their Z-score profiles to those obtained for Arabidopsis, rice and poplar (Fig. 4, Material and Methods). Consistent with our previous results [22], the pattern of overrepresentation (Z-score profile) for the TBX, Gbox, ME, and EE shared consistent time-of-day Z-score peaks across these distantly related species. In addition, we found that the conservation of the Z-score profile was highly specific across conditions. For instance, in Arabidopsis we found that the phase of overrepresentation for the TBX is dependent on condition [22]. Similar to Arabidopsis, in papaya we found that under conditions without temperature cycles, the TBX is overrepresented before dusk, while under any condition that includes a temperature cycle the TBX is overrepresented before dawn (Fig. 5). This finding is consistent with circadian transcriptional networks being highly conserved across these distantly related species.
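The summarizing of words into elements can be sketched as follows. This is a simplified illustration rather than the procedure actually used: it assumes each word is compared to a known element (or its reverse complement) by ungapped, mismatch-free overlap, and the function names and the `min_overlap` cutoff are hypothetical.

```python
ELEMENTS = {"ME": "CCACAC", "TBX": "AAACCCT"}  # sequences as given in the text

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def ungapped_overlap(word, element):
    """Longest mismatch-free overlap between word and element over all offsets."""
    best = 0
    for offset in range(-(len(word) - 1), len(element)):
        start = max(0, offset)
        end = min(len(element), offset + len(word))
        if all(element[i] == word[i - offset] for i in range(start, end)):
            best = max(best, end - start)
    return best

def assign_element(word, elements=ELEMENTS, min_overlap=3):
    """Name of the element sharing the longest overlap with word, or None."""
    best_name, best_len = None, 0
    for name, motif in elements.items():
        for strand in (motif, reverse_complement(motif)):
            overlap = ungapped_overlap(word.upper(), strand)
            if overlap > best_len:
                best_name, best_len = name, overlap
    return best_name if best_len >= min_overlap else None
```

Under these assumptions, the dawn words listed in the Fig. 3 legend (CAC, CCAC, GCCAC, CGCCAC) are assigned to the ME and the midnight words (for example AACCC, CCCTA and ACCCTA) to the TBX.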

Fig. 4

Time-of-day overrepresentation of circadian cis-elements is conserved in papaya, Arabidopsis, poplar and rice. Z-score profiles were summarized into elements for words sharing both sequence similarity and time-of-day overrepresentation. Z-score profiles for the GBOX (blue, CACGTG), TBX (orange, AAACCCT), EE (black, AAATATCT) and ME (yellow, CCACAC) were conserved across papaya a, Arabidopsis b, poplar c and rice d. Z-score threshold ~3, P < 0.05. The reverse complement of each element is presented in the legend

Fig. 5

The papaya TBX displays a condition dependent Z-score profile shift. The papaya TBX (AAACCCT) Z-score profile displays a distinct time-of-day overrepresentation depending on the Arabidopsis condition used to assign phase to papaya. a LLHC (dotted line) and LDHH (solid line); b LDHC (dotted line) and short day (solid line); Z-score threshold (thin dotted line, ~3, P < 0.05). Conditions explained in Fig. 2 legend

Predicted Circadian Clock Genes Cycle Under Intermediate Day Conditions in C. papaya

One assumption underpinning comparative genome analysis is that orthologous genes behave in a similar way across species. To date, the circadian clock of Arabidopsis is the best described in higher plants. Multiple groups have studied the expression of circadian clock orthologs to determine if the timing of expression is conserved across species [4, 8, 14, 18, 29, 32-34]. In almost all cases, the phase of expression is conserved across species (Table 2).

Table 2 Time-of-day (phase) expression of circadian clock orthologs across species

To test this in papaya, we performed two independent 48-hour time courses in mature papaya trees. We collected young emerging leaves and measured expression by quantitative real-time PCR (qPCR). We found that all of the predicted circadian clock orthologs cycle with the same phase in papaya as in Arabidopsis under both diurnal and circadian conditions (Table 2; Fig. 6, Fig. S1). CpLHY peaks in the morning while CpTOC1 peaks in the evening under both diurnal and circadian conditions (Fig. 6, Fig. S1). These results are consistent with our findings that the transcriptional networks are conserved between papaya and Arabidopsis, and also consistent with the idea that time-of-day networks may be conserved across higher plants. In addition, we tested our Arabidopsis-papaya ortholog phase predictions on several randomly selected genes and found that the phase of expression was the same in papaya as in Arabidopsis (Fig. S2). Together with the in silico promoter analysis, these findings support global conservation of time-of-day expression in papaya.

Fig. 6

Papaya circadian clock genes cycle with the same phase as Arabidopsis genes. Papaya circadian clock orthologs were used to design quantitative real-time PCR (qPCR) primers to detect relative transcript levels. New leaves from three-month-old trees were sampled under diurnal conditions every four hours over two days. a Diurnal expression of AtTOC1/PRR1 (solid black line) and AtLHY (dotted black line). b Diurnal expression of CpTOC1/PRR1 (solid black line) and CpLHY (dotted black line). At, Arabidopsis thaliana and Cp, Carica papaya. Data represent two independent biological replicates. Bars at top represent diurnal cycle (light, white box and dark, black box)

However, in rice, the phasing of some of the PRR family members is distinct from their Arabidopsis orthologs. Whereas AtPRR9 peaks at dawn and AtPRR5 peaks in the late afternoon, OsPRR95 and OsPRR59 peak at the same time in the evening. We found that CpPRR5A and CpPRR5B, similar to AtPRR5 and AtPRR9 respectively, displayed distinct time-of-day expression (Fig. 7a). CpPRR5A peaked in the late afternoon/dusk, similar to AtPRR5, while CpPRR5B peaked in the morning, similar to AtPRR9. The expression patterns of the papaya PRR5 genes are consistent with the multiple alignment in Fig. 1, which suggests that CpPRR5B is more closely related to AtPRR9.

Fig. 7

Papaya PRR5 promoter structure evolution predicts evening and morning expression for CpPRR5A and CpPRR5B, respectively. a Expression of CpPRR5A (solid black line) and CpPRR5B (dotted black line). b CpPRR5A and CpPRR5B promoter structure reveals similar spacing and expansion of known elements EE (blue), ME (red square), TGA (green triangle), Gbox (pink) and CBS (yellow square). ATG represented by arrow

We were very interested in the phasing of the CpPRR5 paralogs, so we looked more closely at the CpPRR5 promoter regions. We compared 500 bp from CpPRR5A and CpPRR5B and found a concentration of circadian response elements (EE/GATA, ME/Gbox) and the TGA element (TGACGTGG), which is found in multiple copies in both Arabidopsis [24] and poplar PRR promoters (T.P. Michael, unpublished observation). Both promoter regions were highly similar for circadian response element position and number, with two exceptions (Fig. 7b). We found that there was an expansion of an EE cluster in the CpPRR5A promoter and an ME expansion in the CpPRR5B promoter. Based on our promoter analysis here and in published work [22, 24], and on empirical in vivo promoter studies [12, 23], we would predict that an increased number of EE or ME would confer evening-specific or morning-specific expression, respectively. Therefore, the difference in timing of expression between CpPRR5A and CpPRR5B is consistent with their promoter composition. Moreover, we found it striking that the expansions of the EE and ME occurred without much disruption to the linear arrangement of elements in the papaya PRR5 promoters. This suggested to us that the papaya PRR promoters may have diverged very recently and that there was some selective pressure to maintain the expression difference between these two genes. We hypothesize that in addition to the post-translational regulation reported for the members of the PRR family [10, 11, 15], time-of-day expression of the PRR family may represent another important layer of biological regulation, which is a substrate for selective pressure.
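A promoter comparison of this kind can be sketched in Python by locating each known element on both strands of a promoter sequence. The motif sequences are those given in the text; the function names and the example sequence in the usage note are hypothetical, and this sketch is not the tool used to produce Fig. 7b.

```python
MOTIFS = {
    "EE": "AAATATCT",
    "ME": "CCACAC",
    "Gbox": "CACGTG",
    "TGA": "TGACGTGG",
}

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(promoter, motif):
    """Sorted start positions (0-based) of motif on either strand.

    Using sets avoids counting palindromic motifs such as the Gbox twice.
    """
    promoter = promoter.upper()
    targets = {motif.upper(), reverse_complement(motif)}
    sites = set()
    for target in targets:
        start = promoter.find(target)
        while start != -1:
            sites.add(start)
            start = promoter.find(target, start + 1)  # allow overlapping hits
    return sorted(sites)

def element_counts(promoter, motifs=MOTIFS):
    """Number of sites per element in one promoter."""
    return {name: len(find_sites(promoter, motif)) for name, motif in motifs.items()}
```

For example, `element_counts("GGAAATATCTTTAGATATTTCACGTG")` finds two EE sites (one per strand) and one Gbox. Comparing such counts and positions between two 500 bp promoters would highlight an expansion of one element class in either promoter.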

Discussion

In this study we utilized the draft papaya genome sequence to elucidate conservation of the circadian clock across distantly related plant species. We established a searchable database of papaya orthologs between Arabidopsis, poplar and rice and utilized this database to identify papaya circadian clock orthologs. We suggest that the PRR gene family in papaya reflects a recent gene family expansion compared with that in Arabidopsis. This was further verified with the functional diversification we found in the CpPRR5 promoters where time-of-day specific elements have expanded to specify different phases of expression. We defined time-of-day cis-elements using a novel in silico promoter searching technique and demonstrated that time-of-day elements are conserved between papaya, Arabidopsis, poplar, and rice, consistent with selection acting directly on time-of-day activities. Finally, we demonstrate that circadian expression timing as well as sequence is conserved in circadian clock genes in papaya.

Almost universally across species the circadian clock orthologs of CCA1/LHY and TOC1/PRR1 peak in the morning and evening, respectively [4, 8, 14, 18, 29, 32–34]. Papaya is no exception, and many genes cycle with phases similar to those in other species, suggesting that much of the basic circadian machinery in papaya is conserved. Differences between species do exist: both CpZTL and McZTL cycle with peak gene expression in the evening [4] (Table 2). In contrast, in Arabidopsis ZTL protein abundance cycles while its transcript does not [16, 36]. Considering the importance of ZTL in the circadian system [17], it will be interesting to see how ZTL expression impacts its functions in these other plant species.

The expression differences that we observed between the PRR family of genes in papaya, Arabidopsis and rice suggest that these expression differences contribute to functionality. In contrast to PRR59 and PRR95, which have the same phase of expression in both duckweed and rice [29, 33], the paralogs in both papaya and Arabidopsis display distinct phases of expression at midday and dusk. Similarly, the gene expression of the PRR7/3 paralogs in papaya and Arabidopsis displays distinct phasing. This could represent a distinct functionality between monocots and dicots. In monocots, PRR7 plays a prominent role in flowering time and has been identified in two different QTL studies with barley and rice [32, 38]. In contrast, QTL or induced mutants in AtPRR7 result in modest changes in the circadian clock [9, 27]. The fact that the CpPRR5A and CpPRR5B promoters are similar in their linear arrangement of elements, and that increased numbers of specific cis-elements correlate with their distinct time-of-day expression, suggests that there must be some pressure to cause the expression of these genes to diverge from that found in the more basal monocots. This finding suggests that despite the importance of post-translational modification of the PRR proteins [10, 11, 15], there must be evolutionary pressure on the temporal expression of these genes.

Consistent with the idea that there is evolutionary pressure at the level of temporal expression, we have identified conserved cis-acting elements across papaya, Arabidopsis, rice and poplar using an in silico approach. These results extend our previous results between Arabidopsis, rice, and poplar [22], further confirming the power of this technique to identify conserved non-coding sequence between species and elucidate time-of-day transcriptional networks across plants. Recently we have verified these cis-elements empirically in both rice and poplar (T.C. Mockler and T.P. Michael, unpublished data). The fact that the in silico technique works between distantly related species with limited genomic colinearity suggests that there are groups of genes whose time-of-day co-expression is essential to plant fitness. Therefore, the genes that fall out of these co-expression clusters may represent novel activities in these species, providing a platform for identifying diverging classes of genes between species. Time-of-day expression profiling facilitates the annotation of non-coding sequence and identification of novel functional gene clusters in newly sequenced plant species.

It was somewhat surprising that the time-of-day networks were conserved in papaya considering its tropical habitat and distinct life-style from Arabidopsis, rice and poplar. Yet it may be that the mechanism controlling synchronous flowering in trees at the equator is more of a circadian regulated system [3], and that the findings in temperate plants are broadly applicable. This conservation across distantly related plant species suggests that we could generate general principles that would apply to a host of plants for altering plant growth pathways for specific environments. For instance, one of these conserved time-of-day networks coordinates the growth-promoting expression of phytohormones to coincide with the important environmental signal of the rising sun at dawn [26]. Recently, a forward genetic screen in the green alga Chlamydomonas reinhardtii revealed mutants in key plant circadian clock homologues, which cycle with a phase of expression similar to that in Arabidopsis [20]. If time-of-day pathways are conserved across dicots, monocots, and single-celled algae, this could provide new opportunities to engineer generalized strategies to control growth in an environment-specific fashion from algae to higher plants.

Methods

Papaya Growth Conditions

Carica papaya transgenic variety ‘SunUp’ seeds were germinated and grown under intermediate days (12 h of light and 12 h dark, 12L12D) and continuous temperature (22°C) for three months to maturity. For the time courses, half of the trees were moved to a continuous light (LL) chamber at time 0 (T = 0 h), lights on. The first leaf tissue collection began at T = 24 h under continuous light and T = 0 h under light/dark cycles. Tissue was collected every 4 h for two days under both circadian and diurnal conditions. Samples were immediately frozen in liquid nitrogen and stored at −80°C prior to RNA extraction.

Quantitative Real-time PCR

Quantitative real-time PCR (qPCR) was carried out as described [31]. Briefly, frozen papaya tissue was ground in 2 ml tubes with ball bearings. RNA was extracted using RNeasy (QIAGEN) with on-column DNase treatment; first-strand cDNA was synthesized (Invitrogen, Carlsbad, CA) and used directly for qPCR assays on a myIQ (BioRad). Expression values were calculated as a function of CT values normalized to a standard dilution series over all samples assayed. Papaya primer sequences are described in Table S1.
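One common way to convert CT values into relative expression using a standard dilution series is a standard-curve fit, sketched below in Python. The details of the calculation are given in [31] and are not reproduced here, so this sketch is an assumption about the general approach; the function names and the example numbers in the usage note are hypothetical.

```python
import math

def fit_standard_curve(dilutions, ct_values):
    """Least-squares fit of CT = slope * log10(relative quantity) + intercept."""
    xs = [math.log10(d) for d in dilutions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def relative_quantity(ct, slope, intercept):
    """Invert the standard curve to obtain the relative quantity for a sample CT."""
    return 10 ** ((ct - intercept) / slope)
```

With a hypothetical ten-fold dilution series (1, 0.1, 0.01, 0.001) giving CT values of 20.00, 23.32, 26.64 and 29.96, a sample with a CT of 23.32 maps back to a relative quantity of about 0.1.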

Arabidopsis-Papaya Circadian Clock Orthologs

Arabidopsis-papaya orthologs were identified using a mutual best-blast hit strategy as described for Arabidopsis-rice and Arabidopsis-poplar [22]. In brief, papaya protein Y is blasted against all Arabidopsis proteins, yielding protein X as its best-blast match; protein X is then blasted against all papaya proteins, yielding protein Y as its best-blast match. In this case, the two proteins are called best-blast matches (BBM) and referred to as putative orthologs. We further imposed a filter requiring all blast matches to have an E-value less than 1e-5, to reduce spurious poor (but still mutual best) blast matches. An Arabidopsis-papaya ortholog pair therefore represents two proteins that are mutual best-blast matches. Arabidopsis-papaya, poplar-papaya, rice-papaya and sorghum-papaya orthologs can be searched using our online tool called ORTHOMAP (http://orthomap.cgrb.oregonstate.edu/).
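The mutual best-blast logic can be illustrated with a short Python sketch. It assumes the hits have already been parsed into (query, subject, E-value) tuples, that the "best" match is the one with the lowest E-value, and that the 1e-5 cutoff is an E-value; the function names and the protein identifiers in the usage note are hypothetical.

```python
def best_hits(hits, max_evalue=1e-5):
    """Map each query to its lowest-E-value subject among hits passing the cutoff.

    hits: iterable of (query_id, subject_id, evalue) tuples.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue < max_evalue and (query not in best or evalue < best[query][1]):
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}

def mutual_best_matches(papaya_vs_arabidopsis, arabidopsis_vs_papaya, max_evalue=1e-5):
    """Return sorted (papaya_id, arabidopsis_id) pairs that are each other's best hit."""
    forward = best_hits(papaya_vs_arabidopsis, max_evalue)
    reverse = best_hits(arabidopsis_vs_papaya, max_evalue)
    return sorted(
        (papaya_id, arabidopsis_id)
        for papaya_id, arabidopsis_id in forward.items()
        if reverse.get(arabidopsis_id) == papaya_id
    )
```

For instance, if papaya protein `cpY` has best hit `atX` and `atX` in turn has best hit `cpY`, the pair (`cpY`, `atX`) is returned; a papaya protein whose best Arabidopsis hit prefers a different papaya protein is dropped.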

Papaya In Silico Promoter Analysis

Papaya in silico promoter analysis was carried out as described for poplar and rice [22]. In brief, the phase of the best-blast Arabidopsis ortholog was assigned to its papaya ortholog, and papaya gene lists for each phase of the day (0–23 hrs) were assembled. Each phase gene list contained hundreds of papaya genes and served as the input for promoter searching using ELEMENT (http://element.cgrb.oregonstate.edu) [30]. ELEMENT stores 500, 1000, 2000 and 3000 bp upstream of the predicted ATG from each papaya gene as putative promoter sequence; for this study we used 500 bp as the papaya promoter length. Using the papaya in silico phase gene lists, ELEMENT was used to assign an overrepresentation score (Z-score) to each 3–8mer in the papaya promoters; there are a total of 43,847 3–8mers. Every 3–8mer was assigned a Z-score for every phase of the day, and the Z-scores were plotted as a function of time. We refer to the resulting graph as a “Z-score profile”; one Z-score profile was generated for each time course. The significance level of the Z-score profile was established as described [22]. To adjust for multiple testing, we applied the Benjamini & Hochberg method to the one-tailed p-values corresponding to the observed Z-scores. This allowed us to establish a Z-score threshold based on the equivalent corrected p-value. Only Z-score profiles with Z-scores greater than the threshold at more than two consecutive times over the day were retained in the analysis. The rationale for filtering Z-score profiles in this last step is that if an element were active based on its overrepresentation, then it would be overrepresented in adjacent phases of the day. Each 3–8mer was then assigned a phase value (time in hrs from lights on or subjective dawn) based on the time of day at which adjacent Z-scores were significant. The Z-score profile phase was then used to cluster similar Z-score profiles and to compare between species.
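The thresholding and filtering steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than the ELEMENT implementation: one-tailed p-values are taken from the standard normal distribution, the 24 phase bins are treated as circular, "more than two consecutive times" is read as a run of at least three (`min_run=3`), and the phase is simplified to the bin with the highest Z-score. All function names are hypothetical.

```python
import math

def one_tailed_p(z):
    """Upper-tail p-value of a standard normal Z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def bh_z_threshold(z_scores, alpha=0.05):
    """Smallest Z-score still significant after Benjamini-Hochberg correction.

    Returns None if no Z-score passes.
    """
    ranked = sorted(z_scores, reverse=True)  # largest Z has the smallest p-value
    m = len(ranked)
    threshold = None
    for rank, z in enumerate(ranked, start=1):
        if one_tailed_p(z) <= alpha * rank / m:
            threshold = z  # keep scanning: BH uses the largest passing rank
    return threshold

def longest_circular_run(flags):
    """Length of the longest run of True values, wrapping around the day."""
    n = len(flags)
    if all(flags):
        return n
    best = run = 0
    for flag in flags + flags:  # doubling the list handles runs crossing phase 0
        run = run + 1 if flag else 0
        best = max(best, run)
    return min(best, n)

def assign_phase(profile, threshold, min_run=3):
    """Phase (hrs) of the peak Z-score if the profile passes the run filter, else None."""
    flags = [z >= threshold for z in profile]
    if longest_circular_run(flags) < min_run:
        return None
    # Simplification: the peak is taken over the whole profile.
    return max(range(len(profile)), key=lambda hour: profile[hour])
```

A profile that is significant at phases 22, 23 and 0 with its maximum at phase 23 would be retained and assigned phase 23, whereas a profile with a single isolated significant phase would be discarded.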

Papaya Sequence Used In This Study

Carica papaya genome sequence, promoter sequence and predicted proteins can be found at our web site: http://diurnal-files.cgrb.oregonstate.edu/papaya_sequence/