ABSTRACT
To what extent can ancestral function be deduced from the current roles of duplicated genes? Insect Hox3/zen genes represent an evolutionary hotspot, with single orthologues required either for early specification or late morphogenesis of the extraembryonic tissues. The zen paralogues of the beetle Tribolium castaneum present a unique opportunity to investigate both functions in a single species. We show that despite high sequence similarity the paralogues’ expression dynamics (transcript and protein) and transcriptional targets (RNA-seq after RNAi) are non-redundant. Rather, we find that Tc-Zen2 represses Tc-zen1, producing an evolutionarily novel negative feedback loop that confers a high level of temporal precision to early specification. For late morphogenesis, our molecular profiling of Tc-Zen2 also uncovers a transcriptional delay and partial recovery that underpins the spectrum of morphological phenotypes. Altogether, our molecular dissection reveals that complementary roles and mutual regulation reinforce paralogue retention, implying an evolutionary scenario of iterative subfunctionalization.
INTRODUCTION
Change over macroevolutionary time scales can produce new gene functions. The Hox3/zen genes of insects represent a case in point. Across the bilaterian animals, Hox genes are conserved in genomic organization, expression, and function, with roles in tissue specification along the anterior-posterior body axis of the developing embryo (Krumlauf 1992). In contrast, the Hox3 genes in winged insects, known as zen, are prone to genomic microinversions (Negre and Ruiz 2007; McKenna, et al. 2016; Armisen, et al. 2018), and they are required in the novel tissue domain of the extraembryonic membranes (EEMs) (Panfilio 2008).
The EEMs, simple (monolayer) epithelia, are an evolutionary innovation to protect the developing insect. Initially they surround the early embryo, forming a multilayered barrier from the environment, with an outer serosa and inner amnion (Panfilio 2008). In particular, the serosa is capable of innate immune responses (Chen, et al. 2000; Jacobs, et al. 2014) and it secretes a thick chitin-based cuticle that mechanically reinforces the eggshell and provides desiccation resistance (Rezende, et al. 2008; Jacobs, et al. 2013; Farnesi, et al. 2015). However, in later development the EEMs must actively withdraw to ensure correct closure of the body (Panfilio, et al. 2013; Hilbrant, et al. 2016).
Functional studies have identified roles for zen in either early EEM specification or late EEM withdrawal (reviewed in (Horn, et al. 2015)). Although the Hox3 locus is prone to lineage-specific duplications (Panfilio and Akam 2007; Ferguson, et al. 2014), to date a single EEM function – specification or morphogenesis (tissue remodeling for withdrawal) – is known per species in bugs and flies (Wakimoto, et al. 1984; Panfilio, et al. 2006; Rafiqi, et al. 2008). This is even true in the derived case of the fruit fly Drosophila melanogaster, which has three functionally distinct paralogues: zen itself is involved in EEM specification, the duplicate z2 is not required for embryogenesis, and the dipteran-specific bicoid has become a maternal determinant with no extraembryonic role (Pultz, et al. 1988; Rushlow and Levine 1990; Stauber, et al. 1999; McGregor 2005; Rafiqi, et al. 2008). Furthermore, secondary tissue simplification of the EEMs in Drosophila obviated the requirement for the late withdrawal function (Horn, et al. 2015). Thus, the original role of zen within the extraembryonic domain has been obscured by ongoing evolutionary changes in both zen and the EEMs.
There is a notable exception to the pattern of a single EEM role of zen per species. In the red flour beetle, Tribolium castaneum, zen has undergone a tandem duplication. Tc-zen1 was first cloned from cDNA (Falciani, et al. 1996), while Tc-zen2 was later identified by sequencing the Hox cluster directly (Brown, et al. 2002). The paralogues are striking for their compact, shared gene structure and for their proximity: within the 58-kb region between Hox2/mxp and Hox4/Dfd, the paralogues occupy a <3-kb interval, with only 216 bp between the 3′ UTR of Tc-zen1 and the initiation codon of Tc-zen2 (Brown, et al. 2002). Nonetheless, subsequent functional diversification has equipped the paralogues with either of the two known EEM functions: early-acting Tc-zen1 specifies the serosal tissue, while Tc-zen2 is required for late EEM withdrawal morphogenesis (van der Zee, et al. 2005; Hilbrant, et al. 2016). We thus asked to what extent a detailed molecular characterization of the beetle paralogues could elucidate the evolutionary history of changes between the specification and morphogenesis functions of zen orthologues.
Here we present differences in the regulation of Tc-zen1 and Tc-zen2 as well as in their own transcriptional signatures as homeodomain transcription factors. Surprisingly, peak expression does not coincide with the time of primary function – detectable morphologically and transcriptionally – for Tc-zen2, which despite its late role has strong early expression like Tc-zen1. Yet, instead of a lack of function or shadow redundancy to Tc-Zen1, we uncover a distinct early role of Tc-Zen2 in the regulation of a key subset of genes. The RNA-seq data also reveal subtle aspects of temporal variability (heterochrony) after Tc-zen2 RNAi that affect late morphogenesis. Our validation of specific transcriptional targets opens new avenues into serosal tissue biology and identifies a novel, paralogue-based regulatory circuit at the developmental transition from specification to maturation of the serosa. This now raises the question of how species with a single zen gene compare for the precision and progression of EEM development, and whether their molecular phenotypes support early Tc-Zen2 function as an instance of both neofunctionalization and subfunctionalization events.
RESULTS
Recent tandem duplication of zen in the Tribolium lineage
We first surveyed Tribolium beetle genomes to assess sequence conservation at the Hox3 locus. Using the T. castaneum paralogues as BLASTn queries, we find that the tandem duplication of zen is conserved across three closely related congenerics: T. freemani, T. madens, and T. confusum (Fig. 1A, 14-61 million years divergence (Angelini and Jockusch 2008)). Consistent with a recent event, phylogenetic analysis supports a single duplication at the base of the Tribolium lineage, and sequence alignments show particularly strong conservation in the homeobox, encoding the DNA-binding homeodomain (Fig. 1B, S1).
Next, we investigated levels of coding sequence conservation between the T. castaneum zen (hereafter “Tc-zen”) paralogues. Strongest nucleotide conservation occurs within the homeobox, where three conservation peaks correspond to the three encoded α-helices (Fig. 1C: >80% identity). In fact, within the coding sequence for the third α-helix there is a 20-bp stretch with 100% nucleotide identity (Fig. 1C), which is roughly the effective length of sequence for achieving systemic knockdown by RNA interference (RNAi) (Svobodova, et al. 2016). Indeed, Tc-zen1-specific double-stranded RNA (dsRNA) that spans the homeobox is sufficient to effect cross-paralogue knockdown of Tc-zen2 (Fig. 1D; beta regression, z=4.718, p<0.001), although a short fragment alone is sufficient to strongly knock down Tc-zen1 itself (no significant change in knockdown efficiency between the long and short fragments: beta regression, z=0.558, p=0.577). For all subsequent paralogue-specific functional testing, we thus designed our dsRNA fragments to exclude the homeobox and thereby avoid off-target effects (Fig. 1C: Tc-zen1 short fragment: yellow; Tc-zen2: green).
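The search for a stretch of perfect identity long enough to risk cross-paralogue knockdown can be illustrated with a minimal sketch. It assumes two pre-aligned nucleotide strings of equal length with "-" as the gap character; the function name and the toy sequences are invented for illustration and are not the Tc-zen sequences or the alignment method used in this study.

```python
def longest_identical_run(aligned_a: str, aligned_b: str) -> tuple:
    """Return (start, length) of the longest gap-free run of identical columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, (a, b) in enumerate(zip(aligned_a.upper(), aligned_b.upper())):
        if a == b and a != "-":
            if run_len == 0:
                run_start = i          # a new identical run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0                # mismatch or gap ends the run
    return best_start, best_len


# Toy aligned fragments (hypothetical, for illustration only).
seq1 = "ATGCCGTTAGCAAGCTTGACCGTA"
seq2 = "ATGACGTTAGCAAGCTTGTCC-TA"
start, length = longest_identical_run(seq1, seq2)
print(f"longest identical run: {length} bp starting at column {start}")
```

A run length at or above roughly 20 bp would flag a dsRNA fragment spanning that region as a candidate for off-target knockdown, which is the rationale for excluding the homeobox from the paralogue-specific fragments.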
Distinct roles of the Tc-zen paralogues at different developmental stages
EEM development has been well characterized morphologically in the beetle (Handel, et al. 2000; Benton, et al. 2013; Panfilio, et al. 2013; Koelzer, et al. 2014), including the Tc-zen paralogues’ roles. Briefly, the first differentiation event distinguishes the serosa from the rest of the blastodermal cell sheet (Fig. 2A, at ~10% embryonic development). Tissue reorganization then involves serosal expansion and internalization of the embryo and amnion (EEM formation: subdivided into the “primitive pit” and “serosal window” stages). This topology is later reversed when the EEMs actively rupture and contract (“withdrawal”), coordinated with expansion of the embryo’s flanks for dorsal closure of the body (Fig. 2C, at ~75% development). After Tc-zen1 RNAi, presumptive serosal cells are respecified to other anterior fates, leading to an early enlargement of the head and amnion (Fig. 2B) (van der Zee, et al. 2005). Tc-zen2 RNAi impairs or wholly blocks late EEM withdrawal (van der Zee, et al. 2005; Hilbrant, et al. 2016), confining the embryonic flanks such that the epidermis encloses the embryo’s own legs instead of closing the back, leading to an everted (inside out) configuration (Fig. 2D) (Truckenbrodt 1979; Hilbrant, et al. 2016).
Here, we were able to reproduce the morphological phenotypes after RNAi for each Tc-zen paralogue (Fig. 2A′-D′). RNAi is particularly efficient for Tc-zen1 (98.8% knockdown, Fig. 2E). Specific phenotypes after Tc-zen2 RNAi (73.8% knockdown) include complete eversion (20.5%, Fig. 2D′) as well as milder defects in EEM withdrawal (53.3%, Figs. 2F, S2). Furthermore, we newly explored how the paralogues’ functions relate to their transcript expression profiles across embryogenesis. Consistent with their functions, Tc-zen1 has early expression while only Tc-zen2 persists until the membrane rupture stage (Fig. 2G). Unexpectedly, late-acting Tc-zen2 also has strong expression during early development.
The Tc-zen paralogues exhibit subtle differences in expression during early development
To gain insight into Tc-zen gene regulation and to determine the developmental stages of primary transcription factor function for each paralogue, we undertook a fine-scale spatiotemporal characterization of Tc-zen1 and Tc-zen2 expression for both transcript and protein (RT-qPCR, in situ hybridization, western blotting, immunohistochemistry).
As both paralogues are strongly expressed in early development (Fig. 2G), we first examined these stages in detail. Tc-zen1 transcript arises as an anterior gradient during blastoderm formation (4-6 hours after egg lay, hAEL), peaks at the differentiated blastoderm stage with uniform expression throughout the presumptive serosa (6-10 hAEL), and then becomes patchy and retracts to a narrow region at the tissue’s border during EEM formation (10-14 hAEL; Fig. 3A-F). After the EEMs have fully enclosed the early embryo, Tc-zen1 transcript is no longer detected (Figs. 2G, 3A). Peak Tc-zen1 transcript expression is followed shortly by detectable protein for Tc-Zen1, although this, too, only occurs during early development (Figs. 4A, S3A).
Tc-zen2 expression begins slightly later, at the differentiated blastoderm stage (6-8 hAEL), with peak levels occurring during EEM formation (10-14 hAEL; Fig. 3A). We also observed spatial differences between the paralogues. Tc-zen2 is first detected only in an anterior subset of the serosa when Tc-zen1 is expressed in the entire tissue (compare Fig. 3C,H). Then, Tc-zen2 transcript expands throughout the serosa while Tc-zen1 transcript retracts, concomitant with the expansion of the entire serosal tissue during EEM formation (compare Fig. 3D-F,I-K). Notably, the Tc-zen paralogues are expressed consecutively, but not concurrently, at the rim of the serosa. It is only during late EEM formation, at the serosal window stage, that we first observe Tc-zen2 expression throughout the entire serosal tissue (Fig. 3K). By this time, Tc-Zen2 protein is also strongly expressed and persists (Figs. 4A, S3B, and see below), while Tc-zen2 transcript wanes gradually (from 14 hAEL; Fig. 3A).
Transcriptional impact of Tc-zen1 and Tc-zen2 during early embryogenesis
Since protein expression follows shortly after peak transcript expression for both paralogues (Figs. 3A, 4A), we used the high sensitivity of our RT-qPCR survey (Fig. 3A) to inform our staging for functional testing by RNAi. To identify transcriptional targets for each zen gene, our RNA-seq after RNAi approach assessed differential expression (DE) between age-matched wild type and knockdown samples. We focused specifically on the time windows of peak gene expression: 6-10 hAEL for Tc-zen1 and 10-14 hAEL for Tc-zen2 (curly brackets in Fig. 3A). These four-hour windows were chosen to maximize the number of identified target genes while prioritizing direct targets for Zen transcription factor binding.
The RNA-seq data are consistent with a priori expectations based on the morphological consequences of RNAi for each zen gene (Fig. 2A-D). That is, Tc-zen1 has a clear early role in tissue specification, and its knockdown at these stages has a strong transcriptional impact, wherein principal component analysis (PCA) clearly distinguishes experimental treatments (Fig. 5A). In contrast, Tc-zen2 has an early expression peak but its manifest role in late EEM withdrawal occurs nearly two days later (56% development later). Accordingly, we find a negligible effect on the early egg’s total transcriptome after Tc-zen2 RNAi (Fig. 5A), despite verification of efficient knockdown (Fig. 2F). RNAi efficiency was also confirmed directly with the RNA-seq data: both Tc-zen1 and Tc-zen2 exhibit DE reduction after their respective knockdown. Overall, we obtained 338 DE genes after Tc-zen1 RNAi compared to only 26 DE genes after Tc-zen2 RNAi, while global transcriptional changes affect nearly 12% of the OGS during early embryogenesis (2221 DE genes: Fig. 5F-a,c,d, Tables S1A-C).
Given the recent nature of the duplication, evident in the similarity of the Tc-zen paralogues’ DNA-binding homeodomains and early expression profiles, we asked whether there is a legacy of shared early function. If this is the case, Tc-zen2 might exhibit a subtle regulatory profile similar to Tc-zen1. However, even with relaxed thresholds for differential expression, we find few shared targets between the paralogues, particularly when the direction of regulation is considered (Fig. 5B, Tables S2A-B). Thus, we conclude that Tc-zen2 has a minimal effect on early development, and that this does not constitute a transcriptional “echo” of co-regulation with Tc-zen1 due to common ancestry. Why, then, is Tc-zen2 strongly expressed during early development?
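The comparison of target sets with and without regard to the direction of regulation can be sketched as follows. This is an illustrative sketch only: the gene identifiers and log2 fold change values are hypothetical, and the function is not part of the analysis pipeline used here.

```python
def shared_targets(de_a: dict, de_b: dict):
    """Compare two DE gene sets.

    de_a, de_b map gene ID -> log2 fold change (RNAi vs. wild type)
    for genes already called as differentially expressed.
    Returns (all shared genes, shared genes regulated in the same direction).
    """
    shared = set(de_a) & set(de_b)
    same_direction = {g for g in shared if (de_a[g] > 0) == (de_b[g] > 0)}
    return shared, same_direction


# Hypothetical DE results for the two knockdowns.
zen1_rnai = {"geneA": -2.1, "geneB": 1.4, "geneC": -0.9, "geneD": -3.0}
zen2_rnai = {"geneC": -1.2, "geneD": 0.8, "geneE": 1.1}

shared, concordant = shared_targets(zen1_rnai, zen2_rnai)
print(sorted(shared), sorted(concordant))
```

In this toy case two genes are shared, but only one changes in the same direction after both knockdowns, showing how requiring concordant regulation shrinks an apparent overlap.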
The Tc-zen paralogues are mutual regulatory targets
We next considered the Tc-zen paralogues as factors necessary for defining the serosal tissue, indicated by specific transcriptional targets. Tc-zen1 is strictly required for serosal tissue identity (van der Zee, et al. 2005). Differentiation of the serosa involves an early switch from mitosis to the endocycle (Handel, et al. 2000; Benton, et al. 2013), resulting in polyploidy (Panfilio, et al. 2013). Consistent with this, we identified a homologue of the endocycle factor fizzy-related among DE genes upregulated by Tc-Zen1 (Table S1A) (Schaeffer, et al. 2004; Cohen, et al. 2018). From known targets of Tc-Zen1, we also recovered Dorsocross and hindsight, involved in EEM formation (Horn and Panfilio 2016), and chitin synthase 1, required for production of the protective cuticle (Jacobs, et al. 2013). Additionally, we hypothesized that the slight offset whereby Tc-zen1 expression precedes Tc-zen2 is consistent with Tc-zen1 activating Tc-zen2. We could confirm this regulatory interaction both by RNA-seq and RT-qPCR after Tc-zen1 RNAi (Fig. 6A-B). Thus, Tc-Zen1 as a serosal specifier upregulates factors for definitive tissue differentiation, including Tc-zen2 as a candidate (Fig. 6I).
Are there Tc-Zen2 transcriptional targets that could support an early role in the serosa? Among the few genes with strong differential expression (Fig. 5F-d), we validated several as likely targets. These genes are expressed in the early serosa and/or their transcript levels are first strongly upregulated during peak Tc-zen2 expression (12-14 hAEL; e.g., Fig. S4). Their putative functions as enzymes or structural components for chitin-based cuticle (Cpr’s) and as signaling molecules support a role for Tc-zen2 in the physiological maturation of the serosa, providing complementary regulatory control to Tc-Zen1.
In performing reciprocal validation assays, we then uncovered an unexpected early function of Tc-Zen2 in the repression of its own paralogous activator. After Tc-zen2 RNAi, we detect an upregulation of Tc-zen1 that was only weakly suggested by the RNA-seq data but then strongly supported in RT-qPCR assays (Fig. 6A-B). We confirmed this observation by in situ hybridization. After Tc-zen2 RNAi, Tc-zen1 transcript is expressed at higher levels than in wild type (compare Fig. 6C-D,F-G). Tc-zen1 also remains strongly expressed throughout the serosa at stages when wild type expression is restricted to low levels at the tissue rim (compare Fig. 6E,H). In fact, the abrupt reduction in Tc-zen1 transcript and protein levels in wild type correlates with increasing Tc-zen2 levels, and spatially their dynamic expression is largely complementary, if not outright mutually exclusive (Figs. 3,4). Together, these results suggest that Tc-Zen1 upregulates Tc-zen2 in its wake, and that in turn early Tc-Zen2 represses Tc-zen1. Thus, the Tribolium paralogues function as mutual regulatory targets, comprising an integrated regulatory module for early serosal development (Fig. 6I).
Tc-Zen2 is exclusively serosal, with persistent nuclear localization
To complete our analysis of Tc-zen2, we also examined its activity at later stages. We could detect both transcript (weakly, Figs. 2G, 3A) and protein (particularly strongly in mid-embryogenesis, Fig. 4A) continuously until the stage of EEM withdrawal, spanning 14-75% of development (10-54 hAEL, assayed in two-hour intervals; see also Fig. S3B). Moreover, we find that Tc-Zen2 is persistently localized to the nucleus, demonstrated by fluorescent immunohistochemistry on cryosectioned material of selected stages (Fig. 4B-E,G,H). This contrasts with some species’ orthologues, which show stage-specific exclusion from the nucleus (Dearden, et al. 2000). We could also refine the spatial scope of Tc-zen2 activity: in contrast to earlier reports (van der Zee, et al. 2005), we found no evidence for Tc-zen2 transcript or protein in the amnion (Fig. 4D-H″), indicating that this factor is strictly serosal.
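The conversion between hours after egg lay and percent development used throughout can be made explicit with a short sketch. It assumes a total embryogenesis of about 72 hours, a value inferred here from the text's equivalence of four hours with 5.6% development rather than stated directly; the constant and function names are illustrative.

```python
# Assumption: total embryonic development of ~72 h, inferred from the
# stated equivalence of 4 h with 5.6% development.
TOTAL_DEVELOPMENT_H = 72.0


def percent_development(hours_ael: float, total_h: float = TOTAL_DEVELOPMENT_H) -> float:
    """Convert hours after egg lay (hAEL) to percent of embryonic development."""
    return 100.0 * hours_ael / total_h


# Window over which Tc-zen2 transcript and protein were detected (10-54 hAEL).
window = (round(percent_development(10)), round(percent_development(54)))
print(window)
```

Under this assumption the 10-54 hAEL window corresponds to the reported 14-75% of development, and the 40 hours separating peak Tc-zen2 expression (about 12 hAEL) from membrane rupture (about 52 hAEL) correspond to the reported 56%.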
Late transcriptional dynamics are largely serosa-specific and Tc-zen2-dependent
The early RNA-seq after RNAi experiment examined the time of peak Tc-zen2 expression. Complementing this, we used the same approach to examine the stage of known Tc-zen2 function in late EEM withdrawal. Withdrawal begins with rupture of the EEMs, at 52.1 ± 2.3 hAEL as determined by live imaging (Koelzer, et al. 2014). Here, we assayed the four-hour intervals just before (48-52 hAEL) and after (52-56 hAEL) rupture, to assess Tc-zen2 transcriptional regulation that precedes and then accompanies withdrawal. Consistent with Tc-zen2’s known role, we detect >16× more DE genes after Tc-zen2 RNAi in late development (>430 DE genes, compare Fig. 5F-e,f with 5F-d). PCA also clearly separates knockdown and wild type samples at late stages (Fig. 5C).
Our staging helps to contextualize Tc-zen2 and EEM-specific processes relative to concurrent embryonic development. We thus evaluated differential expression in pairwise comparisons not only between wild type and RNAi samples, but also over time in both backgrounds (Fig. 5D,F-b,e-g). Comparisons across consecutive stages in early and late development (Fig. 5F-a,b) reveal two general changes in the wild type transcriptional landscape. There is far less dynamic change in gene expression in late development (5.8× fewer DE genes), consistent with steady state and ongoing processes in later embryogenesis compared to the rapid changes of early development. Also, whereas early development shows a fairly even balance between up- (48%) and downregulation (52%), late development is predominantly characterized by increasing expression levels over time (79%).
Against this backdrop, the transcriptional impact of Tc-zen2 is quite pronounced. Most genes with changing expression over time in the late wild type background are also affected by Tc-zen2 RNAi (Fig. 5D: 77%, 293/383 DE genes from green Venn diagram set). We detect this strong effect even though Tc-zen2 is restricted to the serosa (Fig. 4), a tissue that ceased mitosis (Fig. 6I) and comprises only a small cell population within our whole-egg samples. This suggests that most dynamic transcription in late development pertains to EEM morphogenesis, with the global transcriptional impact of Tc-zen2 at these stages even greater than for Tc-zen1 in early development (Fig. 5F-e,f, cf. 5F-c). Most candidate Tc-zen2 targets are differentially expressed at a single stage (72%), although a substantial fraction (26%) exhibits consistent activation or repression, while an intriguing handful of genes shows changing, stage-specific regulation (Fig. 5E). These patterns imply that the persistent nuclear localization of Tc-Zen2 (Fig. 4) reflects active transcriptional control, not merely localization to the nucleus or DNA binding in a paused, non-functional state (Banks, et al. 2016).
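The three classes of candidate targets described above (single stage, consistent, changing) can be expressed as a simple rule over the two late stages. The sketch below is illustrative: gene identifiers and fold changes are hypothetical, and `None` stands for "not significantly differentially expressed at that stage". Only the final line uses values reported in the text.

```python
from collections import Counter


def classify_target(lfc_pre, lfc_post):
    """Classify a gene by its DE status before and after rupture.

    lfc_pre, lfc_post: log2 fold change after RNAi if significant
    at that stage, otherwise None.
    """
    if lfc_pre is None and lfc_post is None:
        return "not DE"
    if lfc_pre is None or lfc_post is None:
        return "single stage"
    if (lfc_pre > 0) == (lfc_post > 0):
        return "consistent"
    return "changing"  # e.g. activated at one stage, repressed at the other


# Hypothetical candidate targets: gene -> (pre-rupture, post-rupture).
candidates = {
    "gene1": (-1.8, None),
    "gene2": (None, 1.2),
    "gene3": (-2.0, -1.1),
    "gene4": (-1.5, 0.9),
}
counts = Counter(classify_target(*lfc) for lfc in candidates.values())
print(counts)

# Reported overlap of time-dependent wild type genes with Tc-zen2 targets.
overlap_pct = round(100 * 293 / 383)
```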
To characterize late Tc-Zen2 activity, we functionally annotated and validated candidate transcriptional targets. Gene ontology enrichment tests confirmed that ongoing cuticle regulation is a primary role, including remodeling as the serosa detaches from its own cuticle in preparation for withdrawal (Fig. S5, Table S4A-B). For validation, we selected a dozen genes based on known biological processes for tissue remodeling (e.g., cytoskeleton and morphogenesis), prominent GO categories (e.g., transmembrane transporters), and evidence of dynamic regulation (Figs. 5E, S6A, Tables S5-S6). All tested candidates were confirmed by RT-qPCR (Fig. S6B). This included two of the genes that are first activated and then repressed by Tc-Zen2, where both genes encode proteins with conserved domains of unknown function (Table S6). Lastly, we evaluated Tc-Zen2 regulation of serosal immune genes (Jacobs, et al. 2014). Although our samples were not pathogen challenged, we could detect expression for 83% of these genes (89 of 107 genes), with 20% showing differential expression after Tc-zen2 RNAi (Table S3A-B). Thus, while Tc-Zen2 is not a global effector, it may regulate subsets of immune genes. Notably, transcripts of most serosal immune genes (87 genes) continue to be detected during withdrawal, supporting their expression as an inherent feature of the serosa – even when it is no longer a protective layer enclosing the embryo.
Evidence of variable developmental delay after Tc-zen2 RNAi
The Tc-zen2RNAi molecular phenotype also provides new insight into the physical phenotype of defective EEM withdrawal, suggesting that a variable, partial delay in preparatory transcriptional changes is the underlying cause.
Several observations are consistent with a delay. As noted above, all late RNA-seq biological replicates cluster by treatment in PCA. Interestingly, the older Tc-zen2RNAi samples (52-56 hAEL) have intermediate component scores compared to the clusters for the younger Tc-zen2RNAi and younger wild type samples (48-52 hAEL, Fig. 5C). Similarly, DE comparisons identify noticeably fewer DE genes between the older Tc-zen2RNAi sample and either of the younger samples (Fig. 5F-g,h, Tables S3D-E). In fact, the very low number of DE genes implies that there is virtually no difference in the transcriptional profile of the older Tc-zen2RNAi sample compared to the younger wild type sample (Fig. 5F-h). At the same time, nearly all genes that change in expression over time in the Tc-zen2RNAi background are also candidate targets of Tc-Zen2 at the pre-rupture stage (95%, Fig. 5D: inset Venn diagram). In other words, Tc-zen2RNAi eggs generally require an additional four hours (5.6% development) to attain a transcriptional profile comparable to the wild type pre-rupture stage, and this is achieved by belated activation of Tc-Zen2 target genes. However, only a subset of genes exhibit delayed recovery (34%, Fig. 5D inset). These target genes may thus be independently activated by other factors, in addition to activation by Tc-Zen2.
Our RNA-seq data also reveal increased variability after Tc-zen2 RNAi. The pre-rupture Tc-zen2RNAi biological replicates show comparably tight clustering to their age-matched wild type counterparts (48-52 hAEL, Fig. 5C). This suggests that pre-rupture is the stage of primary Tc-Zen2 function, also supported by our detection of the greatest number of DE genes at this stage (compare Fig. 5F-d,e,f). In contrast, the older Tc-zen2RNAi samples have a noticeably greater spread along the vectors of the first two principal components (52-56 hAEL, Fig. 5C), consistent with cumulative variability as the RNAi phenotype develops, presumably in part due to the observed partial transcriptional recovery (Fig. 5D). This variability may in itself provide explanatory power for the spectrum of end-stage Tc-zen2RNAi phenotypes (Fig. 2F, see below).
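One simple way to quantify the spread of biological replicates in a PCA plot is the mean distance of each replicate from the group centroid. The sketch below illustrates this measure with invented principal component scores; it is an assumed metric for illustration, not the statistic used in this study.

```python
from math import dist
from statistics import mean


def replicate_spread(points):
    """Mean Euclidean distance of replicates from their centroid in PC space."""
    centroid = tuple(mean(axis) for axis in zip(*points))
    return mean(dist(p, centroid) for p in points)


# Hypothetical (PC1, PC2) scores for four replicates per group.
tight_group = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)]
loose_group = [(0.0, 0.0), (0.0, 6.0), (6.0, 0.0), (6.0, 6.0)]

spread_tight = replicate_spread(tight_group)
spread_loose = replicate_spread(loose_group)
print(spread_tight, spread_loose)
```

A larger value for the older knockdown samples than for the younger ones would correspond to the greater spread along the first two principal components described above.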
DISCUSSION
Our analysis of regulation upstream and downstream of the beetle zen genes reveals several unexpected features regarding the evolution and biological roles of these unusual paralogues.
Sequence conservation belies the extent of zen paralogue functional divergence
Fine-tuned transcriptional regulation is required to restrict regulatory crosstalk, and conserved non-coding regions may contribute to this. The region upstream of zen1 has particularly high conservation and was recently tested as an in vivo Tc-zen1 reporter (Fig. 1A: dashed line; Strobl, et al. 2018). This construct recapitulates expression at the rim of the serosal window, a feature common to both paralogues (as in Figs. 3F,K, 6E). However, early blastoderm Tc-zen1 expression is absent (cf. Fig. 3B-C), while subsequent embryonic/amniotic expression represents a wholly ectopic domain. Thus, regulation of the Tc-zen genes requires multiple inputs that remain to be elucidated.
Specificity of regulation by the Tc-zen genes is also elusive. In the Tc-zen homeobox, sequence similarity is particularly high in the third α-helix, which confers DNA-binding specificity (Fig. 1C-D) (Passner, et al. 1999; McGregor 2005; Liu, et al. 2018). Yet, the paralogues’ shared ancestry is not reflected in redundant activity (Fig. 5A-B). Rather, strong conservation, particularly of zen2 (Fig. S1), may indicate not only limited divergence but also purifying selection (Lynch and Conery 2000). How, then, do the paralogues regulate different targets? In canonical Hox3 proteins, DNA-binding specificity can be enhanced by the common Hox co-factor Extradenticle (Passner, et al. 1999). In contrast, insect Zen proteins have lost the hexapeptide motif required for this interaction, and no other co-factor binding motifs are known (Panfilio and Akam 2007), deepening the long recognized “Hox specificity paradox” (Crocker, et al. 2015) in the case of the beetle zen paralogues. Nonetheless, our molecular dissection of the Tc-zen paralogues reveals the extent of their functional divergence.
Mutual regulation has implications for the paralogues’ network logic and confers temporal precision
The newly discovered negative feedback loop of Tc-Zen1 activation leading to repression by Tc-Zen2 constitutes a tight linkage. To what extent could Tc-zen1 overexpression bypass upregulation of Tc-zen2 as its target, resulting in repression of Tc-zen1 and thus cancelling out the manipulation? In fact Tc-zen2 RNAi does confer overexpression of Tc-zen1 and reduced Tc-zen2 (Fig. 6, Table S1B). However, consistently lower knockdown efficiency for Tc-zen2 than for Tc-zen1 (Fig. 2, (van der Zee, et al. 2005)) may reflect a dose-limiting lack of regulatory disentanglement. Arguably, Tc-zen1 and Tc-zen2 together satisfy the criteria of a minimal gene regulatory network (GRN) kernel (Davidson and Erwin 2006), including “recursive wiring” and the experimental challenges this entails. Alternatively, the Tc-zen paralogues could be viewed as a single unit in a serosal GRN and thus qualify as a “paradoxical component” that both activates and inhibits (Fig. 6I) (Hart and Alon 2013). Consistent with theoretical expectations, delayed inhibition produces a discrete pulse of Tc-zen1 (Figs. 3,4). As the pulse is non-oscillatory, this may also imply that Tc-zen2 is a positive autoregulator (Hart and Alon 2013).
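The expectation that delayed inhibition yields a single, non-oscillatory pulse can be illustrated with a toy dynamical model. The sketch below is a deliberately minimal assumption, not a fitted or published model of the Tc-zen circuit: Zen1 activates zen2, Zen2 represses zen1, and zen2 is given positive autoregulation as the hypothesis above suggests. All rate constants, thresholds, and Hill exponents are arbitrary illustrative values in dimensionless units.

```python
def simulate_zen_loop(t_end: float = 20.0, dt: float = 0.01):
    """Forward-Euler integration of a two-gene toy model.

    z1: activator (stands in for Tc-zen1), repressed by z2.
    z2: repressor (stands in for Tc-zen2), activated by z1 and by itself.
    All parameters below are arbitrary and purely illustrative.
    """
    z1, z2 = 0.0, 0.0
    trace = []
    for step in range(round(t_end / dt) + 1):
        trace.append((step * dt, z1, z2))
        repression = 1.0 / (1.0 + (z2 / 0.2) ** 4)        # z2 represses z1
        activation = z1 ** 2 / (0.5 ** 2 + z1 ** 2)       # z1 activates z2
        autoreg = 1.5 * z2 ** 2 / (0.5 ** 2 + z2 ** 2)    # z2 sustains itself
        dz1 = repression - z1
        dz2 = activation + autoreg - z2
        z1 += dt * dz1
        z2 += dt * dz2
    return trace


trace = simulate_zen_loop()
peak_time, peak_z1, _ = max(trace, key=lambda row: row[1])
_, final_z1, final_z2 = trace[-1]
print(f"z1 peaks at t={peak_time:.2f}; final z1={final_z1:.4f}, final z2={final_z2:.2f}")
```

In this toy system the activator rises, triggers its own repressor, and then falls to a low level while the repressor remains high, so the activator appears as one discrete pulse. Removing the autoregulation term would let the repressor decline once the activator is gone, which is the kind of behavior the positive autoregulation hypothesis is meant to exclude.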
Furthermore, Tc-Zen2 was previously implicated in the unusual role of translational repression of the early embryonic factor caudal (Schoppmeier, et al. 2009). Conceivably, Tc-Zen2 repression of Tc-zen1 could act in a composite fashion, at both the transcriptional and translational levels. Composite activity could expedite repression – consistent with Tc-zen1’s abrupt decline (Figs. 3A, 4A) – and enhance stability of the system (Alon 2007). Similar regulatory dynamics have also been found in other contexts. Feedback loops with activation leading to inhibition can promote robustness (Gavin-Smyth, et al. 2013) and rapidly, precisely restrict expression (Hoffmann, et al. 2002; Nunes da Fonseca, et al. 2008). Whereas spatial precision contributes to patterning of distinct tissues (Nunes da Fonseca, et al. 2008), the serosal Zen feedback loop generates temporal precision.
Paralogue divergence promotes the progression of serosal development
Negative feedback implies a strong developmental requirement to repress Tc-zen1, even before the serosa has fully enclosed the embryo. Since Tc-zen2 persists in this same domain, why is this necessary? In Drosophila, expression of Dm-zen is also short-lived (Schmidt-Ott, et al. 2010), and its overexpression causes an increase in amnioserosal cell and nuclear size (Rafiqi, et al. 2010). The insect EEMs are known to be polyploid to characteristic levels (Reim, et al. 2003; Panfilio and Roth 2010; Panfilio, et al. 2013), and excessive ploidy could interfere with the tissues’ structure and function as barrier epithelia (Orr-Weaver 2015). Our RNA-seq DE analyses support a role for Tc-zen1, but not Tc-zen2, in promoting serosal endoreplication (Fig. 6I). Thus Tc-Zen2’s repression of Tc-zen1 may ensure a limited time window for this transition. The temporal offset also limits the amount of gene product of the paralogues’ sole shared target, the cuticle maturation factor aaNAT, and overall effects temporally graded cuticle production (Figs. 5B,6I,S4). Thus, the distinct roles of the Tc-zen paralogues offer a novel opportunity for regulatory refinement in the early serosa, with a finely tuned genetic separation of specification and maturation functions that fosters developmental progression.
New functions of Tc-Zen2 led to iterative subfunctionalization in Tribolium
What are the implications of the Tc-zen paralogues for the evolution of insect zen? Although a specification function has only been demonstrated in the Holometabola (Wakimoto, et al. 1984; van der Zee, et al. 2005; Panfilio, et al. 2006; Rafiqi, et al. 2008; Rafiqi, et al. 2010), early expression is also known from some hemimetabolous species (Dearden, et al. 2000; Hughes, et al. 2004). This suggests that the ancestral zen may have fulfilled both early specification and late morphogenesis roles. Then, the prominent functions of the Tc-zen paralogues would represent an instance of subfunctionalization (Force, et al. 1999). Furthermore, Tc-zen1 and Dm-zen differ from all other known homologues in lacking persistent serosal expression. Although implications of Dm-zen temporal restriction have been extensively discussed (Schmidt-Ott, et al. 2010; Schmidt-Ott and Kwan 2016), its downregulation is likely passive (Podos, et al. 2001; Miles, et al. 2008) and occurs much later than for Tc-zen1. Meanwhile, the Tribolium innovation of having two functional, early extraembryonic copies of zen may have originated as redundant early expression, but Tc-zen2’s active repression of Tc-zen1 constitutes a new role (neofunctionalization). In turn, this repression subdivides serosal specification between the paralogues (Fig. 6I) – a second instance of subfunctionalization – while ensuring that the late function is still only carried out by a single copy of zen (Fig. 6J).
Diverse functions of a diverged Hox gene in a novel tissue
We have uncovered multiple roles of Tc-zen2 as a diverged Hox gene throughout the lifetime of the serosal tissue, itself a morphological innovation (Panfilio 2008). Early, Tc-Zen2’s repression of Tc-zen1 (Fig. 6) and Tc-caudal (Schoppmeier, et al. 2009) is noteworthy. A predominantly repressive role contrasts with Hox genes typically serving as activators, as do both Tc-zen paralogues at the stages of their primary function (Fig. 5F-c,e). Also, the precise mechanism and targets of potential Tc-Zen2 translational repression remain open questions. Future work will clarify whether such a function arose independently in Zen2 and Bicoid (Stauber, et al. 1999; McGregor 2005; Liu, et al. 2018) as distinct Hox3/zen derivatives. In later development, the serosa is the cellular interface with the outer environment. Our data elucidate Tc-Zen2’s roles in the known protective functions of cuticle formation (Jacobs, et al. 2013) and innate immunity (Jacobs, et al. 2014). Beyond this, our DE gene sets comprise a large, unbiased sample of candidate targets, laying the foundation for investigating wider roles of Tc-zen2 in this critical tissue.
Finally, we identify Tc-zen2-dependent EEM withdrawal as the major transcriptionally regulated event in late development and assess its precision (Fig. 5D). Temporal and molecular variability after Tc-zen2 RNAi underpins observed variability in EEM tissue structure, integrity, and morphogenetic competence, defining the broad spectrum of end-stage phenotypes (Figs. 2, S2). This ranges from mild defects in dorsal closure after transient EEM obstruction to persistently closed EEMs that cause complete eversion of the embryo (Figs. 2, S2). The unifying feature is a heterochronic shift of extraembryonic compared to embryonic developmental processes (delayed EEM withdrawal compared to epidermal outgrowth for dorsal closure).
There may also be species-specific differences in timing. The sole zen orthologue in the milkweed bug Oncopeltus fasciatus has a similarly persistent expression profile and specific role in withdrawal, termed “katatrepsis” in this and other hemimetabolous insects (Panfilio, et al. 2006). We previously observed a number of Of-zen-dependent, long-term morphological changes prior to rupture (Panfilio 2009), contrasting with the more proximate effect of Tc-zen2 (discussed above). Taking the work forward, it will be interesting to compare Tc-zen2 and Of-zen transcriptional targets. Evaluating conserved regulatory features of EEM withdrawal across the breadth of the insects will clarify macroevolutionary patterns of change in the very process of epithelial morphogenesis.
METHODS
Tribolium castaneum stock husbandry
All experiments were conducted with the San Bernardino wild type strain, maintained under standard culturing conditions at 30 °C and 40-60% relative humidity (Brown, et al. 2009).
In silico analyses
Draft genome assemblies for T. freemani, T. madens, and T. confusum were obtained as assembled scaffolds in FASTA-format (version 26 March 2013 for each species), accessed from the http://BeetleBase.org FTP site at Kansas State University (ftp://ftp.bioinformatics.ksu.edu/pub/BeetleBase/). Transcripts for Tc-zen1 (TC000921-RA) and Tc-zen2 (TC000922-RA) were obtained from the T. castaneum official gene set 3 (OGS3, http://bioinf.uni-greifswald.de/tcas/genes/tcas5_annotation/). These sequences were used as queries for BLASTn searches in the other species’ genomes (BLAST+ 2.2.30) (Altschul, et al. 1997; Camacho, et al. 2009). Sequences were extracted to comprise the Hox3/zen genomic loci, spanning the interval from 5 kb upstream of the BLASTn hit for the 5’ UTR of Tc-zen1 to 5 kb downstream of the BLASTn hit for the 3’ UTR of Tc-zen2. These genomic loci were then aligned with the mVista tool (Mayor, et al. 2000; Frazer, et al. 2004) using default parameters. Nucleotide identities were calculated for a sliding window of 100 bp.
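As an illustration of the locus extraction step, the following minimal Python sketch computes the interval spanning a set of BLASTn hits plus 5 kb flanks. It is not the procedure used for the analysis itself: the function names, the example coordinates, and the clamping to scaffold ends are assumptions made for the example.

```python
def locus_interval(hits, scaffold_length, flank=5000):
    """Return a 1-based inclusive (start, end) interval covering all hits plus flanks.

    hits: iterable of (start, end) subject coordinates on one scaffold.
    For minus-strand hits the start may exceed the end, so the outermost
    coordinates are used regardless of orientation.
    The interval is clamped to the scaffold boundaries (an assumption).
    """
    coords = [c for hit in hits for c in hit]
    start = max(1, min(coords) - flank)
    end = min(scaffold_length, max(coords) + flank)
    return start, end


def extract(sequence, start, end):
    """Slice a scaffold sequence using 1-based inclusive coordinates."""
    return sequence[start - 1:end]


# Hypothetical hit coordinates for a zen1 5' UTR hit and a zen2 3' UTR hit.
example_hits = [(12000, 12300), (20500, 20100)]
print(locus_interval(example_hits, scaffold_length=30000))
```

Because only the outermost coordinates matter, the same function applies whether the locus lies on the plus or minus strand of the scaffold.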
The maximum likelihood phylogenetic tree (Fig. 1B) was constructed based on an alignment of full-length Zen proteins, with gaps permitted, using the Phylogeny.fr default pipeline settings, with MUSCLE v3.8.31 alignment and PhyML v3.1 phylogenetic reconstruction (Dereeper, et al. 2008). The same topology and comparable support values were also obtained with additional sequences and other methods. This includes Drosophila Zen and/or Z2 and/or Drosophila and Megaselia Bicoid, and/or insect Zen proteins with known expression but uncharacterized function (Dearden, et al. 2000; Hughes, et al. 2004). This also holds for trees generated with Bayesian methods from the same interface (MrBayes program v3.2.6, 1000 generations, 100 burn-in trees). That is, the Tribolium Zen proteins form a clear clade with Oncopeltus Zen as an immediate outgroup, and then with the fly proteins as long branch outgroups.
Coding sequence for the Tc-zen paralogues was aligned with ClustalW (Larkin, et al. 2007), with manual curation to ensure a gap-free alignment of the homeobox. Nucleotide identities were calculated for a sliding window of 20 bp, using Simple Plot (Stothard 2000).
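The sliding-window identity calculation can be sketched in Python as follows. This is a generic illustration rather than the algorithm implemented in Simple Plot or mVista: the function name, the step size of 1, and the treatment of gapped columns as mismatches are assumptions.

```python
def sliding_identity(seq_a, seq_b, window=20, step=1):
    """Percent identity between two aligned sequences in sliding windows.

    Returns a list of (window_start, percent_identity) tuples, with
    window_start given as a 1-based alignment column.
    Columns containing a gap character are counted as mismatches (assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    a, b = seq_a.upper(), seq_b.upper()
    matches = [int(x == y and x != "-") for x, y in zip(a, b)]
    results = []
    for start in range(0, len(matches) - window + 1, step):
        identical = sum(matches[start:start + window])
        results.append((start + 1, 100.0 * identical / window))
    return results


# Toy alignment with a single mismatch at column 5, scanned with a 4-bp window.
print(sliding_identity("ACGTACGT", "ACGTTCGT", window=4))
```

Setting window=100 gives the window size used for the genomic locus comparison, and window=20 that used for the coding sequence alignment.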
RT-qPCR
RNA was extracted using TRIzol Reagent (Ambion) according to the manufacturer’s protocol. RNA quality was assessed by spectrophotometry (NanoDrop 2000, Thermo Fisher Scientific). cDNA was synthesized using the SuperScript VILO cDNA Synthesis Kit (Invitrogen). RT-qPCR was performed as described (Horn and Panfilio 2016), using SYBR Green Master Mix (Life Technologies) and GoTaq qPCR Master Mix (Promega), with Tc-RpS3 as the reference gene. Note that for Tc-zen2 more consistent results were obtained using SYBR Green Master Mix. “Relative abundance” was calculated for each sample as the ratio relative to a pooled template control with cDNA from all depicted samples (method as in Horn and Panfilio 2016). Samples were measured for the Tc-zen paralogues’ wild type expression profiles (four biological replicates: Figs. 2G,3A) and evaluation of knockdown strength (three biological replicates: Figs. 1D,6B,S6B). Intron-spanning primers were used for each Tc-zen paralogue and the selected candidate target genes (Table S7).
Parental RNAi and knockdown assessments
Parental RNAi was performed as described (van der Zee, et al. 2005), with dsRNA synthesized with specific primers (Table S7) and resuspended in double-distilled water (ddH2O). Generally, 0.3-0.4 μg of dsRNA was used to inject one pupa.
Analysis of knockdown efficiency with different Tc-zen1 dsRNA fragments involved statistical tests on RT-qPCR data. The strength of the Tc-zen paralogues’ knockdown using short and long Tc-zen1 dsRNA fragments (Fig. 1C-D) was tested with a beta regression analysis in R v3.3.2 (R Core Team 2016) using the package betareg v3.1-0 (Cribari-Neto and Zeileis 2010). Relative expression of the Tc-zen paralogues in knockdown samples relative to wild type was used as the response variable and dsRNA fragment length as the explanatory variable.
For Tc-zen1RNAi phenotypic scoring (Fig. 2E), serosal cuticle presence/absence was determined by piercing the fixed, dechorionated egg with a disposable needle (Braun Sterican 23G, 0.60 x 25 mm): mechanically resistant eggs were scored for presence of the serosal cuticle while soft eggs that collapsed lacked serosal cuticle.
For Tc-zen2RNAi phenotypic scoring, larval cuticle preparations (Figs. 2C′,D′,F, S2) were produced as previously described (van der Zee, et al. 2005).
Histology: in situ hybridization, cryosectioning, immunohistochemistry
Whole mount in situ hybridization was performed as described (Koelzer, et al. 2014), with probes synthesized from gene specific primers (Table S7) and colorimetric detection with NBT/BCIP. Specimens were imaged in Vectashield mountant with DAPI (Vector Laboratories) for nuclear counterstaining. Images were acquired on an Axio Plan 2 microscope (Zeiss). Image projections were generated with AxioVision (Zeiss) and HeliconFocus 6.7.1 (Helicon Soft).
For cryosectioning, embryos were embedded in liquid sucrose-agarose embedding medium (15% sucrose, 2% agarose [my-Budget Universal Agarose, Bio-Budget], PBS). Solid blocks of embedding medium containing embryos were stored overnight in 30% sucrose solution in PBS at 4 °C. The blocks were then embedded in Tissue Freezing Medium (Leica Biosystems) and flash-frozen in ice-cold isopentane (2-methylbutane). Samples were serially sectioned (20 μm, longitudinal; 30 μm, transverse) with a CM1850 cryostat (Leica Biosystems).
Protein was detected for both Tc-Zen1 and Tc-Zen2 with specific peptide antibodies (gift from the laboratory of Michael Schoppmeier) (Mackrodt 2016). Immunohistochemistry on whole mounts and on sectioned material was performed by washing the samples six times for 10 min. in blocking solution (2% BSA, 1% NGS, 0.1% Tween-20, PBS) followed by overnight incubation with the primary antibody (rabbit anti-Tc-Zen1 and anti-Tc-Zen2, 1:1,000) at 4 °C. Next, the samples were washed six times for 10 min. in the blocking solution, followed by incubation with the secondary antibody (anti-rabbit Alexa Fluor 488 conjugate, 1:400, Invitrogen) for 3 h at room temperature (RT). Last, the samples were washed six times for 10 min. in the blocking solution. Samples were then mounted in Vectashield mountant with DAPI. Low magnification images were acquired with an Axio Imager 2 equipped with an ApoTome 2 (Zeiss) structured illumination module, and maximum intensity projections were generated with ZEN blue software (Zeiss). High magnification images were acquired with an LSM 700 confocal microscope (Zeiss) and the projections were generated with ZEN 2 black software (Zeiss).
Western blots
For each two-hour developmental interval, 50 μg of protein extract was separated by SDS-PAGE. Separated proteins were transferred onto nitrocellulose membrane (Thermo Fisher Scientific), which was blocked for 1 h in the blocking solution (100 mM Tris, 150 mM NaCl, pH 7.5, 0.1% Tween-20, 3% milk powder [Bebivita, Anfangsmilch]). Next, the membrane was incubated overnight at 4 °C with the primary antibodies (rabbit anti-Tc-Zen1 and anti-Tc-Zen2, 1:1,000; mouse anti-Tubulin [Sigma-Aldrich #T7451: Monoclonal anti-acetylated tubulin], 1:10,000). Afterwards, the membrane was washed three times for 10 min. with the blocking solution at RT. The membrane was then incubated with the secondary antibodies (anti-rabbit and anti-mouse, HRP, 1:10,000, Novex) for 1 h at RT. Last, after the membrane was washed three times for 10 min. with the blocking solution at RT, it was incubated with ECL substrate according to the manufacturer’s protocol (WesternSure ECL Substrate, LI-COR) and digital detection was performed on a western blot developing machine (C-DIGIT, LI-COR) with the high sensitivity settings.
RNA-sequencing after RNAi
For transcriptomic profiling, a total of six Tc-zen1RNAi experiments were conducted: three performed with the short and three with the long dsRNA fragment (Fig. 1D). A total of seven Tc-zen2RNAi experiments were conducted: one for each biological replicate at each developmental stage. Samples chosen for sequencing were assessed by RT-qPCR for level of knockdown in RNAi samples, with Tc-zen1 reduced to ~10% of wild type levels and Tc-zen2 to ~24% across biological replicates. For early development (6-14 hAEL), three biological replicates were sequenced for each experimental treatment, with 100-bp paired end reads on an Illumina HiSeq2000 machine. For late development (48-56 hAEL), four biological replicates were sequenced with 75-bp paired end reads on a HiSeq4000 machine. All sequencing was performed at the Cologne Center for Genomics (CCG), with six (HiSeq2000) or eight (HiSeq4000) multiplexed samples per lane yielding ≥6.6 Gbp per sample.
The quality of raw Illumina reads was examined with FastQC (Andrews 2010). The adaptor sequences and low quality bases were removed with Trimmomatic v0.36 (Bolger, et al. 2014). Trimmomatic was also used to shorten 100-bp reads from the 3’ end to 75-bp reads to increase mapping efficiency (Table S8) (Li, et al. 2010). The overrepresented sequences of mitochondrial and ribosomal RNA were filtered out by mapping to a database of 1266 T. castaneum mitochondrial and ribosomal sequences extracted from the NCBI nucleotide database (accessed 21 October 2016, search query “tribolium [organism] AND (ribosomal OR mitochondrial OR mitochondrion) NOT (whole genome shotgun) NOT (Karroochloa purpurea)”) with Bowtie2 v2.2.9 (Langmead and Salzberg 2012). Trimmed and filtered reads were mapped to the T. castaneum OGS3 (see above, file name: Tcas5.2_GenBank.corrected_v5.renamed.mrna.fa) with RSEM (Li and Dewey 2011). The raw read count output from RSEM was compiled into count tables.
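The shortening of 100-bp reads to 75 bp from the 3’ end can be illustrated with a minimal Python sketch. This is not Trimmomatic and does not reproduce its behaviour beyond the cropping operation; the function name and the assumption of simple four-line FASTQ records are for illustration only.

```python
import io


def crop_fastq(src, dst, length=75):
    """Copy FASTQ records from src to dst, keeping the first `length` bases.

    Bases beyond `length` are removed from the 3' end, together with the
    corresponding quality scores. Assumes four lines per record.
    Returns the number of records written.
    """
    n_records = 0
    while True:
        header = src.readline()
        if not header:
            break
        header = header.rstrip("\n")
        seq = src.readline().rstrip("\n")
        plus = src.readline().rstrip("\n")
        qual = src.readline().rstrip("\n")
        dst.write(f"{header}\n{seq[:length]}\n{plus}\n{qual[:length]}\n")
        n_records += 1
    return n_records


# Toy input: one 100-base read (hypothetical read name and qualities).
raw = io.StringIO("@read1\n" + "A" * 100 + "\n+\n" + "I" * 100 + "\n")
cropped = io.StringIO()
crop_fastq(raw, cropped)
print(len(cropped.getvalue().splitlines()[1]))
```

Reads already shorter than the target length pass through unchanged.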
Both principal component and differential expression analyses were performed in R using the package DESeq2 v1.14.1 (Love, et al. 2014) with default parameters. For PCA, raw (unfiltered) read counts were used. For DE analyses, to eliminate noise, all genes with very low read counts were filtered out by sorting in Microsoft Excel, following recommendations (Busby, et al. 2013). Specifically, genes were excluded from DE analysis if read counts were ≤10 in ≥1 biological replicate for both the knockdown and wild type samples. Note that, throughout, our reporting of “DE genes” refers to analyses across all isoforms (18,536 isoform models) in the T. castaneum official gene set OGS3.
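The low-count filter can be expressed as a short Python sketch. The filtering itself was done by sorting in a spreadsheet, so the code below is only an equivalent illustration. It assumes the rule is read as: exclude a gene when at least one knockdown replicate and at least one wild type replicate each have a count of 10 or fewer. The function names, sample labels, and counts are hypothetical.

```python
def keep_gene(kd_counts, wt_counts, threshold=10):
    """Return True if the gene passes the low-count filter.

    A gene is excluded only when BOTH treatments contain at least one
    replicate with a count at or below the threshold (assumed reading).
    """
    low_in_kd = any(c <= threshold for c in kd_counts)
    low_in_wt = any(c <= threshold for c in wt_counts)
    return not (low_in_kd and low_in_wt)


def filter_counts(table, kd_cols, wt_cols, threshold=10):
    """Filter a {gene: {sample: count}} table, returning the retained genes."""
    return {
        gene: counts
        for gene, counts in table.items()
        if keep_gene([counts[s] for s in kd_cols],
                     [counts[s] for s in wt_cols],
                     threshold)
    }


# Hypothetical count table with two replicates per treatment.
counts = {
    "geneA": {"kd1": 5, "kd2": 50, "wt1": 8, "wt2": 40},      # low in both
    "geneB": {"kd1": 0, "kd2": 3, "wt1": 100, "wt2": 120},    # low in knockdown only
    "geneC": {"kd1": 200, "kd2": 180, "wt1": 90, "wt2": 95},  # well expressed
}
print(sorted(filter_counts(counts, ["kd1", "kd2"], ["wt1", "wt2"])))
```

Under this reading, a gene that is strongly reduced after RNAi but well expressed in wild type (geneB above) is retained for DE analysis.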
Gene ontology (GO) analyses
GO enrichment analysis was performed by Blast2GO (Conesa, et al. 2005) using two-tailed Fisher’s exact test with a threshold false discovery rate (FDR) of 0.05.
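For readers unfamiliar with the test, a minimal Python sketch of a two-tailed Fisher’s exact test on a 2×2 table, followed by a false discovery rate adjustment, is given below. It does not reproduce Blast2GO’s implementation: the function names are illustrative, and the Benjamini-Hochberg procedure is an assumption, since only the FDR threshold is specified above.

```python
from math import comb


def fisher_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Rows: DE genes vs. remaining genes; columns: with vs. without a GO term.
# The counts are hypothetical.
p = fisher_two_tailed(3, 1, 1, 3)
print(round(p, 4))
```

A GO term would then be called enriched when its adjusted p-value falls below the 0.05 threshold.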
GO term analysis was performed with Blast2GO against the Drosophila database (accessed 9 June 2017). Only level 5 GO terms were considered. Next, GO terms were grouped into categories of interest based on similarity in function (Table S5).
Afterwards, a unique count of T. castaneum gene sequences was calculated for each category of interest, and the percentage was compared to that of the remaining level 5 GO terms for each GO domain (Fig. S6).
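The per-category tally can be sketched in Python as follows. The data structures, names, and the choice of denominator (all unique genes annotated with any level 5 term in the domain) are assumptions for illustration and do not represent the actual spreadsheet workflow.

```python
def summarize_categories(term_to_genes, categories):
    """Count unique genes per category of interest and for the remaining terms.

    term_to_genes: {GO term: set of gene IDs} for one GO domain.
    categories: {category name: list of GO terms grouped into it}.
    Returns {name: (unique_gene_count, percent_of_all_annotated_genes)},
    with the remaining terms summarized under the key "other".
    """
    all_genes = set()
    for genes in term_to_genes.values():
        all_genes.update(genes)

    gene_sets = {}
    assigned_terms = set()
    for name, terms in categories.items():
        genes = set()
        for term in terms:
            genes.update(term_to_genes.get(term, ()))
        assigned_terms.update(terms)
        gene_sets[name] = genes

    rest = set()
    for term, genes in term_to_genes.items():
        if term not in assigned_terms:
            rest.update(genes)
    gene_sets["other"] = rest

    total = len(all_genes)
    return {name: (len(g), 100.0 * len(g) / total if total else 0.0)
            for name, g in gene_sets.items()}


# Hypothetical terms and gene IDs; g2 is shared by two terms but counted once.
terms = {"term1": {"g1", "g2"}, "term2": {"g2", "g3"}, "term3": {"g4"}}
print(summarize_categories(terms, {"cuticle": ["term1", "term2"]}))
```

Using sets ensures that a gene annotated with several terms in the same category contributes only once to that category’s count.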
FUNDING
This work was supported by funding from the German Research Foundation (Deutsche Forschungsgemeinschaft) through SFB 680 project A12 and Emmy Noether Program grant PA 2044/1-1 to KAP.
AUTHOR CONTRIBUTIONS
DG designed experiments, collected and analyzed data, established the bioinformatic pipeline for the RNA-seq data, and wrote the paper.
IMVJ analyzed data, established the bioinformatic pipeline for the RNA-seq data, and edited the manuscript.
KAP conceived the project, designed experiments, analyzed data, established the bioinformatic pipeline for the RNA-seq data, and wrote the paper.
ACKNOWLEDGMENTS
We thank Denise Mackrodt and Michael Schoppmeier for the kind gift of the Tc-Zen1 and Tc-Zen2 peptide antibodies, Viera Kovacova for bioinformatic program recommendations, Luigi Pontieri for assistance with statistical analyses, Thorsten Horn for sharing unpublished data on cuticle gene expression, Gustavo Lazzaro Rezende for discussions on cuticle regulation, and Hilary Ashe and Chris Rushlow for insights into zen regulation in Drosophila. We also thank Miltos Tsiantis and Siegfried Roth for helpful discussions and recommendations throughout the course of this research project. Siegfried Roth, Peter Heger, and Matthias Pechmann provided helpful feedback on the manuscript.