Abstract
Researchers have long debated the origin of insect wings. One theory proposes that the proximal portion of the ancestral crustacean leg became incorporated into the body1, which moved the leg's epipod (multi-functional lobe, e.g. gill) dorsally, up onto the back to form insect wings2. Another theory proposes that the dorsal insect body wall co-opted crustacean epipod genes to form wings3. Alternatively, wings may be derived from both leg and body wall (dual origin)4. To determine whether wings can be traced to ancestral, pre-insect structures, or arose by co-option, comparisons are necessary between insects and other arthropods more representative of the ancestral state, where the hypothesized proximal leg region is not fused to the body wall. To do so, we examined the function of five leg gap genes in the crustacean Parhyale hawaiensis and compared this to previous functional data from insects. Here we show, using CRISPR-Cas9 mutagenesis, that leg segment deletion phenotypes of all five leg gap genes in Parhyale align to those of insects only by including the hypothesized fused ancestral proximal leg region. We also argue that possession of eight leg segments is the ancestral state for crustaceans. Thus, Parhyale incorporated one leg segment into the body, which now bears the tergal plate, while insects incorporated two leg segments into the body, the most proximal one bearing the wing. We propose a model wherein much of the body wall of insects, including the entire wing, is derived from these two ancestral proximal leg segments, giving the appearance of a “dual origin” 4-9. This model explains many observations in favor of either the body wall origin, proximal leg origin, or dual origin of insect wings.
Arthropod appendages are key to their spectacular success, but their incredible diversity has complicated comparisons between distantly related species. The origin of the most debated appendage, insect wings, pivots on the alignment of leg segments, because wings may be derived from an epipod (e.g. gill or plate, Fig. 1b)5 of ancestral leg segments that fused to the body2, or alternatively, may represent a co-option of the epipod-patterning pathway by the insect body wall3, or a combination of both4,6-10. To answer this, functional comparisons are necessary between insects and arthropods more representative of the ancestral state, where the hypothesized proximal leg region is not fused to the body wall.
Towards this aim, we examined five leg gap genes, Distalless (Dll), Sp6-9, dachshund (dac), extradenticle (exd), and homothorax (hth), in an amphipod crustacean, Parhyale hawaiensis. While we have documented their expression at several developmental stages (Fig. S1), our comparative analysis does not rely solely on these expression patterns, given that expression is not always a reliable indication of function, and expression is often temporally dynamic11. Instead, we have systematically knocked out these genes in Parhyale using CRISPR-Cas9 mutagenesis and compared this to our understanding of their function in Drosophila and other insects (Figs. 2, S2).
Insects have six leg segments, while Parhyale has seven (Fig. 1). In insects, Dll is required for the development of leg segments 2 - 6 (refs. 12-15). In Parhyale, the canonical Dll gene, Dll-e (refs. 16-18), is required for the development of leg segments 3 - 7 (Fig. 2b). In insects, Sp6-9 (ref. 19) is required for the development of leg segments 1 - 6 (ref. 20), and in addition in Drosophila, loss of Sp6-9 (i.e. D-Sp1; ref. 19) occasionally transforms the leg towards wing and lateral body wall identity20. In Parhyale, Sp6-9 (refs. 19, 21) is required for the development of leg segments 2 - 7 (Fig. 2c), and in some legs, segment 2 is occasionally transformed towards a leg segment 1 identity (Fig. S3). In Drosophila, dac is required in the trochanter through proximal tarsus (leg segments 2 - 4, and first tarsus)21,22. Parhyale has two dac paralogs. dac1 does not seem to be expressed in the legs or have a morphologically visible knockout phenotype. dac2 is required to pattern leg segments 3 - 5 (Fig. 2d). Exd and hth are expressed in the body wall and proximal leg segments of insects23-26 and Parhyale27 (Fig. S1). They form heterodimers and therefore have similar phenotypes23-26. In insects, exd or hth knockout results in deletions/fusions of the coxa through proximal tibia (leg segments 1 - 3, and proximal tibia)24-28. In Parhyale, exd or hth knockout results in deletions/fusions of the coxa through proximal carpus (leg segments 1 - 4, and proximal carpus; Figs. 2e, f). In both insects23,24,28 and Parhyale, the remaining distal leg segments are sometimes transformed towards a generalized thoracic leg identity (compare Figs. 2e, f and Fig. S4). In both insects23-26 and Parhyale (Fig. S4), exd or hth knockout results in deletions/fusions of body segments.
In summary, the expression and function of Dll, Sp6-9, dac, exd, and hth in Parhyale are shifted distally by one segment relative to insects. This shift is accounted for if insects fused an ancestral proximal leg segment to the body wall (Fig. 2g). Thus, there is a one-to-one homology between insect and Parhyale legs, displaced by one segment, such that the insect coxa is homologous to the crustacean basis, the insect trochanter is homologous to the crustacean ischium, and so on for all leg segments. This also means that at least part of the insect body wall is homologous to the crustacean coxa.
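The one-segment displacement can be sketched as a simple index shift. The snippet below is an illustration only, not part of the analysis: it uses the whole-segment requirement ranges quoted in the text for Dll, Sp6-9, and dac (partial domains such as "first tarsus" are omitted, and exd/hth are left out because their proximal boundary lies in the body wall). The segment names merus, propodus, dactyl, and pretarsus are standard anatomical names that do not appear in the text, and the function name `parhyale_homolog` is hypothetical.

```python
# Leg segments listed proximal to distal, numbered from 1 as in the text.
INSECT_LEG = ["coxa", "trochanter", "femur", "tibia", "tarsus", "pretarsus"]
PARHYALE_LEG = ["coxa", "basis", "ischium", "merus", "carpus", "propodus", "dactyl"]

# Whole-segment requirement ranges (first, last) quoted in the text.
INSECT_REQUIRED = {"Dll": (2, 6), "Sp6-9": (1, 6), "dac": (2, 4)}
PARHYALE_REQUIRED = {"Dll": (3, 7), "Sp6-9": (2, 7), "dac": (3, 5)}

SHIFT = 1  # proposed displacement: insect segment n aligns with Parhyale segment n + 1


def parhyale_homolog(insect_segment: str) -> str:
    """Return the Parhyale segment aligned with an insect segment under the shift."""
    index = INSECT_LEG.index(insect_segment)
    return PARHYALE_LEG[index + SHIFT]


def range_shift(gene: str) -> tuple[int, int]:
    """Offset of the Parhyale range relative to the insect range (start, end)."""
    insect_start, insect_end = INSECT_REQUIRED[gene]
    parhyale_start, parhyale_end = PARHYALE_REQUIRED[gene]
    return parhyale_start - insect_start, parhyale_end - insect_end


# Every gene's range is displaced by exactly one segment at both boundaries.
shifts = {gene: range_shift(gene) for gene in INSECT_REQUIRED}
alignment = {segment: parhyale_homolog(segment) for segment in INSECT_LEG}

for segment, homolog in alignment.items():
    print(f"insect {segment} <-> Parhyale {homolog}")
```

Under this shift, the Parhyale coxa has no free insect leg counterpart, which is the segment the text proposes was incorporated into the insect body wall.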
While our CRISPR-Cas9 data is agnostic regarding the origin of the insect wing, we noted that Parhyale has what appears to be an epipod, the tergal plate, emerging dorsal to the coxa (Fig. 3). Since epipods are characteristic of leg segments29, this suggested to us that Parhyale, like most groups of crustaceans, might retain an additional proximal leg segment, the ancestral precoxa (Fig. 3a). We therefore examined carefully dissected Parhyale using confocal and brightfield microscopy and identified what appears to be a precoxa (Fig. 3). In fact, this structure has been previously described in amphipods30, but it was assumed to be body wall. However, this structure meets the criteria for a true leg segment: it protrudes from the body wall; it forms a true, muscled joint; and it extends musculature to another leg segment (Figs. 3 and S5)29,31,32. Importantly, the tergal plate emerges not from the body wall, but from this precoxa (Fig. 3e). Molecular evidence of a precoxa in Parhyale is provided by Clark-Hachtel in the accompanying manuscript. They show that the tergal plate is indeed an epipod just like the coxal plate, and that nubbin, a marker of arthropod leg segments, is expressed in a distinct stripe above the Parhyale tergal plate, suggesting there is indeed a precoxa leg segment here. Thus, much of what appears to be lateral body wall in Parhyale is in fact proximal leg.
Since insects evolved from crustaceans, if the insect coxa is homologous to the crustacean basis, then one would expect to find two leg segments incorporated into the insect body wall, each equipped with an epipod (Fig. 4). As predicted, two leg-like segments can be observed proximal to the coxa in basal hexapods1 including collembolans33, as well as in the embryos of many insects9,34,35. In insect embryos, these two leg-like segments flatten out before hatching to form the lateral body wall1,9,33-37 (Fig. 1c). Furthermore, insects indeed appear to have two epipods proximal to the insect coxa. When “wing” genes are depleted in insects via RNAi, two distinct regions are affected: the wing, but also the protruding plate adjacent to the leg (Fig. 1c)38-41. Based on these data, we hypothesize that insects have incorporated the ancestral precoxa and crustacean coxa into the body wall, with the precoxa epipod later forming the wing and the crustacean coxa epipod later forming the plate.
The results presented here may settle the long-standing debate regarding the origin of insect wings as derived from (a) the epipod of the leg, (b) the body wall, or, more recently, (c) both (dual origin4,6-10). Our model accounts for all observations in favor of any of these hypotheses, including the dorsal position of insect wings relative to their legs, the loss of ancestral leg segments in insects, the two-segmented morphology of the insect subcoxa in both embryos and adults, the complex musculature for flight, and the shared gene expression between wings and epipods. The realization that crustaceans have a precoxa accounts for the apparent “dual origin” of insect wings: much of what appears to be insect body wall is in fact the remnant of two ancestral leg segments.
METHODS
BIOINFORMATICS
Partial or complete sequences for Parhyale Dll, Sp6-9, Exd, and Hth have been previously identified. These were >99% identical at the nucleotide level to sequences in the Parhyale assembled transcriptome. In order to confirm their orthology, identify potential Parhyale paralogs, and identify Parhyale dac, we ran reciprocal best BLAST hit searches. For each gene, orthologs from several arthropods and vertebrates were downloaded from NCBI and EMBL and aligned against the Parhyale transcriptome42 using standalone NCBI blastp. The Parhyale hits with the lowest E-values were used to run a blastp against the NCBI database, restricted to Arthropoda. We confirmed that the original set of orthologs from several arthropods were the best hits to our Parhyale candidates (i.e. were each other's reciprocal best BLAST hits). These reciprocal best BLAST hits are listed in the tables below, and were deposited in GenBank under Accession Numbers MG457799 - MG457804.
No Parhyale buttonhead/Sp5 was recovered in the assembled transcriptome. Buttonhead/Sp5 was also not found in the genome of the related amphipod Hyalella azteca. The assembled transcriptome only recovered fragments of Parhyale Sp1-4, so the previously sequenced Parhyale Sp1-4 (CBH30980.1) was used for the table below (asterisk).
Parhyale has three Dll paralogs, which appear to be an amphipod-specific duplication, because a related amphipod, Hyalella azteca, also has these same three Dll paralogs. The three Parhyale Dll paralogs had the lowest E-values to all Dll orthologs examined, but which of the three Parhyale Dll paralogs had the lowest E-value was variable, as expected for a clade-specific duplication.
The coding regions for Parhyale exd and hth in the assembled transcriptome are longer than those previously identified. Exd is 204 amino acids longer, and hth is 166 amino acids longer. This explains the higher-than-expected E-values between the Parhyale exd and hth sequences identified previously and the Parhyale exd and hth sequences used in this study.
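The reciprocal best BLAST hit criterion used in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: hits are supplied as in-memory (query, subject, E-value) records rather than parsed from blastp output, all sequence identifiers are hypothetical, and a pair is accepted only if each sequence is the other's single lowest E-value hit (the search described above was run against a multi-species database, where the best reverse hit could also be an ortholog from another species).

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float


def best_hit_per_query(hits: Iterable[Hit]) -> dict[str, str]:
    """Map each query to the subject of its lowest E-value hit."""
    best: dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.evalue < current.evalue:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit], reverse: Iterable[Hit]) -> list[tuple[str, str]]:
    """Return (query, subject) pairs that are each other's best hit."""
    forward_best = best_hit_per_query(forward)
    reverse_best = best_hit_per_query(reverse)
    return sorted(
        (query, subject)
        for query, subject in forward_best.items()
        if reverse_best.get(subject) == query
    )


# Hypothetical toy data: known orthologs searched against the transcriptome ...
forward = [
    Hit("Dmel_dac", "Ph_contig_A", 1e-120),
    Hit("Dmel_dac", "Ph_contig_B", 1e-30),
    Hit("Dmel_exd", "Ph_contig_C", 1e-90),
]
# ... and the transcriptome candidates searched back against the database.
reverse = [
    Hit("Ph_contig_A", "Dmel_dac", 1e-118),
    Hit("Ph_contig_A", "Dmel_exd", 1e-5),
    Hit("Ph_contig_C", "Dmel_hth", 1e-40),
    Hit("Ph_contig_C", "Dmel_exd", 1e-20),
]

pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

In the toy data, only the dac pair is reciprocal; the exd candidate fails because its best reverse hit is a different gene.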
IN SITU PRIMER SEQUENCES
CLONING AND RNA PROBE SYNTHESIS
Total RNA was extracted from a large pool of Parhyale embryos at multiple stages of embryogenesis, from Stages 12 to 26, using Trizol. cDNA was generated using Superscript III. Primers were generated with Primer3 (http://bioinfo.ut.ee/primer3-0.4.0), with a preferred product size of 700 bp, and did not include the DNA binding domain. Inserts were amplified with Platinum Taq (ThermoFisher 10966026), ligated into pGem T-Easy vectors (Promega A1360), and transformed into E. coli. The resulting plasmids were cleaned with a QiaPrep mini-prep kit (Qiagen A1360), and sequenced to verify the correct insert and determine sense and anti-sense promoters. In situ templates were generated by PCR from these plasmids using M13F/R primers and purified with Qiagen PCR Purification kit (Qiagen 28104). The resulting PCR products were used to make DIG-labeled RNA probes (Roche 11175025910) using either T7 or SP6 RNA polymerase. RNA probes were precipitated with LiCl, resuspended in water, and run on an agarose gel to check that probes were the correct size, and concentration was determined using a Nanodrop 10000. Probes were used at a concentration of 1-5 ng/uL.
IN SITU PROTOCOL
Embryo collection, fixation, and dissection were performed as previously described43. In situ hybridization was performed as previously described44. In brief, embryos were fixed in 4% paraformaldehyde (PFA) in artificial seawater for 45 minutes, dehydrated to methanol, and stored overnight at -20C to discourage embryos from floating in the later hybridization solution (Hyb) step. Embryos were rehydrated to 1xPBS with 0.1% Tween 20 (PTw), post-fixed for 30 minutes in 9:1 PTw:PFA, and washed in PTw. Embryos were incubated in Hyb at 55C for at least 36 hours. Embryos were blocked with 5% normal goat serum and 1x Roche blocking reagent (Roche 11096176001) in PTw for 30 minutes. Sheep anti-DIG-AP antibody (Roche 11093274910) was added at 1:2000 and incubated for 2 hours at room temperature. Embryos were developed in BM Purple (Roche 11442074001) for a few hours to overnight. After embryos were sufficiently developed, they were dehydrated to methanol to remove any pink background, then rehydrated to PTw. Embryos were then moved to 1:1 PBS:glycerol with 0.1 mg/mL DAPI, then 70% glycerol in PBS.
CRISPR-CAS9 GUIDE RNA GENERATION, INJECTION, AND IMAGING
Guide RNAs were generated using ZiFit45,46 as previously described47. sgRNAs were ordered from Synthego. Injection mixes had a final concentration of 333 ng/uL Cas9 protein, 150 ng/uL sgRNA (for both single and double guide injection mixes), and 0.05% phenol red for visualization during injection, all suspended in water. One- or two-cell embryos were injected with approximately 40 - 60 picoliters of sgRNA mixture as previously described47. Resulting knockout hatchlings were fixed in 4% paraformaldehyde in artificial seawater at 4C for 1 - 2 days, then moved to 70% glycerol in 1xPBS. Dissected hatchling limbs were visualized with Zeiss 700 and 780 confocal microscopes using the autofluorescence in the DAPI channel. Z-stacks were assembled with Volocity. Hatchling images were desaturated, levels adjusted, and false-colored using Overlay with Adobe Photoshop CS6.
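As a worked check of the injection quantities, the mass of each component delivered per embryo follows from concentration times injected volume. The short sketch below only restates that arithmetic with the concentrations and the approximate volume range given above; the function name is hypothetical, and the result is an estimate because the injected volume is itself approximate.

```python
def delivered_mass_pg(concentration_ng_per_ul: float, volume_pl: float) -> float:
    """Mass delivered in picograms.

    1 pL = 1e-6 uL and 1 ng = 1e3 pg, so ng/uL * pL * 1e-3 gives pg.
    """
    return concentration_ng_per_ul * volume_pl * 1e-3


CAS9_NG_PER_UL = 333.0          # Cas9 protein concentration from the text
SGRNA_NG_PER_UL = 150.0         # sgRNA concentration from the text
VOLUME_RANGE_PL = (40.0, 60.0)  # approximate injected volume from the text

cas9_range = tuple(delivered_mass_pg(CAS9_NG_PER_UL, v) for v in VOLUME_RANGE_PL)
sgrna_range = tuple(delivered_mass_pg(SGRNA_NG_PER_UL, v) for v in VOLUME_RANGE_PL)

print(f"Cas9 per embryo:  {cas9_range[0]:.2f} to {cas9_range[1]:.2f} pg")
print(f"sgRNA per embryo: {sgrna_range[0]:.2f} to {sgrna_range[1]:.2f} pg")
```

With these inputs, each embryo receives roughly 13 to 20 pg of Cas9 protein and 6 to 9 pg of sgRNA.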
T7 ENDONUCLEASE I ASSAY
Genomic primers were designed using Primer3, and flanked the target site by at least 400 bp to either side. DNA isolation and subsequent PCR amplification of the region of interest were modified from previously described protocols48. Genomic DNA was amplified directly from fixed hatchlings in 70% glycerol using ExTaq (Takara RR001A). The resulting PCR products were purified with the Qiaquick PCR purification kit (Qiagen 28104). Heteroduplexes were annealed and digested by T7 endonuclease I according to NEB protocols (NEB M0302L). The digested products were run out on a 1.5% agarose gel. Genomic primers used for the T7 endonuclease I assay are listed below.
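The consequence of the 400 bp flanking rule for the expected band pattern can be sketched with simple arithmetic. This is an illustration under assumptions, not a description of the actual amplicons: the target site is treated as a single point between the two flanks, the example flank lengths are hypothetical, and the function name is invented for this sketch.

```python
MIN_FLANK_BP = 400  # minimum distance from each primer to the target site, from the text


def t7e1_expected_bands(left_flank_bp: int, right_flank_bp: int) -> tuple[int, list[int]]:
    """Return (uncut amplicon length, cleavage fragment lengths, largest first).

    Assumes T7 endonuclease I cuts the heteroduplex at the target site,
    modeled here as a point between the two flanks.
    """
    if min(left_flank_bp, right_flank_bp) < MIN_FLANK_BP:
        raise ValueError(f"each flank must be at least {MIN_FLANK_BP} bp")
    amplicon_bp = left_flank_bp + right_flank_bp
    fragments_bp = sorted((left_flank_bp, right_flank_bp), reverse=True)
    return amplicon_bp, fragments_bp


# Hypothetical flank lengths chosen only to illustrate the calculation.
amplicon, fragments = t7e1_expected_bands(450, 520)
print(f"uncut: {amplicon} bp; cleaved: {fragments[0]} bp + {fragments[1]} bp")
```

Because each flank is at least 400 bp, an edited sample should show two cleavage products of at least 400 bp each in addition to the uncut band.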
GENOMIC DNA PRIMERS
AUTHOR CONTRIBUTIONS
H.S.B. and N.H.P. conceived of the experiments. H.S.B. performed all experiments, conceived of the model, and wrote the manuscript. N.H.P. edited and revised the manuscript.