Abstract
Background Understanding the genomic basis of phenotypic diversity can be greatly facilitated by examining adaptive radiations with hypervariable traits. In this study, we focus on a rapidly diverged species group of mormyrid electric fish in the genus Paramormyrops, which are characterized by extensive phenotypic variation in electric organ discharges (EODs). The main components of EOD diversity are waveform duration, complexity and polarity. Using an RNA-sequencing based approach, we sought to identify gene expression correlates for each of these EOD waveform features by comparing 11 specimens of Paramormyrops that exhibit variation in these features.
Results Patterns of gene expression among Paramormyrops are highly correlated, and 3,274 genes (16%) were differentially expressed. Using our most restrictive criteria, we detected 71-144 differentially expressed genes correlated with each EOD feature, with little overlap between them. The predicted functions of several of these genes are related to extracellular matrix, cation homeostasis, lipid metabolism, and cytoskeletal and sarcomeric proteins. These genes are of significant interest given the known morphological differences between electric organs that underlie differences in the EOD waveform features studied.
Conclusions In this study, we identified plausible candidate genes that may contribute to phenotypic differences in EOD waveforms among a rapidly diverged group of mormyrid electric fish. These genes may be important targets of selection in the evolution of species-specific differences in mate-recognition signals.
Introduction
Understanding the genomic basis of phenotypic diversity is a major goal of evolutionary biology [1]. Adaptive radiations and explosive diversification of species [2] are frequently characterized by interspecific phenotypic differences in divergence of few, hypervariable phenotypic traits [3–6]. Such systems offer exceptional advantages to study the genomic bases of phenotypic diversity: they can provide replication under a controlled phylogenetic framework [7], and couple ample phenotypic differentiation with relatively “clean” genomic signals between recently diverged species [8]. Study of the genomic mechanisms underlying hypervariable phenotypic traits has identified, in some cases relatively simple genetic architectures [9–13]. More often, the genetic architecture underlying such traits can be complex and polygenic [14–17]. It has long been recognized that changes in gene expression can affect phenotypic differences between species [18], and RNA-seq based approaches have greatly facilitated the study of this relationship [19]. A growing number of studies have examined differences in gene expression in phenotypic evolution (e.g., [19–27]). While these studies do not implicate mutational causes, analysis of differential gene expression (DGE) can be a useful approach in examining the genomic basis of divergent phenotypes.
African weakly electric fish (Teleostei: Mormyridae) are among the most rapidly speciating groups of ray-finned fishes [28,29]. This is partly due to the diversification of the genus Paramormyrops [30,31] in the watersheds of West-Central Africa, where more than 20 estimated species [32] have evolved within the last 0.5-2 million years [30]. Extensive evidence has demonstrated that electric organ discharges (EODs) exhibit little intraspecific variation, yet differ substantially among mormyrid species [33–35]. This pattern is particularly evident in Paramormyrops [30,36], in which EOD waveforms evolve much faster than morphology, size, and trophic ecology [37].
Mormyrid EODs are a behavior with a dual role in electrolocation [38,39] and intraspecific communication [40,41]. EOD waveforms vary between species principally in terms of their complexity, polarity, and duration [30,42], and all three dimensions of variation are evident among Paramormyrops (Fig. 1). Furthermore, recent discoveries of intraspecific polymorphism in EOD waveform in P. kingsleyae [43] and polarity among P. sp. ‘magnostipes’ [35] present a unique opportunity to study the genomic basis of phenotypic traits within a rapidly diverging species group.
EODs have a well-understood morphological (Fig. 1) and neurophysiological basis [44,45]. EODs are generated by specialized cells (electrocytes) that constitute the electric organ (EO), located in the caudal peduncle [46]. Mormyrid EOs are comprised of 80-360 electrocytes [34], and an individual EOD is produced when the electrocytes discharge synchronously. EODs are multiphasic because they result from action potentials produced by two excitable membranes: the two large phases of the EOD, called P1 and P2, are produced by spikes generated by the posterior and anterior electrocyte faces, respectively [47]. There is a relationship between EODs of longer duration and increased surface membrane area [48], likely mediated at least in part by an increase in membrane capacitance [49,50]. The duration of EODs is highly variable within mormyrids-- some EODs are extremely long (>15 ms) and others are very brief (0.2 ms) [32].
Within the Mormyridae, triphasic EODs evolved early from biphasic EODs; however, there have been multiple parallel reversions to biphasic EODs across mormyrids and within Paramormyrops [36,43]. Triphasic (P0-present) EODs are produced by electrocytes that are innervated on the anterior face and have penetrating stalks (Pa, P-type), whereas biphasic (P0-absent) EODs are produced by electrocytes innervated on the posterior face and lack penetrating stalks (NPp, N-type) (for more details see [42,43,47,48,51,52]). We refer to triphasic EODs as more ‘complex’ than biphasic EODs. In some cases, triphasic EODs display an unusually large P0 phase, which gives the appearance of an ‘inverted’ polarity. This is exemplified by the type I EODs of P. sp. ‘magnostipes’ (Fig. 1) [35]. The number [47] and diameter [34,43] of stalk penetrations are positively correlated with the magnitude of P0. We refer to individuals with large penetrations as ‘inverted’ polarity and individuals with small penetrations as ‘normal’ polarity.
Recent studies in mormyrids [53–57] have adopted a candidate gene approach to examine the molecular basis of variation in EOD duration on macroevolutionary scales, implicating voltage gated sodium channels (e.g. scn4aa) and potassium channels (e.g. kcna7a) as key targets of selection during EOD evolution. Beyond this recent attention to ion channels, several studies have described the importance of structural differences between EOs as an important component of EOD variation [43,48,50]. In this study, we took a transcriptome-wide approach to characterizing the molecular basis of electric signal diversity in Paramormyrops species divergent for EOD complexity, duration and polarity. We used RNA-sequencing to comprehensively examine DGE in the adult EOs of five Paramormyrops operational taxonomic units (OTUs), leveraging a recently sequenced and annotated genome assembly from the species P. kingsleyae (N-type) [58], and identify gene expression correlates of each of the three main EOD waveform features of electric signal diversity in Paramormyrops. Our results emphasize genes that influence the shape and structure of the electrocyte cytoskeleton, membrane and extracellular matrix (ECM) to exhibit predictable differences between Paramormyrops species with divergent EOD phenotypes.
Methods
Sample collection
We captured 11 Paramormyrops individuals from Gabon, West Central Africa in 2009: five P. kingsleyae (n=3 N-type and n=2 P-type), four P. sp. ‘magnostipes’ (n=2 Type I and n=2 Type II), and two P. sp. ‘SN3’. Within 1-12 hours of capture, individual specimens were euthanized by overdose with MS-222. The caudal peduncle was excised and skinned, and immediately immersed in RNA-later for 24h at 4°C, before being transferred to −20°C for long-term storage. As two of these species (P. sp. ‘magnostipes’, P. sp. ‘SN3’) are presently undescribed, we note that these specimens were identified by their EOD waveform, head morphology and collecting locality [30,31,35,59]. All specimens, including vouchers materials, are deposited in the Cornell University Museum of Vertebrates. Collection information and the phenotypes per EOD feature of each sample are detailed in Table 1.
RNA extraction, cDNA library preparation and Illumina Sequencing
Total RNA was extracted from EOs using RNA-easy Kit (Qiagen, Inc) after homogenization with a bead-beater (Biospec, Inc.) in homogenization buffer. mRNA was isolated from total RNA using a NEBNext mRNA Isolation Kit (New England Biolabs, Inc.). Libraries for RNA-seq were prepared using the NEBNext mRNA Sample Prep Master Mix Set, following manufacturer’s instructions. Final libraries after size selection ranged from 250-367 bp. Libraries were pooled and sequenced by the Cornell University Biotechnology Resource Center Genomics Core on an Illumina HiSeq 2000 in a 2×100bp paired end format. Raw sequence reads were deposited in the NCBI SRA (Supplemental File 1).
Read processing and data exploration
FastQC v0.11.3 (Babraham Bioinformatics) was used to manually inspect raw and processed reads. We used Trimmomatic v.0.32 [60] to remove library adaptors, low quality reads, and filter small reads; following the suggested settings of MacManes [61]: 2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25. After trimming, reads from each specimen were aligned to the predicted transcripts of the NCBI-annotated (Release 100) P. kingsleyae (N-type) genome [58] using bowtie 2 v2.3.4.1 [62]. Expression quantification was estimated at the gene level using RSEM v1.3.0 [63], followed by exploration of the data with a gene expression correlation matrix based on Euclidean distances and Pearson’s correlation coefficient (for genes with read counts >10, Trinity’s default parameters). All these steps were executed using scripts included with Trinity v2.6.6 [64,65].
Data Analysis
We began by examining DGE between all possible pairwise comparisons of OTUs (n = 10, Table 2) using edgeR v3.20.9 [66] through a script provided with Trinity. We restricted our consideration of genes to those where CPM-transformed counts were > 1 in at least two samples for each comparison (edgeR default parameters). We modified this to use the function estimateDisp() instead of the functions estimateCommonDisp() and estimateTagwiseDisp(). For each comparison, we conservatively considered genes to be differentially expressed with a minimum fold change of 4 and p-value of 0.001 after FDR correction. We compiled a non-redundant list of genes that were differentially expressed in at least one comparison based on these criteria (Fig. 2, Set A).
For each of the differentially expressed genes (DEGs) in Set A, we TMM normalized, log2(TMM +1) transformed, and mean-centered their expression values. We used the transformed values to compare gene expression between groups of OTUs with alternative EOD waveform phenotypes (i.e. long duration EOD vs. short duration EOD, biphasic vs. triphasic and small penetrations vs. large penetrations, see Table 1. Note that waveform polarity phenotypes only apply to triphasic individuals). For each of the three phenotype pairs, we extracted the genes that were on average more than four times more highly expressed in one phenotype than the other. This resulted in six lists of upregulated genes, one for each EOD feature across all OTUs and samples (Fig. 2, Set B).
In order to assess enrichment of particular gene pathways, biological functions, and cellular locations using a controlled vocabulary, we performed Gene Ontology (GO) [67,68] enrichment tests on every list of upregulated genes from (1) the ten pairwise comparisons (n=20, two per comparison) and (2) Set B (n=6), for each of the three ontology domains: Biological Process, Cellular Component, and Molecular Function. First, we identified homologous proteins predicted from the P. kingsleyae (N-type) reference genome and those predicted from Danio rerio (GRCz11) by blastp (BLAST+ v2.6, [69]). For each protein, the top hit (e-value ≤ 1e-10) was used for annotation. Next, we used mygene v1.14.0 [70,71] to match the D. rerio proteins to D. rerio genes and extract their GO annotations (zebrafish Zv9). This resulted in GO annotations for each of the three ontology domains for P. kingsleyae (N-type) genes. Finally, we carried out the GO enrichment tests using topGO v2.30.1 [72] and the following parameters: nodeSize = 10, statistic = fisher, algorithm = weight01, p-value ≤ 0.02. The ‘universe’ for each enrichment test on gene lists from the pairwise comparisons was all the genes deemed expressed in the respective comparison, whereas the non-redundant list of genes in these ten ‘universes’ was the ‘universe’ for all enrichment tests on the gene lists from Set B.
Interpretation of lists of genes from Set A and Set B each suffered limitations for the overall goals of this analysis, which is to identify the DEGs most strongly associated with each waveform feature (duration, complexity, and polarity). The ten comparisons made to construct Set A were not equally informative for two primary reasons: (1) the OTUs in this analysis vary in terms of their phylogenetic relatedness (see [30,31]) and (2) several OTU comparisons varied in more than one waveform characteristic (Table 2). As such, we elected to focus on the most informative comparison for each EOD feature: the comparison that contrasted only the given feature and that minimized phylogenetic distance between OTUs. Of the ten pairwise comparisons, we classified three as the most informative comparisons, one per EOD feature (Table 2). The six lists of upregulated genes from these three comparisons constitute Set A’.
Comparisons in Set A’; however, lack biological replication. In contrast, interpretations of Set B were potentially limited in that many of the OTUs in this analysis differed in more than one EOD feature. To circumvent the limitations of Sets A’ and B within the limits of our study design, we constructed a third set (Set C). Set C is defined as the intersection of the upregulated genes and their enriched GO terms from Sets A’ and B, for each phenotype. Since there were six phenotypes in our study, Set C encompasses six lists of upregulated genes and their respective enriched GO terms (Fig. 2). Therefore, Set C represents the DEGs that are (1) differentially expressed between closely-related OTUs that vary in a single waveform characteristic, and (2) are consistently differentially expressed among all OTUs that share that waveform feature. We focus our attention on Set C: We retrieved GO term definitions from QuickGO [73] and descriptions of gene function of the functional annotations from UniProt [74]; and to facilitate the discussion, we classified the more interesting genes in Set C into “general” functional classes, or themes. All source code necessary to perform the methods described here is provided in a GitHub repository: http://github.com/msuefishlab/paramormyrops_rnaseq.
Results
Overall Results
Overall alignment rates to the Paramormyrops kingsleyae reference transcriptome ranged from 28-74% (>375 million sequenced reads in total, 50% aligned), with no clear differences among OTUs (Supplemental File 1). On inspection, we concluded that these rates are a consequence of the presence of overrepresented sequences from rRNA, mtDNA and bacterial contamination in the RNA-seq reads.
Fig. 3 shows a heatmap of pairwise correlations of gene expression for 24,960 genes across all 11 samples. Our ten DGE comparisons detected a range of 16,420-19,273 expressed genes. Intersection of these lists resulted in a non-redundant list of 20,197 genes expressed in EO across all DGE comparisons. We found that 3,274 (16%) were differentially expressed in at least one comparison, and expression patterns across all OTUs were highly correlated (Pearson’s r > 0.89, Fig. 3). Despite this, correlation values were higher among recognized OTUs, except for the P. sp. ‘magnostipes type II’ 6768 sample (Fig. 3). Thus, we excluded comparisons with the P. sp. ‘magnostipes type II’ OTU from the informative comparisons for Set A’.
Set A: Differential Expression Analysis
We found between 489-1542 DEGs (50-128 enriched GO terms) in every comparison except P. sp. ‘magnostipes type I’ vs P. sp. ‘magnostipes type II’, which had only nine DEGs with seven enriched GO terms (Table 2). Supplemental File 2 provides a tabular list of DEGs for each comparison, and Supplemental File 3 provides a tabular list of enriched GO terms for each comparison.
We chose the phylogenetically most informative comparisons (see methods) to construct Set A’, which are indicated in Table 2. We found: 507 DEG and 69 enriched GO terms comparing P. kingsleyae (N-type) vs P. sp. ‘SN3’ (EOD duration); 1322 DEG and 77 enriched GO terms comparing P. kingsleyae (P-type) vs P. sp. ‘magnostipes type I’ (waveform polarity); and 530 DEG and 75 enriched GO terms comparing P. kingsleyae (N-type) vs P. kingsleyae (P-type) (waveform complexity).
Set B: Expression Based Clustering
For each EOD feature (n=3), we grouped OTUs by phenotype (Table 1), and calculated normalized expression values for Set A genes (n = 3,274). For each EOD feature, we selected genes exhibiting a greater than four-fold difference in averaged, normalized expression between phenotypes to construct Set B. The expression profiles of the genes in the clusters for each EOD feature, along with the enriched GO terms for Biological Process and Cellular Component, are shown in Figs. 4–6. Supplemental File 4 lists the identities of these DEG and Supplemental File 5 lists their enriched GO terms for all three GO ontologies.
Contrast of waveform duration identified 181 DEG and 40 enriched GO terms. 121 of the DEG were upregulated in the short EOD phenotype (Fig. 4A, purple lines). These genes were enriched with GO terms that include ‘extracellular space,’ ‘extracellular region,’ ‘muscle contraction,’ and ‘phosphatidylinositol phosphorylation’ (Fig. 4B, purple bars), whereas 61 genes were upregulated in samples with long EODs (Fig. 4A, yellow lines). These genes were enriched with GO terms like ‘negative regulation of cation transmembrane transport,’ ‘regulation of voltage-gated calcium channel activity,’ and ‘neuropeptide hormone activity’ (Fig. 4B, yellow bars).
Contrast of waveform polarity identified 147 DEG and 35 enriched GO terms. We found 40 upregulated genes (Fig. 5A, grey lines) in individuals with small penetrations. These genes were enriched with GO terms such as ‘response to mechanical stimulus,’ ‘extracellular matrix organization,’ and ‘collagen trimer’ (Fig. 5B, grey bars). In large penetrations phenotype, we found 107 upregulated genes (Fig. 5A, salmon lines). These genes were enriched with GO terms that included ‘sarcomere organization,’ ‘plasma membrane bounded cell projection morphogenesis,’ ‘calcium ion binding,’ and ‘structural constituent of cytoskeleton’ (Fig. 5B, salmon bars).
Finally, contrast of waveform complexity identified 174 DEG and 16 enriched GO terms. We detected 82 upregulated genes in individuals with biphasic EODs (Fig. 6A, blue lines). These genes were enriched with GO terms like ‘positive regulation of canonical Wnt signaling pathway,’ ‘mucopolysaccharide metabolic process,’ and ‘integral component of membrane’ (Fig. 6B, blue bars). We detected 92 upregulated genes in individuals with triphasic EODs (Fig. 6A, orange lines). These genes were enriched with GO terms that included ‘cell-matrix adhesion,’ ‘regulation of ion transmembrane transport,’ and ‘neuronal cell body’ (Fig. 6B, orange bars).
Set C: Intersection of Phylogenetically Informative Comparisons and Expression Based Clustering
We were motivated to obtain the DEGs and enriched GO terms that were most likely to be associated with divergent EOD phenotypes. To obtain this list, we constructed Set C, which is the intersection of Set A’ and Set B described above.
Contrast of waveform duration identified 105 DEG and 11 enriched GO terms. 79 of the DEG were upregulated in the short EOD phenotype, and 26 genes were upregulated in samples with long EODs. Contrast of waveform polarity identified 144 DEG and 9 enriched GO terms. We found 38 upregulated genes in individuals with small penetrations and 106 upregulated genes in individuals with large penetrations. Finally, contrast of waveform complexity identified 71 DEG and 4 enriched GO terms. We detected 46 upregulated genes in individuals with biphasic EODs and 25 upregulated genes in individuals with triphasic EODs. These results are further detailed in Table 3. The enriched GO terms in Set C for Biological Process and Cellular Component are emphasized in boldface for waveform duration (Fig. 4B), polarity (Fig. 5B), and complexity (Fig. 6B). The individual genes in Set C are listed in Supplemental File 6 and their associated GO terms are listed in Supplemental File 7.
Discussion
It has long been recognized that changes in gene expression can affect phenotypic differences between species [18], and RNA-seq has facilitated the study of this relationship [19]. The goal of this study was to determine DEGs associated with divergent EOD features within Paramormyrops. Expression patterns across all OTUs were highly correlated (Pearson’s r > 0.89, Fig. 3) and we detected differential expression of only 3,274 (16%) genes between any two OTUs. Thus, a major finding of this study is that EO gene expression is overall quite similar across Paramormyrops species with divergent EODs, and relatively few genes are associated with phenotypic differences in EOD waveform between OTUs. Given generally high levels of genetic distances observed between geographically proximate populations of these Paramormyrops species [35,75] this was a somewhat unexpected finding.
Despite the relatively small number of DEGs compared to the total number of genes expressed in the EO, we constructed our analysis to extract genes that were highly associated with particular phenotypes. Set A’ represents a formal statistical test that contrasted OTUs. Each comparison contrasted samples from OTUs that were divergent in only one EOD feature, while minimizing phylogenetic distance. The tradeoff of this approach is small sample sizes without biological replication and potentially confounding variables, such as collection sites. Set B followed the opposite approach by minimizing the problem of biological replication at the expense of confounding phylogenetic relatedness and phenotypic heterogeneity. To mitigate this, we constructed Set C, which represents genes and GO terms that are differentially expressed/enriched between closely related OTUs divergent in only one phenotypic character and that are also consistently differentially expressed/enriched among representatives with similar EOD phenotypes. As such, we focus our discussion on the results of Set C. We classified the genes in Set C into “general” functional classes, or themes; and focus our attention on the ones that relate to the known morphological underpinnings of waveform duration (Table 4), polarity (Table 5), and complexity (Table 6). These functional classes were genes related to the ECM, cation homeostasis, lipid metabolism, and cytoskeletal and sarcomeric genes.
Waveform Duration
Several researchers have implicated the role of ion channels in the evolution of duration changes in mormyrid signals [53–57]. We did not find evidence of large changes in expression of potassium or sodium channels between short-duration P. sp. ‘SN3’ and other Paramormyrops species. We note that, while no differential expression of potassium or sodium channels was discovered, the GO term ‘regulation of voltage-gated calcium channel activity’ was enriched in long EOD phenotypes, although there were no genes annotated to this GO term common to the lists of DEGs from Sets A’ and B. In particular, individuals with short EODs upregulate two calcium-binding proteins: parvalbumin-2, and parvalbumin-2-like. Parvalbumins are highly expressed in skeletal muscle where they sequester calcium after contraction, thus facilitating relaxation. Frequently, muscles with fast relaxation rates express higher levels of parvalbumins [76]. The upregulated parvalbumin genes we detected may somehow be related to shorter EODs by sequestering calcium at a faster rate, which could affect action potentials directly or indirectly through calcium-activated ion channels.
Previous studies have demonstrated that changes in EOD duration result from changes in electrocyte ultrastructure. The two major phases of the EOD waveform are caused by action potentials generated by the anterior and posterior faces [47]. Bennett [48] demonstrated a relationship between EOD duration and increased surface membrane area, and Bass et al. [50] showed that differences in surface area are more readily noticeable on the anterior face. Membrane surface area is increased by folding the electrocyte membrane into papillae and other tube-like invaginations [77]. Testosterone can induce increases in EOD duration in several mormyrids [49,50,78,79], and it also increases membrane surface area, either particularly on the anterior face [50] or on both anterior and posterior faces [80]. A larger surface area may increase the capacitance of the membrane, thus delaying spike initiation [49,50]. Consequently, genes involved in the synthesis of membranes could influence EOD duration.
We found the most prominent differences in gene expression between the EOD duration phenotypes in genes that code for cytoskeletal, sarcomeric, and lipid metabolism proteins (Table 4). We emphasize the last group: no lipid metabolism genes were upregulated in individuals with long EODs, whereas samples with short EODs upregulated protein EFR3 homolog B-like (a regulator of phosphatidylinositol 4-phosphate synthesis), retinoic acid receptor responder protein 3-like (hydrolysis of phosphatidylcholines and phosphatidylethanolamines), PTB domain-containing engulfment adapter protein 1-like (modulates cellular glycosphingolipid and cholesterol transport), phosphatidylinositol 3-kinase regulatory subunit gamma-like, (PI3K, which phosphorylates phosphatidylinositol), and proto-oncogene c-Fos-like (can activate phospholipid synthesis), and showed enrichment of the GO term ‘phosphatidylinositol phosphorylation.’ We hypothesize that these genes are involved in the surface proliferation of the electrocytes membranes.
Additionally, each mormyrid electrocyte stands embedded in a gelatinous mucopolysaccharide matrix (the ECM) separated from neighboring electrocytes by connective tissue septa (Fig. 1) [34], and the membrane surface invaginations are coated by the same ECM that surrounds the electrocytes [50,77]. Hence, differences in surface invaginations could also be reflected in differences in the expression of genes whose products interact with the ECM. We detected none of these genes upregulated in individuals with long EODs, whereas those with short EODs upregulated three: hyaluronidase-5-like (breaks down hyaluronan), collagenase 3-like (plays a role in the degradation of ECM proteins), and fibroblast growth factor binding protein 1 (acts as a carrier protein that releases fibroblast-binding factors from the ECM storage).
Overall, our results identify genes that may affect EOD duration through membrane rearrangements, which could be coupled with changes in the interaction with the ECM and the expression of cytoskeletal and sarcomeric genes. Since this waveform feature is modulated by testosterone, this androgen could facilitate the study of these suggested genetic underpinnings under more rigorously controlled circumstances.
Waveform Polarity
The number [47] and diameter [34,43] of stalk penetrations are positively correlated with the magnitude of P0. This phenomenon is exemplified by P. sp. ‘magnostipes type I’, which has the largest P0 in the OTUs examined in this study, giving the EOD the appearance that it ‘inverted’ relative to other EODs. This OTU has numerous, large diameter penetrations, whereas P. kingsleyae (P-type) has relatively fewer, small diameter penetrations (Fig. 1). These large structural differences may influence the electrocyte’s connection with the surrounding ECM, and our results support this: the phenotypes of waveform polarity exhibited differences in the expression of genes that interact with the extracellular space. We found no such genes upregulated in individuals with large penetrations, whereas in samples with small penetrations two GO terms were enriched: ‘extracellular matrix organization’ and ‘response to mechanical stimulus,’ and three genes were upregulated: collagen alpha-1(I) chain-like, collagen alpha-1(X) chain-like, and hyaluronidase-5-like (breaks down hyaluronan).
OTUs with large penetrations also exhibited higher expression of genes related to cytoskeletal, sarcomeric, and lipid metabolism proteins than do individuals with smaller penetrations (Table 5). This includes the GO terms ‘heart contraction,’ ‘sarcomerogenesis,’ and ‘cardiac myofibril assembly,’ representative of the genes myosin XVB, heat shock protein HSP 90-alpha 1 (has a role in myosin expression and assembly), tubulin beta chain, tubulin beta 2A class IIa, fibroblast growth factor 13 (a microtubule-binding protein which directly binds tubulin and is involved in both polymerization and stabilization of microtubules), spindle and kinetochore associated complex subunit 3 (a component of the kinetochore-microtubule interface), troponin C, slow skeletal and cardiac muscles, protein tilB homolog (may play a role in dynein arm assembly), and cysteine and glycine rich protein 3 (codes for the Muscle LIM Protein (MLP), which is implicated in various cytoskeletal and sarcomeric macromolecular complexes [81–83] and is a positive regulator of myogenesis [84]). In contrast, samples with small penetrations only show upregulation of the genes myosin light chain 3-like and desmin-like.
We hypothesize that the differences in the number and diameter of penetrations that drive variation in EOD waveform polarity require changes to the electrocyte’s cytoskeletal and membrane properties. These arrangements may be necessary for the electrocytes body to adjust to the increased volume displacements imposed by larger penetrations; or alternatively, they may be a prerequisite for penetrating stalks to enlarge. Our observations support and elaborate on the hypothesis that sarcomeric proteins (which are non-contractile in mormyrids) may function as a means of cytoskeletal support and structural integrity in mormyrid electrocytes [85].
Waveform Complexity
Waveform complexity refers to the number of phases present in an EOD, and mormyrid EODs vary in the presence of a small head negative phase (P0). The presence or absence of P0 in the EOD depends on the anatomical configuration of the electrocytes: P0-present (or triphasic) EODs are produced by electrocytes that are innervated on the anterior face and have penetrating stalks (Pa), whereas P0-absent (or biphasic) EODs are produced by electrocytes innervated on the posterior face and lack penetrating stalks (NPp) [42,43,47,48,51,52]. Developmental studies of the adult EO suggest that Pa electrocytes go through a NPp stage before developing penetrations [86,87]. This motivated the hypothesis that penetrations develop by the migration of the posteriorly innervated stalk system (NPp stage) through the edge of the electrocyte, and that the interruption of this migration represents a mechanism for Pa-to-NPp reversals [88,89].
Our data indicates several DEGs that implicate specific cytoskeletal and ECM reorganizations between triphasic and biphasic EODs (Table 6). We observed differential expression of several genes associated with the polymerization of F-actin. In triphasic individuals, we observe upregulation of the gene capping protein regulator and myosin 1 linker 3 (CARMIL3); although this gene is little studied, its paralog CARMIL2 enhances F-actin polymerization. Also upregulated is the gene cysteine and glycine rich protein 3 (MLP protein, see Waveform Polarity). In contrast, the biphasic phenotype upregulated the genes protein-methionine sulfoxide oxidase mical2b-like (promotes F-actin depolymerization), transmembrane protein 47-like (may regulate F-actin polymerization), 5’-AMP-activated protein kinase subunit gamma-2-like (could remodel the actin cytoskeleton), and FYVE, RhoGEF and PH domain-containing protein 4-like (regulates the actin cytoskeleton). Thus, biphasic and triphasic EODs display several DEG, with potentially diverging outcomes, that influence the cellular internal structure.
We hypothesize that electrocytes with penetrating stalks (which produce triphasic EODs) require cytoskeletal arrangements to produce penetrations, perhaps related to increasing F-actin, to maintain their structural integrity. Similar to what we propose under waveform polarity, these arrangements may be necessary for the electrocyte body to adjust to the penetrations; or alternatively, they may be a prerequisite for penetrations to occur.
We also observed differential expression in a number of proteins expressed in the ECM. In biphasic OTUs, we found the GO term ‘mucopolysaccharide metabolic process’ to be enriched, and two upregulated copies of the gene inter-alpha-trypsin inhibitor heavy chain 3, which may act as a binding protein between hyaluronan and other ECM proteins. In triphasic individuals, we found the GO term ‘cell-matrix adhesion’ enriched, and the upregulated genes epiphycan-like, which may play a role in cartilage matrix organization, and two copies of ependymin-like (these paralogs are ortholog to the zebrafish ependymin-like gene epdl2).
The two ependymin-like genes are among the most differentially expressed genes between biphasic and triphasic OTUs (500-fold more highly expressed in triphasic individuals in the comparison from Set A’, Supplemental File 2). Although expressed in many tissues and with little amino acid similarity, all ependymin-related proteins are secretory, calcium-binding glycoproteins that can undergo conformational changes and associate with collagen in the ECM. They have been involved in regeneration, nerve growth, cell contact, adhesion and migration processes [90]. We hypothesize that ependymin-related proteins, and potentially some of the other ECM proteins highly expressed in triphasic individuals, are part of the “fibrillar substance” that lies between the stalk and the electrocyte body in individuals with penetrating electrocytes [50]. Notably, the P. kingsleyae genome assembly, which is based on a biphasic individual, contains three paralogs of epdl2, whereas the osteoglossiform Scleropages formosus only has one, suggesting the intriguing possibility that this gene may have been duplicated in Paramormyrops or in mormyrids. Ependymin-related paralogs have been proposed as suitable targets to experimentally test gene subfunctionalization [91].
Altogether, our results for EOD waveform complexity suggest that the conformation of the cytoskeleton and the expression of proteins secreted to the ECM are important elements of the stalk penetrations, which generate triphasic EODs.
Concluding Remarks
Two previous studies focused on DGE between EOs in another mormyrid species adaptive radiation/explosive diversification (genus Campylomormyrus). Both focused on comparisons between the species C. tshokwe (long duration) and C. compressirostris (short duration). The first study performed a canditate gene approach to quantify the expression patterns of 18 sodium and potassium homeostasis genes between the EOs of the two species [55], whereas Lamanna et al. [92] used RNA-seq to simultaneously compare gene expression between EOs of these species. While we did not observe differences in expression of any of the potassium channels reported by Nagel et al. (2017), we note that Lamanna et al. (2015) reported differential expression of metabolic pathways related genes, particularly fatty acid metabolism, and ion transport and neuronal function (subcluster 4). While we found no overlap in the identities of any specific genes in our study, we note that our analysis also detected differential expression of lipid metabolism related genes when comparing EODs of different duration.
The widespread differential expression within Paramormyrops of calcium-related genes (Supplemental File 6) emphasizes a much-needed area of future research. Calcium is known to be necessary for the proper electrocyte repolarization in some gymnotiform species [93], but it may not be as important in others [94]. Few studies have addressed calcium physiology in mormyrids: calcium-related proteins have been reported as differentially expressed in EO vs skeletal muscle in Campylomormyrus [92] and in Brienomyrus brachyistius [85]. As electrocytes do not contract, calcium may act in electrocytes as an important second messenger or cofactor, participate in interactions with the ECM, and/or to contribute to the electrocyte’s electrical properties through interaction with voltage gated ion channels.
A second notable pattern in our results is the unusual degree to which mormyrid electrocytes retain expression of some sarcomeric genes, which has been noted in several studies [58,85,92,95,96]. The role these proteins serves in electrocytes is presently unknown; however, results indicate that they are highly differentially expressed between Parmormyrops with different EOD waveforms. This strongly suggests that sarcomeric proteins could play an important role in the conformational changes required to develop and sustain penetrations.
Finally, the biochemical composition and function of the ECM in electrocytes is poorly understood. Our analysis identifies differential expression in ECM-related genes across the Paramormyrops, associated with each of the three EOD features studied. At least two of these genes (inter-alpha-trypsin inhibitor heavy chain 3 and hyaluronidase-5-like), distributed across all three EOD features, interact with hyaluronan. Hyaluronan is a type of mucopolysaccharide and a major component of some soft tissues and fluids [97]. Therefore, we propose that hyaluronan is an important constituent of the ECM in mormyrid fish. In addition, the electrocyte-ECM interactions should be an important area of future investigation, as they are likely to influence electrocyte shape, electrical properties, and potentially the morphology of penetrations and surface membrane invaginations.
To conclude, this study examined the expression correlates of a hyper-variable phenotype in a rapidly diversified genus of mormyrid electric fish. We examined DGE between taxa exhibiting variability along three major axes of variation that characterize EOD differences within Paramormyrops and among mormyrids: duration, polarity, and complexity. We found that gene expression in EOs among closely related species is largely similar, but patterns of DGE between EOs is primarily restricted to four broad functional sets: (1) cytoskeletal and sarcomeric proteins, (2) cation homeostasis, (3) lipid metabolism and (4) proteins that interact with the ECM. Our results suggest specific candidate genes that are likely to influence the size, shape and architecture of electrocytes for future research on gene function and molecular pathways that underlie EOD variation in mormyrid electric fish.
Competing interests
The authors declare they have no competing interests.
Ethics approval and consent to participate
All research protocols involving live fish were approved by the Michigan State University and Cornell University Institutional Animal Care and Use Committees.
Author Contributions
JRG designed the experiment, collected and identified specimens, performed library construction and sequencing, and oversaw data analysis. ML performed QC analysis, designed and implemented the data analysis procedure, and performed examination of gene ontology and gene function. Both authors contributed to writing the manuscript.
Acknowledgements
This work was supported by the National Science Foundation (1455405: JRG, PI) and a grant from the Cornell University Center for Vertebrate Genomics. The authors thank Bruce Carlson, Matt Arnegard, Carl Hopkins, Roger Afene, Jean Danielle Mbega and Marie-Francois Eva for assistance with specimen collection in Gabon. The authors also acknowledge the Michigan State University Institute for Cyber-Enable Research for use of their high-performance computing infastructure, and CENREST Gabon for logistical support and collection permits.
References
- 1.↵
- 2.↵
- 3.↵
- 4.
- 5.
- 6.↵
- 7.↵
- 8.↵
- 9.↵
- 10.
- 11.
- 12.
- 13.↵
- 14.↵
- 15.
- 16.
- 17.↵
- 18.↵
- 19.↵
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.↵
- 28.↵
- 29.↵
- 30.↵
- 31.↵
- 32.↵
- 33.↵
- 34.↵
- 35.↵
- 36.↵
- 37.↵
- 38.↵
- 39.↵
- 40.↵
- 41.↵
- 42.↵
- 43.↵
- 44.↵
- 45.↵
- 46.↵
- 47.↵
- 48.↵
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.↵
- 54.
- 55.↵
- 56.
- 57.↵
- 58.↵
- 59.↵
- 60.↵
- 61.↵
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.↵
- 75.↵
- 76.↵
- 77.↵
- 78.↵
- 79.↵
- 80.↵
- 81.↵
- 82.
- 83.↵
- 84.↵
- 85.↵
- 86.↵
- 87.↵
- 88.↵
- 89.↵
- 90.↵
- 91.↵
- 92.↵
- 93.↵
- 94.↵
- 95.↵
- 96.↵
- 97.↵