Abstract
TRP melastatins (TRPMs) are best known as cold and menthol sensors, but are in fact broadly critical for life, from ion homeostasis to reproduction. Yet the evolutionary relationships among TRPM channels remain largely unresolved, particularly with respect to the placement of several highly divergent members. To characterize the evolution of TRPM and TRPM-like channels, we performed a large-scale phylogenetic analysis of >1,300 TRPM-like sequences from 14 phyla (Annelida, Arthropoda, Brachiopoda, Chordata, Cnidaria, Echinodermata, Hemichordata, Mollusca, Nematoda, Nemertea, Phoronida, Priapulida, Tardigrada, and Xenacoelomorpha), including sequences from a variety of recently sequenced genomes that fill what would otherwise be substantial taxonomic gaps. Our findings suggest: (1) the previously recognized TRPM family is in fact two distinct families, comprising canonical TRPM channels and an 8th major, previously undescribed family of animal TRP channel, TRP soromelastatin (TRPS); (2) two TRPM clades predate the last bilaterian-cnidarian ancestor; and (3) the vertebrate-centric trend of categorizing TRPM channels as 1-8 is inappropriate for most phyla, including other chordates.
Introduction
Transient receptor potential (TRP) channels are a superfamily of ion channel commonly characterized by their 6 transmembrane segments and broad sensory capacity. Among animals, TRP channels have been canonically divided into 7 families (Venkatachalam and Montell 2007; Peng, et al. 2015): TRPA (ankyrin), TRPC (canonical), TRPM (melastatin), TRPML (mucolipin), TRPN (no mechanoreceptor potential C), TRPP (polycystin, or polycystic kidney disease), and TRPV (vanilloid; and a proposed sister family, TRPVL). These TRP channels vary substantially, but TRPM has arguably diversified the most with respect to function, participating in at least cardiac activity (Yue, et al. 2015), magnesium homeostasis (Schlingmann, et al. 2007; Hofmann, et al. 2010), egg activation (Carlson 2019), sperm thermotaxis (De Blas, et al. 2009), cell adhesion (Su, et al. 2006), apoptosis (Driscoll, et al. 2017), inflammation (Ramachandran, et al. 2013), and most famously, cold (Bautista, et al. 2007; Turner, et al. 2016) and menthol (McKemy, et al. 2002; Peier, et al. 2002; Himmel, et al. 2019) sensing.
TRPM channels are also thought to be incredibly ancient, predating the emergence of metazoans (>1000Ma) (Peng, et al. 2015; Himmel, et al. 2019). In some species, single moonlighting proteins carry out several functions (e.g. Drosophila melanogaster Trpm), while in others, functions are compartmentalized in a set of diverse paralogues (e.g. human TRPM1-8). However, little is known about the evolutionary history of TRPMs, or to what degree channels are related across taxa. Our understanding of TRPM evolution is additionally clouded by the existence of several highly divergent putative TRPM channels with uncertain origins (Teramoto, et al. 2005; Peng, et al. 2015; Kozma, et al. 2018; Himmel, et al. 2019). Here, we made use of the rapidly growing body of genomic data in order to better characterize the evolution of the TRPM family.
Via a stringent screening process, we assembled a database of >1,300 predicted TRPM-like sequences from 14 diverse eumetazoan phyla (Fig. 1). In this database we gave particular attention to underrepresented taxa, and included TRP genes identified in a number of recently sequenced genomes (Table S1; including, but not limited to, acoel flatworm, moon jelly, and great white shark). Herein, we elucidate the evolutionary history and familial organization of both TRP melastatin and a previously unrecognized sister family that predates the Cnidaria-Bilateria split, TRP soromelastatin.
Materials & Methods
Data Collection & Curation
Starting with previously characterized TRPM sequences from human (NCBI CCDS), mouse (NCBI CCDS), Drosophila melanogaster (FlyBase), and Caenorhabditis elegans (WormBase), a TRPM-like protein sequence database was assembled by performing BLASTp against NCBI collections of non-redundant protein sequences, with D. melanogaster Trpm (isoform RE, FlyBase ID: FBtr0339077) serving as the bait sequence. In order to maximize useful phylogenetic information, only BLAST hits >300 amino acids in length with an E-value less than 1E-30 were retained. As we were interested in the origins of TRPM channels, and in less-studied taxa, only three tetrapod sequence-sets were included, from human, mouse, and chicken. In order to expand the taxa sampled, tBLASTn and BLASTp were used to search publicly available, genomically-informed gene models for 11 cnidarians, 2 xenacoelomorphs, 1 hemichordate, 1 nemertean, 1 phoronid, 2 agnathans, and 4 chondrichthyans (Table S1).
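The retention criteria above can be illustrated with a minimal Python sketch. It is not the code used in this study. The BlastHit record and its field names are hypothetical, and the sketch assumes that "length" refers to the full length of the hit protein rather than the length of the aligned region.

```python
from typing import List, NamedTuple


class BlastHit(NamedTuple):
    """Hypothetical record for one BLAST hit (field names are illustrative)."""
    subject_id: str
    subject_length: int  # assumed: full length of the hit protein, in amino acids
    evalue: float


MIN_LENGTH = 300    # hits must be longer than this many amino acids
MAX_EVALUE = 1e-30  # hits must have an E-value below this threshold


def retain_hit(hit: BlastHit) -> bool:
    """Return True if the hit satisfies both retention criteria."""
    return hit.subject_length > MIN_LENGTH and hit.evalue < MAX_EVALUE


def filter_hits(hits: List[BlastHit]) -> List[BlastHit]:
    """Keep only the hits that pass the length and E-value thresholds."""
    return [hit for hit in hits if retain_hit(hit)]
```

Both comparisons are strict, matching the wording ">300 amino acids" and "less than 1E-30".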
We used several methods in order to validate and improve the quality of the initial database. First, CD-HIT (threshold 90% similarity) was used to identify and remove duplicate sequences and predicted isoforms, retaining the longest isoform (Li and Godzik 2006; Huang, et al. 2010; Fu, et al. 2012). Phobius was then used to predict transmembrane topology (Käll, et al. 2004, 2007); sequences that did not have at least 6 predicted transmembrane (TM) segments, which is typical of TRP channels, were removed. Sequences with more than 6 predicted TM segments were analyzed via InterProScan (Mitchell, et al. 2019), and those with more than 1 ion-transport domain were removed. More than 90% of the remaining sequences contained a highly conserved glycine residue in the predicted TM domain (corresponding to D. melanogaster G-1049); the vast majority of those missing this residue had large gaps in an initial alignment and were subsequently removed.
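The topology-based part of this curation can be sketched as a simple decision rule. The Candidate record below is hypothetical. It assumes the number of predicted TM segments and the number of ion-transport domains have already been parsed from Phobius and InterProScan output, and it does not call either tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical summary of one sequence after topology and domain prediction."""
    seq_id: str
    n_tm: int                 # predicted TM segments (assumed parsed from Phobius)
    n_ion_transport: int = 1  # ion-transport domains (assumed parsed from InterProScan)


def passes_topology_filter(c: Candidate) -> bool:
    """Apply the two removal rules described in the text."""
    if c.n_tm < 6:
        # Fewer than 6 predicted TM segments: removed.
        return False
    if c.n_tm > 6 and c.n_ion_transport > 1:
        # Extra TM segments plus more than one ion-transport domain: removed.
        return False
    return True


def curate(candidates: List[Candidate]) -> List[Candidate]:
    """Return only the candidates that survive the topology filter."""
    return [c for c in candidates if passes_topology_filter(c)]
```

Only sequences with more than 6 predicted TM segments are checked for ion-transport domain count, following the order of steps given in the text.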
Searches for TRPS (ced-11-like), TRPN, and TRPC sequences followed the same protocol. For TRPS, sequences from Caenorhabditis elegans, Strigamia maritima, and Octopus vulgaris were used as bait. For TRPN and TRPC datasets, Drosophila melanogaster nompC (isoform PA, FlyBase ID: FBpp0084879) and Trp (isoform PA, FlyBase ID: FBpp0084879) served as bait sequences, respectively.
Principal Component Analysis
Principal component analysis (PCA) was used to help resolve protein families. TRPC, TRPN, and TRPM/TRPS database sequences were aligned by MAFFT. A pairwise sequence identity matrix was then computed, and PCA performed against it in Jalview (Waterhouse, et al. 2009). Data were exported from Jalview and visualized and edited in GraphPad Prism and Adobe Illustrator CS6.
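As an illustration of the general approach, the sketch below computes a pairwise percent-identity matrix from pre-aligned sequences and projects it onto its first principal components using NumPy. It is not Jalview's implementation. The gap-handling convention (columns with a gap in either sequence are skipped) and all function names are assumptions made for this example.

```python
import numpy as np


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned: list) -> np.ndarray:
    """Symmetric matrix of pairwise percent identities."""
    n = len(aligned)
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            m[i, j] = m[j, i] = percent_identity(aligned[i], aligned[j])
    return m


def pca(matrix: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project the rows of a matrix onto its leading principal components."""
    centered = matrix - matrix.mean(axis=0)
    # SVD of the column-centered matrix; U * S gives the component scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]
```

Each sequence is represented by its row of identities to all other sequences, so sequences with similar identity profiles receive similar coordinates and plot close together.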
Phylogenetic Tree Estimation
For the maximum likelihood approach, sequences were first aligned using MAFFT with default settings (Rozewicki, et al. 2019). Gap-rich sites and poorly aligned sequences were trimmed with TrimAl (Capella-Gutiérrez, et al. 2009). IQ-Tree (Nguyen, et al. 2014) was then used to generate trees by the maximum likelihood approach, using the best models automatically selected by ModelFinder (Kalyaanamoorthy, et al. 2017). Branch support was calculated by ultrafast bootstrapping (UFBoot, 2000 bootstraps) (Hoang, et al. 2017).
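A command-line workflow of this kind might be scripted roughly as follows. This sketch is an assumption-laden illustration, not the commands used in this study. It assumes the executables are named mafft, trimal, and iqtree, the output file names are hypothetical, the trimAl mode (-gappyout) is a guess because the text does not state which mode was used, and -bb is the ultrafast bootstrap option of IQ-TREE 1.x.

```python
import subprocess
from pathlib import Path


def build_commands(fasta: str, bootstraps: int = 2000) -> list:
    """Return (argv, stdout_path) pairs for alignment, trimming, and tree search.

    stdout_path is None when the tool writes its own output files.
    """
    stem = Path(fasta).stem
    aligned = f"{stem}.aln.fasta"      # hypothetical output name
    trimmed = f"{stem}.trimmed.fasta"  # hypothetical output name
    return [
        # MAFFT with default settings writes the alignment to stdout.
        (["mafft", fasta], aligned),
        # trimAl mode is an assumption; the text does not specify it.
        (["trimal", "-in", aligned, "-out", trimmed, "-gappyout"], None),
        # -m MFP: ModelFinder model selection followed by tree inference.
        # -bb: number of ultrafast bootstrap replicates (IQ-TREE 1.x syntax).
        (["iqtree", "-s", trimmed, "-m", "MFP", "-bb", str(bootstraps)], None),
    ]


def run_pipeline(fasta: str) -> None:
    """Run each step in order, stopping at the first failure."""
    for argv, stdout_path in build_commands(fasta):
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)
```

Separating command construction from execution makes it possible to inspect or log the exact commands before any tool is run.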
In order to test the alternative hypothesis that some trees formed due to long-branch attraction, gs2 was used to generate trees by the Graph Splitting method (Matsui and Iwasaki 2019). Branch support values were computed by the packaged edge perturbation method (EP, 2000 iterations). All trees were visualized and edited in iTOL and Adobe Illustrator CS6.
Homologue Prediction via Tree Reconciliation
In order to identify duplication events, TRPS and TRPM phylograms were reconciled using NOTUNG 2.9.1 (Durand, et al. 2006; Vernot, et al. 2008; Stolzer, et al. 2012). Edge weight threshold was set to 1.0, and the costs of duplications and losses were set to 1.5 and 1.0, respectively. In order to formulate the most parsimonious interpretation of the resulting trees, weak branches were rearranged (UFBoot 95 cutoff) against a cladogram based on an NCBI taxonomic tree, wherein we placed Xenacoelomorpha (represented by acoel flatworms) as the sister group to all other bilaterians (Cannon, et al. 2016), and Priapulida as an outgroup to all other ecdysozoans (Yamasaki, et al. 2015). All other polytomies were randomly resolved using the ape package in R (Paradis and Schliep 2018).
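To show how the duplication and loss costs translate into a parsimony score, the following sketch scores a reconciliation as a weighted sum of inferred events and picks the cheapest of several candidate scenarios. It is a simplified illustration of duplication-loss scoring, not NOTUNG's algorithm. The function names and the example scenarios are hypothetical.

```python
DUPLICATION_COST = 1.5  # cost per inferred duplication, as stated in the text
LOSS_COST = 1.0         # cost per inferred loss, as stated in the text


def reconciliation_cost(n_duplications: int, n_losses: int) -> float:
    """Weighted sum of duplication and loss events."""
    return n_duplications * DUPLICATION_COST + n_losses * LOSS_COST


def most_parsimonious(scenarios: dict) -> str:
    """Return the name of the scenario with the lowest total cost.

    scenarios maps a label to a (n_duplications, n_losses) tuple.
    """
    return min(scenarios, key=lambda name: reconciliation_cost(*scenarios[name]))


# Hypothetical example: one early duplication followed by four losses,
# versus three independent, lineage-specific duplications and no losses.
example = {
    "early duplication": (1, 4),            # 1 * 1.5 + 4 * 1.0 = 5.5
    "independent duplications": (3, 0),     # 3 * 1.5 + 0 * 1.0 = 4.5
}
```

Under these weights, most_parsimonious(example) favours the scenario with independent duplications, because avoiding four losses outweighs the cost of two additional duplications.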
Results
An ancient, unrecognized sister family to TRPM – TRP soromelastatin (TRPS)
Proteins within the same family typically have a high degree of sequence similarity, yet highly divergent TRPM-like proteins have been catalogued, a notable example being Caenorhabditis elegans cell death abnormal 11 (ced-11). The canonical C. elegans TRPMs gtl-1, gtl-2, and gon-2 share roughly 40% sequence identity with each other. However, ced-11, often considered a fourth C. elegans TRPM, shares only approximately 18% sequence identity with the 3 canonical paralogues. Given this substantial difference, it seemed plausible that ced-11, and like proteins, had been erroneously included in the TRPM family.
The TRPC and TRPN families are typically thought to be most closely related to TRPM (Peng, et al. 2015), and therefore constituted hypothetical homes for ced-11. ced-11, however, shares only 15% sequence identity with known C. elegans TRPC paralogues (trp-1 and trp-2), and 14% sequence identity with C. elegans TRPN (trp-4). Yet sequence identity between trp-4 and its TRPC counterparts is approximately 20%. In other words, ced-11 is less similar to TRPMs than TRPNs and TRPCs are to each other.
In order to clarify the relationship of ced-11-like proteins to canonical TRPM channels, we collected those sequences most similar to it from our initial TRPM-like sequence database, and phylogenetically characterized them. BLASTing our database with ced-11-like sequences recovered a number of sequences restricted to several protostome taxa and lancelets (Cephalochordata).
For any species with a ced-11-like protein, we assembled a database of putative TRPC and TRPN channel sequences. These sequences were then phylogenetically characterized alongside cnidarian, xenacoelomorph, insect (D. melanogaster), and human sequences. In the resulting tree, ced-11-like proteins formed a sister clade to the more traditional TRPM clade, the latter including cnidarian TRPM-like channels (Fig. 2 and Fig. S1).
Two competing hypotheses could explain these findings: (1) ced-11-like proteins constitute a distinct family of TRP channel which predates the cnidarian-bilaterian split, or (2) a variety of TRPM channels emerged independently in various taxa and diversified extremely rapidly, resulting in a clade which formed as a result of long-branch attraction, an artifact of many phylogenetic analyses (Bergsten 2005).
Hypothesis 2 appears highly unlikely. Most importantly, while C. elegans ced-11 itself has a relatively long branch, when qualitatively compared to other clades, the branches within the ced-11-like clade were not unusually long (Fig. 2 and Fig. S1). Additionally, principal component analysis of a pairwise sequence identity matrix revealed that ced-11-like sequences cluster together independent of TRPM-like sequences, suggesting they cluster in the phylogram due to sequence similarity (Fig. 3A). We tested the long-branch hypothesis by estimating trees which excluded Cnidaria and Xenacoelomorpha, which had particularly long branches and could serve to exacerbate long-branch attraction, were it present. The resultant phylogram still evidenced the split between ced-11-like and TRPM-like channels, with high branch confidence (Fig. S2). Moreover, we generated a phylogram by the Graph Splitting method, which is reported to be extremely robust when faced with the possibility of long-branch attraction in superfamily-level datasets (Matsui and Iwasaki 2019). This method likewise reproduced the ced-11-like-TRPM split with high edge perturbation branch support (Fig. S3).
These results strongly indicate that these two lineages diverged in or prior to the last cnidarian-bilaterian ancestor, and that ced-11-like proteins constitute an 8th family of metazoan TRP channel. We have thus named the ced-11-like family of TRP channels TRP soromelastatin (soro-, sister), or TRPS (Fig. 2 and Fig. 3).
The structure of TRPS channels suggests a SLOG- and Nudix-linked ancestor
While the function of TRPS channels remains unknown, domain prediction reveals that both TRPM and TRPS channels share an N-terminal SMF/DprA-LOG (SLOG) domain (which is hypothesized to function in ligand sensing) and a C-terminal ADP-ribose phosphohydrolase (Nudix) domain (Fig. 3B, top). These results indicate that the ancestral TRPM-TRPS channel was likely both SLOG- and Nudix-linked, and that the ankyrin repeats typical of TRPCs and TRPNs were lost prior to the TRPM-TRPS split. The TRPM alpha kinase domain (typical of human TRPM6 and TRPM7), however, appears to have arisen specifically in the TRPM lineage. These results are consistent with previous findings suggesting that the TRPM ancestor was Nudix-linked (Schnitzler, et al. 2008), but push the origin of this domain further back in evolutionary history.
A notable difference between TRPM and TRPS channels lies in the TRP domain, a highly conserved, hydrophobic region located C-terminally to the transmembrane domain of TRPC, TRPN, and TRPM channels (Venkatachalam and Montell 2007). Consensus sequences for TRPM and TRPS (Fig. 2), while identical in TRP box 1, are divergent in TRP box 2 and the intermediate TRP segment (Fig. 3B, bottom). The functional consequences of these changes, if any, are unknown.
The TRPS family is largely restricted to protostomes
Having established that these TRPS sequences constitute a distinct set of channels, we assembled a more complete TRPS sequence database and phylogenetically characterized the channel family. These data suggest that, among Eumetazoa, TRPS genes are only present in some protostomes and lancelets (Fig. 4 and Fig. S4). The lack of widespread conservation among deuterostomes (most notably vertebrates) and insects likely explains why the family had gone unnoticed until now.
TRPS was likely lost early in deuterostome evolution – among the ambulacrarians (echinoderms and hemichordates), and in early Olfactores (tunicates and vertebrates) following the olfactore-lancelet split (Fig. S5, left). A recent study suggests that Ambulacraria and Xenacoelomorpha might form sister clades (Philippe, et al. 2019) – if this is the case, it may be more likely that TRPS was lost early in so-called “xenambulacrarian” evolution (Fig. S5, right).
TRPS duplication appears to have been limited during early animal evolution. While the number of TRPS paralogues varies by species (Fig. S4), duplication events occurred only after major taxa emerged, independently in molluscs, nematodes, tardigrades, and chelicerates (including arachnids and horseshoe crabs). Molluscs have two TRPS paralogues, but present lack of evidence for lophotrochozoan TRPS outside of molluscs makes it difficult to predict at what point in spiralian evolution the duplication event occurred. The simplest explanation is that it occurred specifically in molluscs, and that a single TRPS copy was lost among other lophotrochozoan taxa.
Among Euarthropoda, TRPS appears in chelicerates and myriapods, but there is no evidence for TRPS in crustaceans, springtails, or insects, suggesting that the single arthropod TRPS was lost in Pancrustacea, conserved in Myriapoda, and expanded independently in Chelicerata (Fig. S6).
Two TRPM clades predate the Cnidaria-Bilateria split
We next phylogenetically characterized and reconciled TRPM sequences among major taxa. Each set of sequences was initially assessed alongside cnidarian, xenacoelomorph, Drosophila, and human sequences, and rooted with TRPS sequences.
The general consensus of these analyses indicates that the TRPM family is made up of two distinct clades, here and previously deemed αTRPM and βTRPM (Himmel, et al. 2019), which emerged prior to the Cnidaria-Bilateria split (Fig. 5 and Figs. S7-S16). What might have constituted a previously described basal clade can be almost wholly explained by the discovery of TRPS (Peng, et al. 2015; Himmel, et al. 2019). In some of our initial phylograms, a basal or separate clade did appear, yet it always included Xenacoelomorpha and was inconsistent in its topology across analyses (Figs. S7-S11), suggesting that Xenacoelomorpha acted as a phylogenetically unstable rogue taxon (Thomson and Shaffer 2010). In order to assess this possibility, we performed a second set of analyses which excluded Xenacoelomorpha. This resulted in trees with largely consistent topology despite differing taxon sampling, indicating that xenacoelomorph sequences are in fact problematic (Figs. S12-S16).
Due to the overwhelming consistency of trees with different taxon sampling, and the inconsistency seen in trees including Xenacoelomorpha, xenacoelomorph TRPM sequences were treated as rogue taxa. In addition, an extremely small subset of arthropod TRPMs (12 sequences restricted to chelicerates and crustaceans; Fig. S11 and Fig. S16) may be part of a previously described Crustacea-specific TRPM sub-family (Kozma, et al. 2018). These trees suggest that these sequences are βTRPM-like and related to a subset of cnidarian sequences, yet this Cnidaria-inclusive clade is not strongly evidenced in phylograms with different taxon sampling. Like the xenacoelomorph sequences, these sequences are left incertae sedis with respect to their evolutionary histories.
In summary, these results strongly support two duplication events predating the Cnidaria-Bilateria split: the TRPS-TRPM split and the α-β TRPM split.
TRPM1-8 expansion occurred early in vertebrate evolution, and constitutes a poor standard for TRPM family organization
The vertebrate TRPM1-8 expansion has been the focus of the majority of TRPM literature, and has been the principal basis for characterizing TRPM channels (Samanta, et al. 2018; Zhang, et al. 2018; Chen, et al. 2019). However, our trees indicate that the TRPM1-8 expansion occurred after the vertebrate-tunicate split, and before agnathans (jawless fish; lampreys and hagfish) split from the ancestor of all other vertebrates (Fig. 5, Fig. 6, and Fig. S13). Although immunohistochemical evidence has previously suggested that TRPM8 is present in teleost fish (Majhi, et al. 2015), we found no evidence of it in available sequences for ray-finned fish, cartilaginous fish, or agnathans (Fig. 6 and Fig. S17). While the simplest naive hypothesis would be that TRPM8 did not emerge until lobe-finned fish emerged, these phylogenetic analyses indicate that TRPM8 was independently lost in the indicated taxa, and conserved in the lobe-finned vertebrate lineage (including tetrapods).
Moreover, the 1-8 nomenclature may under-describe TRPMs among one of the most abundant vertebrate clades – the teleost fish. While basal ray-finned fish (e.g., Erpetoichthys calabaricus, the freshwater snakefish, or reedfish) have a TRPM topology that closely matches other vertebrates, the emergence of teleosts came with TRPM expansion. For example, there are as many as three teleost TRPM4 paralogues (Fig. S17).
Discussion
The evolutionary history of TRPM channels has been clouded by divergent sequences, making it uncertain if an ancestral clade of TRPMs had survived in species like C. elegans, or if these species had independently evolved rapidly changing TRPM paralogues (Teramoto, et al. 2005; Peng, et al. 2015; Kozma, et al. 2018; Himmel, et al. 2019). By taking advantage of the abundance of publicly available genomic data, we have demonstrated that the difficulty in phylogenetically characterizing TRPM channels is the result of an ancient, hidden family of channels that appeared before the Cnidaria-Bilateria split – the TRPS. By recognizing and characterizing this family, we now better understand not only the evolution and diversification of TRPM, but also the evolution of the broader TRP superfamily.
While some have been careful in describing TRP channels in taxon-specific ways (Saito and Shingai 2006; Hofmann, et al. 2010; Peng, et al. 2015), these findings are the strongest challenge to the pervasive, vertebrate-centric dogma that the TRPM family is constituted by 8 distinct paralogues organized into four subfamilies (Samanta, et al. 2018; Zhang, et al. 2018; Chen, et al. 2019). These results instead support that the eumetazoan TRPM family consists of two distinct radiations (αTRPM and βTRPM) which themselves predate the Cnidaria-Bilateria split. Importantly, these findings support that TRPM diversification occurred independently among cnidarians, ambulacrarians, lophotrochozoans, and other taxa, and that the TRPM1-8 expansion is specific to vertebrates. Based on these findings, we conclude that the TRPM1-8 nomenclature is at best evolutionarily uninformative (e.g. insect channels being simply TRPM1- or 3-like), and at worst grossly inaccurate (e.g. cnidarian TRPMs belonging to the TRPM2/8 subfamily) for describing members of this diverse family of critically important ion channels.
Funding
This work is supported by the National Institute of Neurological Disorders and Stroke at the National Institutes of Health (R01NS115209 to DNC); the National Institute of General Medical Sciences at the National Institutes of Health (R25GM109442-01A1); a GSU Brains & Behavior Fellowship (to NJH); a GSU Brains & Behavior Seed Grant (to DNC); and a Kenneth W. and Georgeanne F. Honeycutt Fellowship (to NJH).
Author contributions
Conceptualization and methodology, NJH; sequence collection, NJH; database curation, NJH and TRG; domain homology analysis, NJH and TRG; phylogenetic and other formal analyses, NJH; prepared the original draft, NJH; reviewed and edited the final draft, NJH, TRG, and DNC; visualization, NJH; supervision, DNC; funding acquisition, DNC.
Data and materials availability
The TRPN, TRPC, TRPM, and TRPS sequence databases have been deposited on Dryad in the FASTA format (doi:10.5061/dryad.kwh70rz03).
Acknowledgments
We thank Dr. Charles Derby and Mihika Kozma for critically assessing the original manuscript, and the PhyloPic repository, the source for many of the animal silhouettes used throughout (distributed in public domain). We also thank all the investigators who made sequence information public, which made this work possible.