ABSTRACT
Evolutionary transitions to a social lifestyle in insects are associated with lineage-specific changes in gene expression, but the key nodes that drive these regulatory changes are largely unknown. We tested the hypothesis that changes in gene regulation associated with social evolution are facilitated by lineage-specific function of microRNAs (miRNAs). Genome scans across 12 bee species showed that miRNA copy-number is mostly conserved and not associated with sociality. However, deep sequencing of small RNAs in six bee species revealed a substantial proportion (20-35%) of detected miRNAs had lineage-specific expression in the brain, 24-72% of which did not have homologs in other species. Lineage-specific miRNAs disproportionately target lineage-specific genes, and have lower expression levels than shared miRNAs. The predicted targets of lineage-specific miRNAs are enriched for genes related to social behavior in social species, but they are not enriched for genes under positive selection. Together, these results suggest that novel miRNAs may contribute to lineage-specific patterns of social evolution. Our analyses also support the hypothesis that many new miRNAs are purged by selection due to deleterious effects on mRNA targets, and suggest genome structure is not as influential in regulating bee miRNA evolution as has been shown for mammalian miRNAs.
INTRODUCTION
Eusociality has evolved several times in the hymenopteran insects. In its most basic form, this lifestyle involves reproductive queens living with their worker daughters who forgo direct reproduction to cooperatively defend the nest, care for their siblings, and forage for the colony. Due to the complex nature of this lifestyle, the evolution of eusociality likely requires modification of molecular pathways related to development, behavior, neurobiology, physiology, and morphology [1]. The evolution of eusociality is thus expected to involve both genetic changes and changes in the way the genome responds to the environment [2]. It is therefore unsurprising that recent studies aimed at identifying the genomic signatures of eusocial evolution in insects have found that social species share an increased capacity for gene regulation [3, 4]. Evidence for this comes from signatures of rapid evolution of genes involved in transcription and translation, gene family expansions of transcription factors, and increasing potential for DNA methylation and transcription factor binding activity in conserved genes. Interestingly, while these types of regulatory changes are common to independent origins and elaborations of eusociality, the specific genes and regulatory elements involved are unique to each lineage [3–5]. This suggests that lineage-specific processes are influential in generating new patterns of gene regulation that contribute to social behavior.
Small, non-coding RNAs such as microRNAs (miRNAs) may be an important source of regulatory novelty associated with the evolution of phenotypic complexity, including eusociality. MiRNAs are short (∼21-22 nt), noncoding RNAs that regulate protein-coding genes through post-transcriptional binding to the 3’ UTR region of messenger RNA (mRNA) transcripts, in most cases preventing translation or causing mRNA degradation [6]. Each miRNA can target dozens to hundreds of mRNAs, and may therefore regulate multiple gene networks [6, 7]. Like mRNAs, the majority of miRNAs are generated via Pol II transcription, and are spatially- and temporally-specific in their expression patterns. Thus, complex changes in gene regulation can be achieved with relatively minor changes in miRNA expression. This can result in major phenotypic shifts or fine-tuning of phenotypic optimization [6]. Novel miRNAs can originate in a variety of genomic features, including exons and introns of protein-coding and non-coding RNA genes, transposable elements, pseudogenes, or intergenic regions, and thus emerge and disappear over relatively rapid timescales [8–11]. It is thus not surprising that expansion of the miRNA repertoire is associated with the evolution of morphological complexity across the tree of life [9,12–14].
There is accumulating evidence for a role of miRNAs in regulating the social lives of insects. While most miRNAs seem to be conserved in major lineages of insects [15, 16], expression levels vary across individuals performing different social functions, such as between workers performing different tasks in honey bees [17–19]. MiRNAs may also play a role in caste determination, as queen- and worker-destined larvae express different sets of miRNAs throughout development in honey bees [20–22] and bumble bees [23]. Additionally, miRNAs play a role in regulating some physiological correlates of social behavior in honey bees, including activation of ovaries in queens and workers [24] and response to the reproductive protein vitellogenin [25]. Together, these studies suggest that miRNAs could play a role in the evolution of eusociality through their effects on gene regulatory networks involved in socially-relevant traits. A rigorous test of this hypothesis requires comparisons of the presence, expression, and function of miRNAs across related species that vary in social organization.
Here we present a comprehensive comparative analysis of miRNAs across bee species with variable social organization. These solitary and social species shared a common ancestor ∼75-110 mya [26]. Previous comparative studies of miRNAs associated with eusociality have relied on the parasitoid wasp, Nasonia vitripennis, as a solitary comparison [16]. This is a far more distant relative of the social insects, sharing a last common ancestor with bees nearly 200 mya [26]. Moreover, the parasitoid lifestyle of N. vitripennis is different from that of eusocial bees in nearly every way. In contrast, solitary bees, such as the ones we include in this study, share many features of their natural history with the presumed ancestors from which eusociality evolved.
We first looked for miRNA repertoire expansions associated with eusociality by scanning 12 bee genomes for known miRNAs, and statistically evaluating copy-number of each miRNA type with regard to differences in sociality in a phylogenetic model. We then described and compared miRNAs expressed in the brains of six bee species from three families that include repeated origins of eusociality. We tested the hypothesis that changes in gene regulatory function associated with social evolution are facilitated by lineage-specific miRNA regulatory function with two predictions: (1) If lineage-specific miRNAs are assimilated into ancestral gene networks, their predicted target genes should be ancient and conserved. (2) If lineage-specific miRNAs play a role in social evolution, their predicted targets should be enriched for genes that function in social behavior (e.g., caste-biased expression) or genes that are under selection in social species.
MATERIALS AND METHODS
Sample Acquisition
We used adult females from six bee species for our study (Fig. 1). These species include both eusocial and solitary species with well-studied behavior from three families. Megalopta genalis samples were collected on Barro Colorado Island, Panama in 2015 and exported to the U.S.A. (permit SEX/A-37-15). Nomia melanderi samples were collected in Touchet, WA, U.S.A. with permission from land owners. Megachile rotundata samples were collected from Logan, UT, U.S.A. on the Utah State University campus. Bombus impatiens samples were collected from a commercial colony purchased from BioBest. Bombus terrestris samples were collected from colonies obtained from Pollination Services Yad-Mordechai, Kibbutz Yad-Mordechai, Israel. Apis mellifera samples were collected from hives in Urbana-Champaign, IL or the Tyson Research Field Station, MO, U.S.A. A. mellifera and Bombus samples were workers. M. genalis samples were lab-reared females. All other samples were reproductive females. All samples were collected into liquid nitrogen and stored at −80 °C until dissection.
RNA Isolation and Sequencing
Head capsules from B. impatiens, M. genalis, and N. melanderi samples were dissected after incubation in RNALater ICE (Ambion) to remove the entire brain. We used the mirVana miRNA Isolation kit with phenol (Ambion) to isolate total RNA from individual brains. Total RNA was sent to the University of Illinois Roy J. Carver Biotechnology Center for library preparation with the Illumina TruSeq Small RNA Sample Preparation kit and sequencing. Libraries were pooled, quantitated by qPCR, and sequenced on one lane for 51 cycles on a HiSeq 2500.
Whole brains of A. mellifera, B. terrestris, and M. rotundata were dissected from frozen heads. Total RNA from individual brains was isolated using TRIzol reagent (Thermo Fisher Scientific). All subsequent small-RNA sequencing steps were performed by the Genome Technologies Access Center at Washington University, using their Illumina TruSeq pipeline. Total RNA samples were size fractionated and multiplexed. Single-end small RNA libraries were prepared using the SMARTer kit (Clontech). Up to 12 barcoded libraries from a single species were run on a single Illumina HiSeq 2500 lane.
miRNA Discovery and Quantification
We used miRDeep2 [27] to identify and quantify miRNAs expressed in the brains of each species, with a three-step process of miRNA detection to identify homologous miRNAs between species. First, we gathered a set of mature miRNA sequences previously described in other insect species (Table S1). Reads for each sample were quality filtered (minimum length 18, removal of reads with non-standard bases), adapter-trimmed, and aligned to the species’ genome (Table S2) with the mapper.pl script. Approximately 60-84% of reads successfully mapped.
We then identified known and novel miRNAs in each sample with the miRDeep2.pl script, using our curated set of insect miRNAs (Table S1) as known mature sequences. We followed this with the quantifier.pl script to generate sets of known and novel miRNAs in each sample, along with quantified expression information for each. We then filtered novel miRNAs in each species according to the following criteria: no rRNA/tRNA similarities, minimum of five reads each on the mature and star strands of the hairpin sequence, and a randfold p-value < 0.05. Randfold evaluates whether the predicted secondary structure of a potential pre-miR is more stable than expected for randomized sequences [27].
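The filtering step described above can be sketched as a simple predicate. This is a minimal illustration, not the pipeline used in the study: the `Candidate` record and its field names are hypothetical and do not correspond to miRDeep2 output columns, and the example values are invented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record for one novel miRNA candidate. Field names are
    # illustrative only and are not miRDeep2 column names.
    name: str
    mature_reads: int
    star_reads: int
    randfold_p: float
    rrna_trna_hit: bool


def passes_filter(c: Candidate, min_reads: int = 5, max_p: float = 0.05) -> bool:
    """Apply the three criteria listed in the text to one candidate."""
    return (
        not c.rrna_trna_hit              # no rRNA/tRNA similarity
        and c.mature_reads >= min_reads  # at least five reads on the mature strand
        and c.star_reads >= min_reads    # at least five reads on the star strand
        and c.randfold_p < max_p         # randfold p-value below 0.05
    )


# Invented example values.
candidates = [
    Candidate("cand-1", 120, 9, 0.01, False),
    Candidate("cand-2", 40, 2, 0.01, False),     # too few star-strand reads
    Candidate("cand-3", 15, 6, 0.20, False),     # randfold p-value too high
    Candidate("cand-4", 300, 50, 0.001, True),   # rRNA/tRNA similarity
]
kept = [c.name for c in candidates if passes_filter(c)]
print(kept)
```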
We used these filtered miRNAs in a second run of detection and quantification, adding the mature sequences of novel miRNAs from each species to our set of known miRNAs, and repeated the pipeline above. This allowed detection of homologous miRNAs (based on matching seed sequences) that are not represented in miRBase across our species. We applied the same set of filtering criteria as for our first run.
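Homology by seed match can be illustrated with a short sketch. It assumes the seed is nucleotides 2-8 of the mature sequence, a common convention that the text above does not specify. The names and sequences are toy values for illustration, not results from this study.

```python
from __future__ import annotations


def seed(mature: str, start: int = 2, end: int = 8) -> str:
    """Return the seed of a mature miRNA (1-based, inclusive positions).

    Assumption: the seed spans nucleotides 2-8. DNA-style input (T) is
    normalized to RNA (U) before slicing.
    """
    return mature.upper().replace("T", "U")[start - 1:end]


def group_by_seed(mirnas: dict[str, str]) -> dict[str, list[str]]:
    """Group miRNA names by shared seed sequence."""
    groups: dict[str, list[str]] = {}
    for name, seq in mirnas.items():
        groups.setdefault(seed(seq), []).append(name)
    return groups


# Toy input: hypothetical names and sequences.
toy = {
    "speciesA-novel-1": "UGAGGUAGUAGGUUGUAUAGUU",
    "speciesB-novel-7": "AGAGGUAGUAGCUUGAAUAGCU",
    "speciesA-novel-2": "UCACAGCCAGCUUUGAUGAGCU",
}
groups = group_by_seed(toy)

# Candidates whose seed is not shared with any other entry.
unmatched = sorted(names[0] for names in groups.values() if len(names) == 1)
print(groups)
print(unmatched)
```

In this toy input the first two entries share a seed and would be treated as homologs, while the third has no seed match.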
Some of the novel miRNAs may exist in the genomes of other bees, even if they are not expressed. We used blastn (-perc_identity 50 -evalue 1e-5) to search for homologous precursor miR (pre-miR) sequences in 12 bee genomes (Table S2) for each of the novel miRNAs without a matching seed sequence.
miRNA Localization
We used bedtools intersect [28] to find overlap of miRNAs with predicted gene models (Table S3), and repetitive element repeatmasker [29] annotations from previously established repeat libraries [4,30–33].
Target Prediction
We extracted potential target sites 500 bp downstream from each gene model using bedtools flank and getfasta [28], following previous studies [21] and an average 3’ UTR length of 442 nt in Drosophila melanogaster [34]. Target prediction was run with miRanda v3.3 [35] (minimum energy threshold −20, minimum score 140, strict alignment to the seed region [-en -20 -sc 140 -strict]) and RNAhybrid v2.12 [36] (minimum free energy threshold −20). We kept only miRNA-target gene pairs that were predicted by both programs with p < 0.01.
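The coordinate logic behind this step can be sketched as follows. This is an illustration under stated assumptions, not the bedtools, miRanda, or RNAhybrid implementation: coordinates are taken to be 0-based and half-open (BED style), "downstream" is taken to be strand-aware, and the function names and gene and miRNA labels are hypothetical. How the p < 0.01 threshold was applied is not modeled here; the input sets are assumed to contain only pairs that already passed each program's thresholds.

```python
from __future__ import annotations


def downstream_flank(start: int, end: int, strand: str,
                     chrom_len: int, size: int = 500) -> tuple[int, int]:
    """Return the interval `size` bp downstream of a gene model.

    Coordinates are 0-based, half-open. The interval is clipped to the
    chromosome (or scaffold) boundaries.
    """
    if strand == "+":
        return end, min(end + size, chrom_len)
    if strand == "-":
        return max(start - size, 0), start
    raise ValueError(f"unknown strand: {strand!r}")


def consensus_pairs(a: set[tuple[str, str]],
                    b: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Keep only (miRNA, gene) pairs predicted by both programs."""
    return a & b


# Hypothetical predictions from two target-prediction programs.
miranda_like = {("mir-a", "gene1"), ("mir-a", "gene2"), ("mir-b", "gene3")}
rnahybrid_like = {("mir-a", "gene2"), ("mir-b", "gene3"), ("mir-b", "gene4")}
shared = consensus_pairs(miranda_like, rnahybrid_like)

print(downstream_flank(1000, 2000, "+", 10_000))
print(downstream_flank(1000, 2000, "-", 10_000))
print(sorted(shared))
```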
Target Age and Functional Enrichment
Gene ages were determined using orthogroups from OrthoDB v9 [37], which includes A. mellifera, B. impatiens, B. terrestris, and M. rotundata. Gene sets of M. genalis and N. melanderi were mapped to Metazoa-level (330 species) orthogroups. Gene ages were inferred from the taxonomic breadth of all species in each orthogroup: Vertebrata (≥ one vertebrate), Metazoa (≥ one non-arthropod, non-vertebrate metazoan), Arthropoda (≥ one non-insect arthropod), Insecta (≥ one non-holometabolous insect), Holometabola (≥ one non-hymenopteran holometabolous insect), Hymenoptera (≥ one non-Aculeata hymenopteran), Aculeata (≥ one non-Apoidea Aculeata), Apoidea (≥ one other Apoidea). Genes without identifiable orthologs were labeled ‘Unique’.
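The age assignment can be sketched as a lookup over ordered categories. This assumes that the categories are checked in the order listed above, so that the broadest qualifying category wins, and that each other species in an orthogroup has already been labeled with the category it would qualify the gene for. Both are simplifying assumptions, and the function name is hypothetical.

```python
from __future__ import annotations

# Age categories from oldest (broadest) to youngest, in the order given
# in the text.
AGE_ORDER = [
    "Vertebrata", "Metazoa", "Arthropoda", "Insecta",
    "Holometabola", "Hymenoptera", "Aculeata", "Apoidea",
]


def gene_age(member_categories: set[str]) -> str:
    """Return the oldest category represented in a gene's orthogroup.

    `member_categories` holds, for the other species in the orthogroup,
    the category each one qualifies the gene for (for example, a
    vertebrate ortholog contributes "Vertebrata"). A gene with no
    identifiable orthologs is labeled "Unique".
    """
    for age in AGE_ORDER:
        if age in member_categories:
            return age
    return "Unique"


print(gene_age({"Holometabola", "Vertebrata"}))
print(gene_age({"Apoidea", "Aculeata"}))
print(gene_age(set()))
```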
Gene Ontology (GO) terms for each species were derived from a previous study [4], with the exception of B. impatiens, for which GO terms were assigned based on reciprocal blastp (evalue < 1e-5) between two sets of gene models (OGS v1.2 and OGS v1.0). Functional enrichment was performed with the GOstats package [38] in R [39]. We included terms enriched at an unadjusted p < 0.1.
Enrichment tests of lineage-specific miRNA targets with previous studies
For each species, brain or head gene expression datasets related to socially relevant phenotypes (e.g., caste) and genes under positive selection were compared against targets of lineage-specific miRNAs. The complete list of included studies and gene lists are in Table S4. For M. genalis caste data, RNAseq reads from Jones et al. [40] (NCBI PRJNA331103) were trimmed using Trimmomatic (v. 0.36) [41] and aligned to an unpublished genome assembly of M. genalis (NCBI PRJNA494872) using STAR (v. 2.5.3) [42]. Reads were mapped to gene features using featureCounts in the Subread package (v. 1.5.2) [43]. Remaining differential expression analysis followed the methods of Jones et al. [40] using edgeR [44].
We also tested datasets identifying genes under selection in bee species or across social lineages of bees for enrichment of lineage-specific miRNA targets (Table S4). When necessary, we used reciprocal blastp (evalue < 10e-5) to identify orthologous genes across species, and only genes with putative orthologs were included in the analysis. Hypergeometric tests (using phyper in R) were used to test for significance of over- or under-enrichment between each pair of lists. The representation factor (RF) given represents the degree of overlap relative to random expectation (RF=1). RF is calculated as RF=x/E, where x is the number of genes in common between two lists and E is the expected number of shared genes (E = nD/N, where n is the number of genes in list 1, D is the number of genes in list 2, and N is the total number of genes.)
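The representation factor and the over-enrichment test can be sketched with the standard library alone. The symbols x, n, D, and N follow the definitions above; the function names and the example counts are invented for illustration. The upper-tail probability computed here is analogous to what `phyper(x - 1, n, N - n, D, lower.tail = FALSE)` returns in R, which is one common way to run such a test and not necessarily the exact call used in the study.

```python
from math import comb


def representation_factor(x: int, n: int, D: int, N: int) -> float:
    """RF = x / E, with E = n * D / N the overlap expected by chance."""
    expected = n * D / N
    return x / expected


def hypergeom_upper_tail(x: int, n: int, D: int, N: int) -> float:
    """P(overlap >= x) when list 2 (D genes) is drawn at random from N
    genes, n of which belong to list 1."""
    total = comb(N, D)
    upper = min(n, D)
    favorable = sum(comb(n, k) * comb(N - n, D - k) for k in range(x, upper + 1))
    return favorable / total


# Invented counts: 100 genes in total, 10 in list 1, 20 in list 2,
# 5 genes shared between the two lists.
rf = representation_factor(x=5, n=10, D=20, N=100)
p_over = hypergeom_upper_tail(x=5, n=10, D=20, N=100)
print(rf, p_over)
```

An RF above 1 with a small upper-tail probability indicates more overlap than expected by chance; depletion (RF below 1) would be tested with the lower tail instead.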
miRNA Diversification
We performed genome scans for small RNAs across 12 bee genomes (Table S2) using covariance models implemented with Infernal cmsearch using the gathering threshold for inclusion (--cut_ga) [45] to find all Rfam accessions in each bee genome. We used Spearman rank correlations to test for significant associations between miRNA copy-number and social biology. We categorized each species as solitary, facultative basic eusocial, obligate basic eusocial, or obligate complex eusocial following Kapheim et al. [4]. We used the ape package [46] in R [39] to calculate phylogenetic independent contrasts for both social organization and miRNA copy-number, cor.test to implement the Spearman’s rank correlation, and p.adjust with the Benjamini-Hochberg method to correct for multiple comparisons.
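The multiple-comparison step can be illustrated with a small sketch of the Benjamini-Hochberg adjustment. It is written in Python for illustration and is intended to mirror the logic of the BH method, not to reproduce the R code used in the study. The function name and the p-values are invented.

```python
from __future__ import annotations


def bh_adjust(pvals: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    # Visit p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # rank of this p-value in ascending order
        # Scale by m / rank, then enforce monotonicity and the cap at 1.
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values, one per miRNA family tested.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
print(adj)
```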
RESULTS
Low levels of miRNA copy-number variation among bee genomes
Our genome scans revealed very little variation in copy-number of most miRNAs among bee genomes. Of the 50 miRNA Rfam accessions, half had the same number of copies (1 or 2) in all 12 bee genomes (Table S5). The mean copy-number across all miRNAs in all bee genomes was 1.19 ± 0.74. One exception was miR-1122, for which we found 70 copies in M. genalis, but no copies in the other species. We did not find any significant associations between miRNA copy-number and social organization (Table S5).
Expressed miRNA diversity in bee brains
We identified 97-245 known and novel miRNAs expressed in the brains of each of our six species (Table S6). The majority of these were located in intergenic regions or introns (Table 1). Each species had at least one miRNA that originated from exons of protein-coding genes and repetitive DNA (Table 1). Most of the overlap between miRNA precursors and repetitive DNA corresponded to uncharacterized repeat elements, with very few overlaps with well-characterized transposons or retrotransposons (Table 1).
Most of the detected miRNAs in each species had known homologs in at least one other species. However, each species had a substantial proportion (20-35%) of detected miRNAs with lineage-specific expression in the brain (Table 1; Fig. 1A), 24-72% of which did not have any known homologs in other species (Table 1). We defined lineage-specific miRNAs as those with lineage-specific expression and for which no seed match with a known mature miRNA was identified (Table 1, columns 6-7), because these show the most evidence of being real miRNAs that are unique to a particular species. (Sequence similarity of pre-miRs in the genome of other bee species is not sufficient evidence that a mature miRNA is transcribed.) Lineage-specific miRNAs had significantly lower expression levels compared with homologous miRNAs in each species (t-tests: A. mellifera, p = 3.81e-05, B. impatiens, p = 0.003, B. terrestris, p = 0.006, M. genalis, p = 0.0003, M. rotundata, p = 8.00e-05, N. melanderi, p = 0.02).
Lineage-specific miRNAs were localized both within genes and intergenically. The proportion of lineage-specific miRNAs that were intra- or intergenic was similar to that of miRNAs with homologs for every species except N. melanderi, for which a disproportionate number of lineage-specific miRNAs were intragenic (χ2 = 4.78, p = 0.03). Genes that serve as hosts for intragenic lineage-specific miRNAs were not significantly older (i.e., more likely to belong to orthogroups shared with vertebrates) than would be expected by chance in any species (hypergeometric tests: p = 0.14-0.76). Across all species, genes that serve as hosts for intragenic lineage-specific miRNAs were not significantly older than genes hosting miRNAs with known homologs (χ2 tests: p = 0.05-0.89).
Of the miRNAs with homologs, most were expressed in all six species, but we detected one miRNA (miR-305) that was expressed in the brains of each of the social, but not the solitary, species. Although we did not detect expression of miR-305 in the two solitary species, M. rotundata and N. melanderi, genome scans of each species against the Rfam database suggested all bee species have one copy of this miRNA (Table S5). Predicted targets of miR-305 differed across species. Oxysterol (OG EOG091G0FV2) was a common target among the (social) Apidae bees, but was not among the targets for M. genalis. However, arylformamidase (OG EOG091G0KT8), which is also involved in lipid metabolism and transport, was a predicted target in M. genalis. Synaptobrevin (OG EOG091G0MPE), which is involved in synaptic plasticity and neurotransmitter release, was a predicted target of miR-305 in B. impatiens.
Lineage-specific miRNAs preferentially target lineage-specific genes and genes with caste-biased expression, but not genes under selection
If lineage-specific changes in gene regulatory function associated with social evolution are facilitated by novel miRNAs inserted into existing gene networks, then predicted targets of lineage-specific miRNAs should be highly conserved and enriched for genes with known functions in social evolution. Most of the predicted mRNA targets of lineage-specific miRNAs were highly conserved and belonged to orthogroups shared by vertebrates (Fig. 2; Table S8). However, most genes in each genome are also highly conserved, and there was not a significant enrichment for conserved genes among predicted targets of lineage-specific miRNAs, beyond what would be expected by chance (hypergeometric test: p > 0.99). We did, however, find a significant enrichment for genes unique to each species among the predicted targets of lineage-specific miRNAs (hypergeometric tests: A. mellifera – RF = 1.51, p = 5.44e-5; B. impatiens – RF = 1.28, p = 0.02; B. terrestris – RF = 1.78, p = 1.90e-6; M. rotundata – RF = 1.79, p = 0.0002; M. genalis – RF = 1.62, p = 1.48e-12; N. melanderi – RF = 1.78, p = 9.02e-5), indicating that novel miRNAs are more likely to target novel genes than would be expected by chance (Fig. 2; Table S8).
We found mixed support for the prediction that novel miRNAs should target genes that function in social behavior and evolution. The predicted targets of lineage-specific miRNAs were enriched for genes differentially expressed between castes in the social Apidae (A. mellifera and B. terrestris), but not Halictidae (M. genalis) (Fig. 3; Table S4). In A. mellifera, this included genes upregulated in the brains of reproductive workers, compared with sterile workers (hypergeometric test: RF = 3.4, p = 0.007) and queens (hypergeometric test: RF = 1.6, p = 0.015) [48], as well as genes upregulated in the brains of foragers compared with nurses (hypergeometric test: RF = 2.8, p = 0.011) [49]. However, there was no significant enrichment for genes differentially expressed between nurse and forager honey bee brains in a later study (hypergeometric test: p = 0.09) [50]. In B. terrestris, we found significant overlap between the predicted targets of lineage-specific miRNAs and genes that are upregulated in workers, compared to queens (whole body, including brain; hypergeometric test: RF = 2, p = 0.013). We did not find significant overlap with genes differentially expressed in the brains of nurses and foragers (hypergeometric test: p = 0.103) [51] or between reproductive and sterile worker brains (hypergeometric test: p = 0.39) [52], but these were much more limited gene sets. To our knowledge, there are no studies of gene expression differences between B. impatiens castes, so we could not evaluate target overlap with caste-biased genes in this species. We did not find significant enrichment for caste-biased genes in the brains of the facultatively eusocial M. genalis (hypergeometric test: p = 0.25).
Contrary to our prediction, targets of lineage-specific miRNAs were not significantly enriched for genes under selection in any species. We assessed overlaps between genes undergoing positive directional selection in A. mellifera [53], B. impatiens [54], M. genalis [33], and N. melanderi [32] and the predicted targets of lineage-specific miRNAs in each species. There was no significant enrichment for targets of lineage-specific miRNAs with genes under positive directional selection in any species (Table S4). In fact, genes under selection in the halictid bees were significantly depleted for targets of lineage-specific miRNAs (hypergeometric test: M. genalis – RF = 0.2, p = 4.28e-10; N. melanderi – RF = 0.3, p = 5.59e-4). We also assessed overlaps with genes previously found to be under positive selection in social species, compared to solitary species [4, 55], but found only marginally significant overlap [4] or depletion [55] with predicted targets of lineage-specific miRNAs in one species (hypergeometric tests: M. genalis – RF = 1.9, p = 0.053; RF = 0, p = 0.05; Table S4).
DISCUSSION
Eusociality is a major evolutionary innovation that requires regulatory changes in a wide range of molecular pathways [1]. We tested the hypothesis that miRNAs play a role in the evolution of eusociality via their regulatory effects on gene networks by comparing miRNA expression in four social and two solitary bee species from three families. Our results provide several lines of support for this hypothesis.
We identified a single miRNA (miR-305) that was expressed exclusively in the brains of the social bees in our study. The presence of this miRNA in the solitary bee genomes suggests that an evolutionary shift in expression pattern has accompanied at least two independent origins of eusociality in bees. This miRNA coordinates Insulin and Notch signaling in D. melanogaster, and both of these pathways are important regulators of social dynamics in insects [56–60]. Interestingly, this miRNA is also upregulated in worker-destined compared to queen-destined honey bee larvae, and may thus play a role in caste differentiation [22]. Further investigation with additional social and solitary species is necessary to determine how this miRNA may influence social behavior across species.
We focused attention on miRNAs for which no mature miRNAs with seed matches were detected in any other species, because these have the potential to influence the lineage-specific patterns of gene regulatory changes previously shown to influence social evolution [3, 4]. We hypothesized that if novel miRNAs are inserted into existing gene networks that become co-opted for social evolution, they should target genes that are highly conserved across species. Instead, we find that the targets of lineage-specific miRNAs are enriched for lineage-specific genes, while genes belonging to ancient orthogroups were not more likely to be targets than expected by chance. This suggests that novel miRNAs co-evolve with novel genes, as has been shown for the evolution of cognitive function in humans [61]. Previous work in honey bees has shown that taxonomically-restricted genes play an important role in social evolution. Expression of taxonomically-restricted genes is significantly biased toward glands with specialized functions for life in a social colony (e.g., the hypopharyngeal gland and the sting gland) [62], and toward genes that are upregulated in workers [63]. Thus, it is reasonable to expect that new miRNAs targeting new genes could have important social functions.
Alternatively, it is possible that new miRNAs targeting lineage-specific genes are transient and will be purged by natural selection because they are less integrated into existing gene networks [10,64,65]. Emergent miRNAs are expected to initially have limited expression to mitigate potential deleterious effects on the protein-coding genes they target. Thus, lineage-specific miRNAs with low levels of expression may be in the process of being purged and may not have accumulated gene targets with important functions [9, 10]. Evidence for this model comes from primates [66] and flies [11, 67]. Likewise, we find that lineage-specific miRNAs are expressed at significantly lower levels than those with at least one homolog in another species. A purging process could explain why there are large differences in the numbers of miRNAs detected in even closely related species (e.g., the two Bombus species). Functional analysis of lineage-specific genes in additional tissues and life stages will help to resolve their roles in social evolution.
We find support for the prediction that lineage-specific miRNAs should target genes with social function in the Apidae (e.g., honey bees and bumble bees), but not the Halictidae (M. genalis). One explanation for this pattern is technical. We define genes with social functions as those that are differentially expressed among castes. The genetic basis of social behavior has been much better studied in honey bees and bumble bees than in any other species, and the set of genes known to function in sociality is thus richer for apids than for halictids. Further, not all genes that function in social behavior are expected to be differentially expressed in the brains of different castes, and our analysis is thus likely to exclude some important genes.
Nonetheless, our results reflect differences in the antiquity and degree of social complexity, and thus caste-biased gene expression patterns, between apid and halictid bees. Eusociality has a deeper origin in the Apidae than in Halictidae [47, 68], and thus more time has accumulated for associated changes in miRNA regulation to evolve. Unlike for honey bees and bumble bees, which cannot live outside of social colonies, eusociality is facultative in M. genalis. As such, caste traits are not fixed during development, and females who served as non-reproductive workers can become reproductive queens if given the opportunity [69]. This flexibility is reflected in the magnitude of differences in brain gene expression patterns between queen and worker honey bees (thousands of genes [48]) and M. genalis (dozens of genes [40]). Previous research suggests that miRNAs increase their functional influence over evolutionary time [10,11,65,66,70,71]. Thus, emergent miRNAs are more likely to target genes with social function due to chance alone in species with increased social complexity and a larger set of caste-biased genes. Consistent with this explanation, regulatory relationships between miRNAs and genes with caste-biased expression were not found among two other social insect species with reduced social complexity [72].
An additional explanation for these differences in the function of lineage-specific miRNAs concerns the role of miRNAs in gene regulatory networks. One of these roles is to stabilize regulatory relationships in the face of environmental variation, thus canalizing phenotypes during development [9,73–75]. This is likely to be more important in species with obligate eusociality, such as the honey bees and bumble bees for which caste determination is canalized, than in species like M. genalis, where plasticity of phenotypes related to eusociality is maintained in totipotent females.
Contrary to their effects on genes with socially-differentiated expression patterns, lineage-specific miRNAs showed no evidence for preferential targeting of genes under positive selection, either within or across species. Instead, we find these emergent miRNAs are less likely than expected by chance to target genes under positive selection in the two halictid bees. A potential explanation for this pattern is that genes adaptively targeted by miRNAs tend to be under purifying selection to maintain the regulatory relationship between the miRNA and target, thus preventing gene mis-expression [76–78]. This selective constraint is likely to be most significant in the 3’ UTR region, where miRNA binding sites are located.
A more likely explanation involves the hypothesized pattern of miRNA origins and assimilation, as proposed by Chen and Rajewsky [10]. This model suggests that new miRNAs are likely to have many targets throughout the genome due to chance. Most of these initial miRNA-target regulatory relationships are likely to have slightly deleterious effects, and would be quickly purged through purifying selection. These deleterious effects could be particularly strong for target genes undergoing positive selection, because changes in the functional regulation of these genes are likely to have significant fitness consequences. Also, genes under positive selection are undergoing rapid evolution, and thus may be more likely to “escape” control by errant miRNAs. Indeed, it is easier for mRNAs to lose miRNA target binding sites, which typically require exact sequence matches, than to gain them [10]. Thus, emergent miRNAs may not be expected to target adaptively or fast evolving genes, regardless of their role in social evolution.
The evolution of eusociality depends on many different tissues and physiological processes, and brain-specific expression patterns are not likely to be representative of the complete role of individual miRNAs in social behavior. Some or all of the predicted miRNA-gene relationships we identified may have evolved to support traits in other cell types or processes unrelated to sociality. Additional sequencing of miRNA and mRNA across tissue-types and stages of development in social and solitary species is necessary to provide a comprehensive assessment of the role of emergent miRNAs in social traits. Nonetheless, the brain is a major focus of research in social evolution because it is the primary source of behavioral and neuroendocrine output. Our results thus provide a good starting place for evaluating the role of miRNAs in lineage-specific processes in the evolution of social behavior.
Our analyses reveal important differences in patterns of miRNA evolution between bees and other species. For example, expansion in miRNA repertoire is associated with the evolution of animal complexity in a wide range of species [9,12,13]. The evolution of eusociality from a solitary ancestor is associated with increases in phenotypic complexity, and is considered to be one of the major transitions in evolution [79]. We therefore hypothesized that evolutionary increases in social complexity would be associated with expansions in the number of miRNAs found within bee genomes. To the contrary, we find that most bees have a single copy of previously identified miRNAs in their genomes. This is consistent with results of comparative genome scans across several ant species [3]. A recent study of miRNA diversity in insects found that morphological innovations such as holometabolous development were accompanied by the acquisition of only three miRNA families [15]. This suggests that insect evolution is not as reliant on major expansions of miRNA families as is evolution in other taxonomic groups.
Additionally, our characterization of lineage-specific miRNAs expressed in the brain of each species reveals that genome structure is not as influential in regulating bee miRNA evolution as has been shown for human miRNAs. Novel human miRNAs tend to arise within ancient genes that have multiple functions and broad expression patterns [65]. It is hypothesized that this increases the expression repertoire of emergent miRNAs, and thus facilitates persistence in the population [64, 65]. In only one species (N. melanderi) were lineage-specific miRNAs more likely to be localized intragenically than previously identified miRNAs; in the other five species, lineage-specific miRNAs did not differ from previously identified miRNAs in their genomic locations. This suggests emergence patterns for new miRNAs are unique to each lineage in bees. We also do not find a consistent pattern between young, emerging miRNAs and host gene age. There was no significant difference in the age of genes that serve as hosts for established versus lineage-specific miRNAs across species. This is despite the fact that the proportion of bee miRNAs located within introns (31-43%; Table 1) is similar to that in vertebrates (36-65%) [8]. However, the fact that 73-88% of miRNAs localized to genes are encoded on the sense strand suggests that they would benefit from host transcription, as is observed in vertebrates [8]. Additional research with insects will be necessary to identify general patterns of miRNA evolution in relationship to genome structure.
Our study identifies patterns of miRNA evolution in a set of closely related bees that vary in social organization. Our results highlight important similarities and differences in the emergence patterns and functions of miRNAs in mammalian and insect genomes. We find evidence that emergent miRNAs function in lineage-specific patterns of social evolution, perhaps through co-evolution of novel miRNAs and species-specific targets. We do not see an overall increase in the number of miRNAs in the genome or expressed in the brains of species with more complex eusociality. However, we do find evidence that the role of miRNAs in social evolution may strengthen with increasing social complexity, perhaps due to an increased need for canalization of caste determination or due to chance, as a function of an increased number of genes with caste-biased expression. Empirical tests of miRNA function across additional species with variable social organization will further improve our understanding of how gene regulatory evolution gives rise to eusociality.
DATA AVAILABILITY
Sequences are deposited at NCBI SRA as BioProject PRJNA559906. Code is available upon request.
AUTHOR CONTRIBUTIONS
K.M.K. conceived of the study and designed the experiments. K.M.K., E.S., G.B., and Y.B-S. collected the data. K.M.K., B.M.J., E.S., and R.M.W. analyzed the data. K.M.K. wrote the initial draft of the manuscript. All authors edited and approved the article for publication.
ACKNOWLEDGEMENTS
This work was supported by the USDA National Institute of Food and Agriculture [2018-67014-27542 to K.M.K.]; the Utah Agricultural Experiment Station, Utah State University [Project 1297, journal paper number 9239 to K.M.K.]; the U.S.-Israel Binational Science Foundation [BSF 2012807 to G.B. and Y.B-S.]; and the Swiss National Science Foundation [PP00P3_170664 to R.M.W.]. G.B. thanks the Clark Way Harrison Visiting Professor in Arts and Sciences that supported his stay at Washington University in St. Louis. Sequencing was performed at the University of Illinois Roy J. Carver Biotechnology Center. We thank the University of Utah High Performance Computing Center for computational time and assistance. Illustrations were created by J. Johnson (LifeSciences Studios). We thank G. Robinson for helpful feedback on an earlier draft of this manuscript.
Footnotes
ϕ Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA