Abstract
The extent to which horizontal gene transfer (HGT) has shaped eukaryote evolution remains an open question. Two recent studies reported four plant-like genes acquired through two HGT events by the whitefly Bemisia tabaci, a major agricultural pest (Lapadula et al. 2020; Xia et al. 2021). Here, we performed a systematic search for plant-to-insect HGT in B. tabaci and uncovered a total of 50 plant-like genes deriving from at least 24 independent HGT events. Most of these genes are present in three cryptic B. tabaci species, show a high level of amino-acid identity to plant genes (mean = 64%), are phylogenetically nested within plant sequences, are expressed, and evolve under purifying selection. The predicted functions of these genes suggest that most of them are involved in plant-insect interactions. Thus, substantial plant-to-insect HGT may have facilitated the evolution of B. tabaci towards adaptation to a large host spectrum. Our study shows that eukaryote-to-eukaryote HGT may be relatively common in some lineages, and it provides new candidate genes that may be targeted to improve current control strategies against whiteflies.
Introduction
Horizontal gene transfer (HGT) is the passage of genetic material between organisms by means other than reproduction. The patterns, mechanisms and vectors of HGT are well characterized in prokaryotes, in which these transfers are ubiquitous and a major source of innovation (1). In eukaryotes, HGT has long been considered either anecdotal, because multiple barriers should impede such transfers, or controversial, because reported cases could result from phylogenetic artifacts or contaminant sequences (2). Yet, dozens of recent studies have reported robust HGT events in various eukaryotic organisms, sometimes with demonstrated impact on host phenotype (3, 4). For example, two studies reported plant genes (two BAHD acyltransferases called BtPMaT1 and BtPMaT2 and ribosome inactivating proteins [RIPs]) acquired by horizontal transfer by the sweet potato whitefly Bemisia tabaci (5, 6). Whiteflies (family Aleyrodidae) are herbivorous hemipteran insects that are important agricultural pests because of their feeding habits and the many viruses they transmit to plants. Functional assays revealed that the BtPMaT1 protein detoxifies plant phenolic glucosides that are normally used by plants to protect themselves against insect herbivores. Thus, the acquisition of a plant gene by whiteflies through HGT enabled them to thwart plant defenses and may in part explain why these insects have become generalist phloem-feeders (6). Interestingly, whiteflies feeding on transformed tomato plants expressing BtPMaT1 gene-silencing fragments showed increased mortality and reduced fecundity (6). Therefore, identifying genes of plant origin in herbivorous insects can provide targets to engineer new pest control strategies.
Results and discussion
To assess the extent to which plant-to-insect HGT may have shaped interactions between whiteflies and their host plants, we conducted a systematic search for genes of plant origin in the B. tabaci MEAM1 genome (7). To this end, we performed last common ancestor (LCA) prediction for each B. tabaci predicted protein using MMseqs2 taxonomy (8) against the UniRef90 database from which Aleyrodidae proteins were excluded (Supplementary Materials and Methods). In this approach, the placement of a protein's LCA in plants (Viridiplantae) indicates potential HGT from plants to B. tabaci after the emergence of Aleyrodidae. The deeper the LCA, the more conserved the protein across the breadth of plant taxonomy. In total, we identified 65 B. tabaci proteins for which the LCA was inferred deeper than genus level in plants and which were hence considered potentially transferred from plants to Aleyrodidae. These included the two BAHD acyltransferase genes (BtPMaT1 and BtPMaT2) reported by Xia et al. (6) and the two RIPs identified by Lapadula et al. (5).
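The LCA-based filter described above can be illustrated with a minimal sketch. It assumes that the taxonomy assignment has already been converted into a per-protein lineage of (rank, name) pairs, which is a hypothetical simplification and not the actual MMseqs2 output format. The protein identifiers, lineages and the set of ranks treated as "genus level or shallower" are invented for illustration.

```python
# Ranks at or below genus level: an LCA this shallow is not retained.
# (Assumed rank names; the real pipeline may use a different vocabulary.)
SHALLOW_RANKS = {"genus", "subgenus", "species", "subspecies", "varietas"}


def is_plant_candidate(lineage):
    """Return True if the LCA lies within Viridiplantae, deeper than genus.

    `lineage` is a list of (rank, name) pairs ordered from the root of the
    taxonomy to the LCA node (hypothetical input format).
    """
    if not lineage:
        return False
    in_plants = any(name == "Viridiplantae" for _, name in lineage)
    lca_rank, _ = lineage[-1]
    return in_plants and lca_rank not in SHALLOW_RANKS


# Toy LCA assignments (invented identifiers and lineages).
toy_lca = {
    "prot_A": [("superkingdom", "Eukaryota"), ("kingdom", "Viridiplantae"),
               ("phylum", "Streptophyta")],
    "prot_B": [("superkingdom", "Eukaryota"), ("kingdom", "Metazoa"),
               ("phylum", "Arthropoda")],
    "prot_C": [("superkingdom", "Eukaryota"), ("kingdom", "Viridiplantae"),
               ("phylum", "Streptophyta"), ("family", "Solanaceae"),
               ("genus", "Solanum")],
    "prot_D": [("superkingdom", "Eukaryota")],
}

candidates = sorted(p for p, lin in toy_lca.items() if is_plant_candidate(lin))
print(candidates)
```

In this toy example only prot_A is kept: prot_B has an animal LCA, prot_C has an LCA at genus level, and prot_D has an LCA above Viridiplantae.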
To further characterize the taxonomic origin of candidate B. tabaci plant-derived genes, we grouped them in 41 clusters of orthologous proteins and used one representative sequence per cluster to perform similarity searches against all Aleyrodidae proteomes as well as against the UniRef90 protein database of streptophytes, that of eukaryotes without streptophytes and Aleyrodidae, and those of prokaryotes and viruses (Supplementary Materials and Methods). The ten best hits from each search, supplemented with the twenty most diverse hits against streptophytes, were used to build a multiple alignment which was subjected to phylogenetic analysis. Remarkably, while the phylogenetic signal observed in 17 out of 41 trees was in disagreement with plant-to-Aleyrodidae gene transfer (data not shown), the topology of 24 trees was consistent with this scenario. In 22 of the 24 trees, the B. tabaci proteins were nested within clades of proteins of streptophyte origin (Figure 1A and Supplementary Dataset), as in (5, 6). In the remaining two trees (Bta13103 and Bta14885), the B. tabaci proteins are sister to all proteins of streptophyte origin (Figure 1B). The topology of these two trees could be explained either by extensive sequence adaptation after transfer or by the donor plant lineage being extinct or under-represented in genome databases. In support of a plant origin, the level of amino-acid (aa) similarity between the corresponding 24 B. tabaci representative proteins and their plant homologues varied from 28% to 84% (mean=64%; Table 1) and homologous genes are virtually absent from the proteomes of non-Aleyrodidae insects. In addition, 23 clusters are shared between MEAM1 and at least one other B. tabaci cryptic species (MED and SSA-ECA) independently sequenced in different laboratories, indicating that they are bona fide B. tabaci genes, and not contaminating plant genes.
Altogether, these lines of evidence indicate that, as proposed for BAHD acyltransferases and RIPs (5, 6), these 24 representative genes were acquired by an ancestor of B. tabaci through plant-to-insect HGT.
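The assembly of the sequence set submitted to alignment for each representative protein (ten best hits per search plus twenty diverse streptophyte hits) can be sketched as follows. This is an illustration under stated assumptions, not the procedure detailed in the Supplementary Materials and Methods: hits are assumed to be dictionaries with hypothetical "id", "bits" and "taxon" fields, and "most diverse" is approximated here by greedily keeping the best-scoring hit from each distinct taxon. Small values of n and k are used on toy data.

```python
def top_hits(hits, n=10):
    """Return the n best hits by bit score (highest first)."""
    return sorted(hits, key=lambda h: h["bits"], reverse=True)[:n]


def diverse_hits(hits, k=20, key="taxon"):
    """Greedily keep the best hit from each distinct taxon, up to k hits.

    This is an assumed stand-in for "most diverse hits".
    """
    seen, picked = set(), []
    for h in sorted(hits, key=lambda h: h["bits"], reverse=True):
        if h[key] not in seen:
            seen.add(h[key])
            picked.append(h)
        if len(picked) == k:
            break
    return picked


def build_sequence_set(searches, n=10, k=20, diverse_db="streptophytes"):
    """Merge the top hits of every search, adding diverse hits for one database.

    `searches` maps a database label to its list of hits. Identifiers are
    returned without duplicates, in order of first appearance.
    """
    ids = []
    for db, hits in searches.items():
        chosen = top_hits(hits, n)
        if db == diverse_db:
            chosen = chosen + diverse_hits(hits, k)
        for h in chosen:
            if h["id"] not in ids:
                ids.append(h["id"])
    return ids


# Toy search results (invented identifiers, scores and taxa).
searches = {
    "streptophytes": [
        {"id": "s1", "bits": 300, "taxon": "Solanales"},
        {"id": "s2", "bits": 290, "taxon": "Solanales"},
        {"id": "s3", "bits": 250, "taxon": "Poales"},
        {"id": "s4", "bits": 200, "taxon": "Brassicales"},
    ],
    "prokaryotes": [
        {"id": "p1", "bits": 80, "taxon": "Bacillales"},
    ],
}

selected = build_sequence_set(searches, n=2, k=2)
print(selected)
```

With n=2 and k=2, the two best streptophyte hits (s1, s2) are complemented by s3, the best hit from a taxon not yet represented, before the single prokaryote hit is added.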
In total, the 24 corresponding Aleyrodidae orthogroups encompass 138 proteins, of which 133 have best hits against plant homologs including 46 from SSA-ECA, 35 from MED, and 50 from MEAM1 B. tabaci cryptic species as well as two from the outgroup species Trialeurodes vaporariorum (Tv) (Supplementary Dataset). Protein clustering combined with phylogenetic analysis shows that at least 24 independent HGT events are necessary to explain the presence of these genes in B. tabaci genomes. This number of plant-to-insect HGT events is remarkable given that most HGT events reported so far in animals (excluding horizontal transfer of transposable elements) involve genes of microbial origin, with only a few cases of gene transfers from eukaryotes to animals (9). It is noteworthy that fifteen out of the 24 clusters are shared between the MEAM1 and the SSA-ECA cryptic species, indicating that most HGTs likely occurred before the split of these species, i.e., between 19 and 40 million years ago (10, 11). Several transfers were apparently followed by gene duplications. The largest cluster, with predicted delta(12) fatty acid desaturase (FAD) function, comprises 38 Aleyrodidae members including 15 MEAM1 proteins. Eight of the FAD genes are organized in a genomic region spanning about 120 kb (scaffold 995: 912024-1039680) which is also observed in syntenic position at least in the SSA-ECA genome (scaffold 436: 909978-1039696), indicating that this hotspot evolved before the split of these species. In combination with phylogenetic analysis, the amplification of the FAD genes can be explained by a mixture of local and distal duplication events post HGT (Figure 1A). We also note that, in the FAD tree, the plant-related B. tabaci proteins are distributed on two distinct branches, and we cannot rule out that this reflects two independent transfer events of FAD genes (Figure 1A).
To search for evidence of functionality of plant-like genes in B. tabaci, we measured the evolutionary pressures acting on them by calculating pairwise ratios of non-synonymous over synonymous substitutions (Ka/Ks) between sequences within each of the 24 clusters. We found that 593 out of the 612 ratios we obtained were lower than 0.5, indicating that most if not all plant-like genes are evolving under purifying selection. Furthermore, we found transcripts supporting expression of at least one gene for 18 out of the 24 clusters, often in multiple independent transcriptome assemblies (Table 1). Together with conservation across cryptic species and evidence of gene duplication, these data indicate that most if not all B. tabaci plant-like genes are functional.
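The summary of pairwise Ka/Ks ratios reported above can be reproduced in spirit with a short sketch. It assumes that Ka and Ks have already been estimated for each sequence pair by a dedicated tool (the estimation method itself is not shown here). The toy (Ka, Ks) values are invented, and skipping pairs with Ks equal to zero is an assumption of this sketch rather than a documented step of the analysis.

```python
def kaks_ratios(pairs):
    """Return Ka/Ks for each (Ka, Ks) pair, skipping pairs with Ks == 0."""
    return [ka / ks for ka, ks in pairs if ks > 0]


def fraction_below(ratios, threshold=0.5):
    """Return the fraction of ratios strictly below the threshold."""
    return sum(r < threshold for r in ratios) / len(ratios)


# Toy pairwise estimates (invented values): (Ka, Ks) per sequence pair.
toy_pairs = [(0.05, 0.50), (0.10, 0.40), (0.30, 0.45), (0.02, 0.0)]

ratios = kaks_ratios(toy_pairs)
toy_fraction = fraction_below(ratios)
print(len(ratios), toy_fraction)

# Fraction corresponding to the counts reported in the text (593 of 612).
reported_fraction = 593 / 612
print(round(reported_fraction, 3))
```

A Ka/Ks ratio well below 1 is the usual signature of purifying selection. With the counts given in the text, about 97% of the pairwise ratios fall below the 0.5 threshold.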
Interestingly, most of these genes (21 out of 24 clusters) have a predicted function based on similarity with their nearest plant relative. As for the recently described malonyltransferase (6, 12), there is direct or indirect evidence that many of them are involved in plant-pathogen interactions (Table 1). For example, the delta(12) FADs are known to produce polyunsaturated linoleic acid in plants, which is involved in the response to pathogens (13). Likewise, members of subtilisin-like protease and pathogen-related protein families can both be induced following pathogen infection (14, 15). Ornithine decarboxylase is also worth noting as this enzyme synthesizes putrescine in plants, a polyamine involved in pathogen response (16) and suspected to be usurped by Hessian fly larvae to facilitate nutrient acquisition while feeding on wheat (17). In the same vein, a gene resembling nicotianamine synthase, which is involved in the transport of various metal ions in plants (18), may facilitate acquisition of micronutrients by whiteflies. Finally, pectinesterase is a plant cell wall degrading enzyme (PCWDE) that is also found in plant and fungal pathogens causing maceration and soft-rotting of plant tissues. In fact, horizontal acquisition of PCWDE has already been documented in insects, but the source of the gene was bacterial (19).
To conclude, our study reveals that in addition to bacterial genes, which repeatedly entered arthropod genomes and fueled the evolution of herbivory (20), numerous plant genes have been acquired through HGT by B. tabaci, likely contributing to the highly polyphagous nature of this species. The significant representation of predicted functions potentially involved in parasitism suggests that these genes were selected from a larger set of transferred genes. Using the same approach on Drosophila melanogaster, we found no gene of potential plant origin, showing that plant-to-insect HGT is not ubiquitous and suggesting that it could be facilitated by specific vectors in association with B. tabaci. It is noteworthy that viruses have been proposed to act as vectors of horizontal transfer in eukaryotes (21, 22) and that B. tabaci is known to act as a vector of dozens of plant viruses, some of which are able to replicate in insect cells (23). Our results call for a large-scale evaluation of plant-to-insect HGT and for a detailed functional characterization of B. tabaci plant-like genes, which may further contribute to controlling this pest.
Data availability
Supplementary Dataset (on figshare repository): this folder contains a table listing all transcripts covering at least 75% of the B. tabaci MEAM1 plant-derived genes with at least 95% identity, a table providing features of all B. tabaci plant-derived genes included in this study and references supporting involvement in plant-pathogen interactions, Supplementary Material and Methods (direct link), a fasta file combining the plant-derived MEAM1 representative sequences, an archive comprising the initial (.aln files) and trimmed (.trim files) protein alignments in fasta format as well as the phylogenetic trees in Newick format (.annot files) for each cluster of plant-derived genes, and the tree images combined in a single file (.pdf). The phylogenetic trees can be viewed interactively at https://itol.embl.de/shared/fmaumus.
Author contributions
CG and FM contributed equally to perform analyses, interpret the results and write the manuscript. FM conceived the study.
Footnotes
Emails: clement.gilbert{at}egce.cnrs-gif.fr and florian.maumus{at}inrae.fr
https://figshare.com/projects/Gilbert_and_Maumus_-_Supplementary_Dataset/128792