Genome-wide Search for Gene Mutations likely Conferring Insecticide Resistance in the Common Bed Bug, Cimex lectularius

Insecticide resistance in the bedbug Cimex lectularius is poorly understood due to the lack of genome sequences for resistant strains. In Japan, we identified a resistant strain of C. lectularius that exhibits a higher pyrethroid resistance ratio compared to many previously discovered strains. We sequenced genomes of the pyrethroid resistant and susceptible strains using long-read sequencing, resulting in the construction of highly contiguous genomes (N50 of resistant strain: 2.1Mb and N50 of susceptible strain: 1.5 Mb). Gene prediction was performed by BRAKER3 and Functional annotation was performed by Fanflow4insects workflow. Next, we compared their amino acid sequences to identify gene mutations, identifying 729 mutated transcripts that were specific to the resistant strain. Among them, those defined previously as resistance genes were included. Additionally, enrichment analysis implicated DNA damage response, cell cycle regulation, insulin metabolism, and lysosomes in the development of pyrethroid resistance. Genome editing of these genes can provide insights into the evolution and mechanisms of insecticide resistance. This study expanded the target genes to monitor allele distribution and frequency changes, which will likely contribute to the assessment of resistance levels. These findings highlight the potential of genome-wide approaches to understand insecticide resistance in bed bugs. Simple Summary Bedbugs have expanded globally over the past two decades, causing several health risks. Mutations in their genes allow bed bugs to develop insecticide resistance. However, the extent to which gene mutations exist in the bug genome remains largely unknown because the genomes of resistant strains have not been determined. We accurately sequenced the genomes of both susceptible and resistant strains and compared the gene sequences between the two strains. Several genes with resistance-specific mutations have been identified. These mutations can alter gene function and lead to insecticide resistance.


Introduction
Over the past 20 years, bed bugs (Cimex lectularius and Cimex hemipterus) have spread rapidly globally [1,2].Bedbugs are obligate hematophagous insects [3,4].Bedbug bites can cause pruritic rashes [5], folliculitis, and cellulitis due to secondary infections [6].Although approximately 40 pathogens have been detected in bedbugs from natural populations, no evidence exists that they have been transmitted to humans [7].Histamine, an aggregated pheromone released by bedbugs, is deposited in house dust and may pose health risks [8,9].
Insecticide resistance is believed to be a factor contributing to the recent resurgence of bed bugs [10,11].Infestation was rare in the 1960s because of the use of dichlorodiphenyltrichloroethane (DDT) or other synthetic organic insecticides, such as pyrethroids [2,10].The physiological mechanisms conferring insecticide resistance can be divided into several types: metabolic, penetration, and knockdown resistance [11].The major groups of enzymes associated with metabolic resistance include cytochrome P450s, glutathione Stransferase (GST), and ATP-binding cassette (ABC) transporters [12].Cuticle thickening has been implicated in penetration resistance [13], and RNA-seq analysis confirmed that several genes encoding P450s, cuticle proteins (CPRs), and ABC transporters were upregulated in pyrethroidresistant strains [14][15][16].Target site insensitivity causes knockdown resistance.To date, amino acid mutations in voltage-gated sodium channels have been observed in the common bed bug C. lectularius with pyrethroid resistance [15,17].Amino acid substitutions from "L to I" or "V to L" in voltage-gated sodium channel have been observed in C. lectularius as well as in other insects [18,19].Recently, resistant strains of C. lectularius for organophosphates and carbamates have been found in Japan, and their acetylcholinesterase (AChE, target genes of these two insecticides) possess mutations [20].However, the genomic information involved in the development of insecticide resistance relies heavily on information from other insects.
Genome sequencing of the common bed bug, C. lectularius has revealed gene sets related to its ecology [21,22].Previous studies have used short-read sequencing to identify SNPs associated with pyrethroid resistance [23,24].These studies have identified multiple genomic regions that are likely involved in pyrethroid resistance.However, no study has accurately determined the whole-genome sequence of the resistant strain, and the full extent of mutations in resistant strains remains unknown.In Japan, inquiries into bedbugs have increased since 2000 [25].In this study, we constructed highly contiguous genomes of susceptible and resistant strains isolated from Japan using long-read sequencing.Next, gene prediction and functional annotation of the assembled genome were performed.We identified amino acid mutations that are potentially related to pyrethroid resistance.This study expanded the target genes to monitor allele distribution and frequency changes, which will likely contribute to the assessment of resistance levels.These findings could highlight the potential of genome-wide approaches to understand insecticide resistance in bed bugs.

Insects
Susceptible strains (Teikyo-university strain) of C. lectularius were collected from fields at the Department of Medical Zoology, Research Institute of Endemics, Nagasaki University (Isahaya City, Nagasaki, Japan) by Prof. N. Omori on May 22, 1956 [26].This strain was maintained at Nagasaki University and Teikyo University and bred and maintained at the Japan Environmental Sanitation Center (Kawasaki City, Kanagawa, Japan) from approximately 1972.This strain was considered insecticide susceptible and was kindly provided to Fumakilla Limited (Chiyoda-ku, Tokyo, Japan) in 2011.The resistant strain (Hiroshima strain) was collected from a hotel in Hiroshima City (Hiroshima, Japan) in July 2010 and maintained by Fumakilla Limited.Bed bugs were reared by feeding mouse blood.All samples used in this study were collected in Japan.Therefore, the international ABS (Access and Benefit-Sharing) framework under the Convention on Biological Diversity and the Nagoya Protocol is not applicable.
Simplified bioassays were conducted to evaluate the susceptibility of adult C. lectularius to the representative insecticides.Bioassays were performed on adults 7-10 days after blood feeding.Adult C. lectularius (male: females = 1:1) were held on a piece of tape applied to a white paper.Following the topical application method described in WHO [27], an acetone solution (0.4125 µL) of the test insecticide was topically applied to the dorsal mesothorax of each test C. lectularius using a hand microapplicator (Burkard Manufacturing Co., Ltd., Rickmansworth, UK).The control group was administered only acetone.Six doses were tested for each strain, and each replicate contained at least 20 insects.The treated insects were kept in a plastic cup, the bottom of which was lined with filter paper.Mortality was observed 24 h after treatment, and the halflethal dose (LD50) was calculated using the Bliss' probit method [28].A moribund insect that was unable to walk was considered dead.The resistance ratio (RR) was calculated by dividing the LD50 of the resistant strain by that of the susceptible strain.Fieller's equation was used for calculating the 95% confidence intervals [29].

DNA extraction and genome sequencing
The genomic DNA of C. lectularius was extracted from the whole body of an adult male using a MagAttract HMW DNA Kit (Qiagen Co. Ltd., Valencia, CA, USA).A sequence library was prepared using the SMRTbell Prep Kit 3.0 (PacBio, Menlo Park, CA, USA).The libraries were sequenced using Sequel IIe (PacBio).

RNA sequencing
Public RNA sequencing data were searched from NCBI using the keywords Cimex lectularius.
Total RNA was extracted from one adult and used as a controls in 2.2 Insecticide Efficacy Tests using an RNeasy Mini Kit (Qiagen, Hilden, Germany).The extraction was performed in triplicate for both the susceptible and resistant strains.Libraries were prepared using TruSeq stranded mRNA (Illumina, San Diego, CA, USA) and sequenced using NovaSeqX (Illumina), generating 150 bp pair-end reads.Accession numbers are listed in Table S2.
Gene models of susceptible strains were transferred into the genome assembly of resistant strains using Liftoff [42].The transcripts with different CDS between susceptible and resistant strains were identified by searching for 'matches_ref_protein: False' in the GFF files generated by Liftoff.Amino acid sequences of corresponding transcript IDs between susceptible and resistant strains were aligned using MAFFT v 7.525 with '--auto' option [43].The protein sequence of Clec2.1 were also included in the MAFFT alignment.The corresponding amino acid sequences of Clec2.1 were identified from the results of Fanflow4Insects, as mentioned above.Alignment files were imported using the AlignIO module in Biopython 1.83, and only transcripts with resistant-specific mutations were selected.On the list, if transcripts were missing start and/or stop codons after liftoff, the correctness of the missing codons was manually verified by identifying contig fragmentation or liftoff mistakes.Differences in amino acid sites between the susceptible and resistant strains were visualized on a Tree [44].To characterize the list of transcripts with resistance-specific mutations, enrichment analysis was performed using Metascape (https://metascape.org/,accessed on June 24, 2024), with default settings [45].

Lethal effects of pyrethroid for the susceptible and resistant strains
We evaluated the resistance status of the Hiroshima strain.(Table 1).LD50 of susceptible strain (Teikyo university strain) showed 0.002 µg of permethrin per individual, whereas LD50 of Hiroshima strain showed 40.1 µg of permethrin per individual.The resistance ratio (RR) was 19,859, which was higher than that of many previously discovered strains [46].Hereafter, we refer to the Teikyo University strain as the susceptible strain, and the Hiroshima strain as the resistant strain.RR : LD50(Hiroshima strain)/LD50(Teikyo-u strain)

De novo genome assembly
We generated primary contigs with genome sizes of 644,683,587 bp and 614,893,704 bp for the susceptible and resistant strains, respectively (Table 2).Indicators for genome contiguity were of higher quality than those of the current reference genome Clec2.1 (Table 2).The contig N50 lengths for the susceptible and resistant strains were 1,530,193 bp and 2,132,290 bp.These assembly metrics were of higher quality than those of the current reference genome Clec2.1 (Table 2).The number of contigs in both strains was lower than that in Clec2.1 reference genome.The BUSCO completeness in both strains was comparable to that of the Clec2.1 reference genome.Kmer-based evaluation demonstrated high base accuracy and completeness of primary contigs in susceptible and resistant strains The susceptible strain showed a quality value (QV) of 57.0 and completeness of 97.8%, while resistant strain had QV of 56.9 and completeness of 94.9%.The Kmer spectrum analysis revealed two peaks in both strains, indicating the construction of a diploid genome assembly (Figure 1a and b).No counts were observed that could be considered sequence errors.We examined the genomic differences between susceptible and resistant strains using dotplot visualization (Figure 1c).Linear plots confirmed the high similarity between the susceptible and resistant strains.

Gene prediction and functional annotation
Because repetitive sequences may affect gene prediction accuracy [47], these sequences in the genome assembly were searched using RepeatModeler and RepeatMasker.In both strains, approximately 50% of the genomic regions contained repeat sequences (Table S3, shown in bases masked).These values are slightly higher than those of the current reference genome, Clec 2.1.BRAKER3 predicted 13116 and 13104 protein-coding genes in the susceptible and resistant strains, respectively (Table 3).Gene annotations and protein sequences are available at figsShare (https://doi.org/10.6084/m9.figshare.26362135).The BUSCO completeness values for both strains were comparable to Clec2.1.Gene annotations of the susceptible strains were transferred to the resistant strain using Liftoff [42], identifying the corresponding genes between both strains (12724 protein-coding genes and 20151 transcripts) (Table 3).The BUSCO completeness of the resistant strains after liftoff did not decline substantially, indicating that the transfer of gene predictions was successful (Table 3).Functional annotation was added to the protein-coding genes of the susceptible strains using Fanflow4Insects, which included GGSEARCH and HMMSCAN.The number of gene hits for each reference protein is shown in Table 4.Of 13116 genes in susceptible strains, 99% protein-coding genes (13010/13116) were successfully annotated by GGSEARCH and HMMSCAN.The genes that could not be annotated were classified as hypothetical proteins (Table 4).The complete results of Fanflow4Insects are presented in Table S4.

The search for mutated genes related to insecticide resistance
As previous report suggested, point mutations are related to the acquisition of pyrethroid resistance in bed bugs [18].These genes can be identified on a genome-wide scale by comparing the protein sequences of susceptible and resistant strains.Liftoff revealed 3938  Total transcripts with amino acid mismatches between susceptible and resistant strains.These amino acid sequences were aligned with those of Clec2.1, using MAFFT, and the alignments among the three strains were compared.Clec2.1 genome comes from a susceptible strain [21].The MAFFT alignment revealed 729 transcripts with resistance-specific mutations (Table S5).We investigated the number of mutation sites in the amino acid sequence of each transcript (Figure 2A, Table S5).
The majority of the transcripts (496 of 729) had one mutation in their amino acid sequences.laminin subunit beta 1 had the largest number of mutated sites (54).To verify the reliability of the gene list, we searched for known or candidate genes with resistance-specific mutations, including sodium voltage-gated channels, AChE, Gamma-aminobutyric acid (GABA) receptors, nicotinic acetylcholine receptors (nAChR), P450s, ABC transporters, GSTs, and CPRs) [11].Transcripts with resistance-specific mutations were detected in all these genes, except for the GABA receptor (Figure 2B-G).No studies have identified mutations in C. lectularius nAChR, a target gene of neonicotinoid insecticides.Although we determined the resistance ratio of the neonicotinoid insecticide (dinotefuran) between the susceptible and resistant strains (Table S6), the Hiroshima strain did not show neonicotinoid resistance (resistance ratio: 1.11).
Enrichment analysis was performed on 729 transcripts to characterize the genes (Figure 3, Table S7, and Table S8).Enrichment analysis using Metascape was conducted using Homo sapiens gene ID or D. melanogaster gene ID, as described in Table S5.Using Homo sapiens gene ID, we identified several GO terms potentially related to the development of insecticide resistance, as described below.DNA metabolic process, mitotic cell cycle process, insulin metabolic process and Lysosome were enriched (-log10 (p-value) > 6) (Figure 3A).Gene lists associated with the GO terms are shown in Table S7.Forty-two genes were included in the "DNA metabolic process" category.Metascape clustered several similar ontologies, and the summarized results are shown in the GroupID column in Table S7.The "DNA metabolic process" category included multiple terms related to DNA damage and repair, such as "DNA replication," "DNA repair," "DNA damage response," "double-strand break repair," and "recombinational repair" (Table S7).Fiftytwo genes were clustered under GO term "mitotic cell cycle process" (Table S7).Among these genes, CCNB2 (cyclin B) and ORC1 are activated during M and G1 phases of the cell cycle, respectively [48,49].Nine genes clustered under the GO term "insulin metabolic process" (Table S7).Among these genes, CPE (Carboxypeptidase E) is involved in the proteolytic processing of proinsulin peptides in humans [50].Thirteen genes were clustered under the GO term "Lysosome" (Table S7).Intracellular proteases respond to insecticide stress in resistant strains to maintain cellular health [51].As expected, the cathepsin genes (CTSB and CTSD) were included among the 13 genes.Additionally, components of lysosomes, such as SLC11A2, were included.SLC11A2 is a divalent metal transporter that is a component of lysosomes [52].
When using D. melanogaster gene ID, "Lysosome" was enriched, similar to the results obtained using human gene IDs (Figure 3B, Table S8).GO terms included cathepsin.Tetraspanins (Tsp29Fa and Tsp42Ee) were also identified.The enriched GO term "glucose transmembrane transport" was related to the "insulin metabolic process" enriched when using human gene ID, as these were associated with nutrient response [53].

Discussion
We successfully constructed a highly contiguous genome assembly of susceptible and resistant strains of C. lectularius and identified mutations that may be related to the development of insecticide resistance.The present study confirmed the mutations conferring pyrethroid resistance in the amino acids of the sodium voltage-gated channel, as expected from previous reports in C. lectularius.We confirmed mutations in other target genes that confer neonicotinoid or OPs resistance (AChE and nAChR).Although mutations in AChE have been found in Cimex species [20], no reports have described mutations in the nAChR.However, we were unable to confirm neonicotinoid resistance in this study.Mutations found in nAChR may not be associated with neonicotinoid reception.Genes related to metabolic resistance (CYP450 and ABC transporters) or penetration resistance (CPRs) were highly expressed in resistant strains.To the best of our knowledge, this is the first study to report mutations in these genes.
A total of 729 transcripts with resistance-specific mutations were identified.We found "DNA metabolic process," "Lysosome," "Insulin metabolic process" and "glucose transmembrane transport" and "Mitotic cell cycle process" as enriched GO terms in the 729 transcripts.These biological processes may be related to the physiological adaptation of the resistant strains, as discussed below.

DNA metabolic process
Enrichment analysis revealed that resistance-specific mutations were present in many genes associated with DNA damage, repair, and replication.Bifenthrin, a pyrethroid insecticide, cause the DNA damage in Sf9 cells [54]．In Anopheles coluzzii, the expression of DNA repairrelated genes increased after exposure to pyrethroids.[55]．Mutations identified in this study may promote DNA damage repair in resistant strains, leading to adaptations to insecticide exposure.

Lysosome
In the house fly Musca domestica and red flour beetle Tribolium castaneum, lysosomal protease activity generally increases in response to exposure to pyrethroids and DDT [51].Protease functioning in the lysosome, cathepsin B, was present in the enrichment analysis.Mutations were also observed in the lysosomal components of tetraspanin (Tsp29Fa and Tsp42Ee).In D. melanogaster, Tsp42Ee is primarily located in retina [56], and its mutation leads to lightinduced retinal degeneration.Whether the tetraspanin mutation found in this study is related to insecticide resistance through the visual system may provide a target for understanding the ecology and evolution of resistant strains.

Insulin metabolic process and glucose transmembrane transport
Enrichment analysis Insulin signaling is conserved between mammals and insects, and regulates circulating sugar levels [53].Therefore, we discuss insulin metabolic processes and glucose transmembrane transport in the same section.
Mutations were present in glucose and trehalose transporters (Tret1-1, pippin, and nebu) (Table S8).Tret1-1 expression is upregulated in response to starvation in D. melanogaster and T. castaneum [57][58][59], leading to the efficient uptake of trehalose and glucose.Pippins regulate the selective transport of carbohydrates into the brain [60].Mutations in these transporters may contribute to resistance to starvation.
When the human gene ID was used for enrichment analysis, the insulin metabolic process was enriched.Components of insulin signaling are regulated in response to starvation in D. melanogaster [53].Therefore, "insulin metabolic process" and "glucose transmembrane transport" may be associated with each other.Additionally, in D. melanogaster, 114 mutations in insulin-signaling genes have been found in DDT-resistant strains [61].Modifying insulin signaling may be a common feature in conferring insecticide resistance in insects.

Mitotic Cell Cycle Process
DNA damage is monitored during cell cycle progression, and the cell cycle is arrested until the DNA damage is repaired [62].Mutations were observed in 52 genes associated with the mitotic cell cycle.Among these, ORC1 has been extensively studied with respect to cell cycle regulation [63].ORC1 plays a central role in the assembly of the pre-replication complex to initiate DNA replication and its activation is regulated during the cell cycle [64].Mutations observed in insecticide-resistant strains may allow strict regulation of the cell cycle, preventing daughter cells from inheriting damaged DNA in response to insecticide exposure.

The relationships among GO terms enriched in this study
The GO terms identified in this study (DNA metabolic process, insulin metabolic process, glucose transmembrane transport, lysosomes, and mitotic cell cycle process) may be related to cell cycle regulation.DNA metabolic processes play crucial roles in cell-cycle progression and arrest of cell cycle [62].Cell cycle progression depends on sugars such as glucose [65].Insulin signaling regulates the cell cycle (i.e., cell growth and cell cycle progression) by activating mTOR signaling [66].Furthermore, lysosomes function as nutrient sensors in response to amino acid inputs [67], suggesting an association between lysosomes and the response to nutrients in C. lectularus.Investigating the phenotypic functions of the mutated genes will be a challenge in the future.Genome editing is an effective method for elucidating molecular mechanisms underlying insecticide resistance.

Limitations of this study
Mutation identified in this study may be rare depending on the populations.In bed bugs, population genetics to detect alleles with mutation have been carried out on voltage-gated sodium channel α-subunit gene [17].The functional contributions of the mutations to pyrethroid resistance levels could be better elucidated by comparing genes from a larger number of common bedbug strains.Furthermore, strict evaluation for pyrethroid resistance levels may need population genetic approach to clarify the allele frequencies.

Conclusions
Through genome-wide search for gene mutations in resistant strains of C. lectularius, we identified 729 mutant transcripts that have scarcely been studied in common bedbugs by comparing the sequences of protein-coding genes.These genes may enable the development of insecticide resistance in the resistant strains.Population genetics approach may clarify the extent of functional contributions of these mutations.In bed bugs, population genetics to detect alleles with mutation have been carried out on voltage-gated sodium channel α-subunit gene [17].This study contributes to the expansion of target genes to detect the distribution and changes in allele frequencies within populations.Distribution and changes in allele frequency may also be helpful for assessing the degree of insecticide resistance.In addition, genome editing of these genes may be an effective method for understanding the evolution and physiological mechanisms involved in the acquisition of insecticide resistance.

Supplementary Materials:
The following supporting information can be downloaded from figshare: https://doi.org/10.6084/m9.figshare.26362135,Table S1: Public RNA sequencing data used in this study; Table S2: Metadata of RNA sequencing data in susceptible and resistant strains; Table S3: The counts and ratio of repeats annotated by RepeatModeler and RepeatMasker; Table S4: The full result of Fanflow4Insects; Table S5:729 transcripts with resistance-specific mutations; Table S6: The LD50 value and resistance ratio of each strain to Dinotefuran; Table S7: Detail results of Metascape when using Homo sapiens gene ID; Table S8: Detail results of Metascape when using Drosophila melanogaster gene ID.Gene annotation and protein sequences are available in FigShare (https://doi.org/10.6084/m9.figshare.26362135).

Author Contributions:
Conceptualization, K. T. and H. B.; methodology, K. T. and H. B.; software, K. T.; validation, K. T. and H. B.; formal analysis, K. T.; investigation, K. T.; resources, H. B.; data curation, K. T.; writing-original draft preparation, K. T. and F. K.; writing-review and editing, K. T. and H. B.; visualization, K. T. and F. K.; supervision, H. B.; project administration, H. B.; funding acquisition, H. B. All authors have read and agreed to the published version of the manuscript."

Table 1 .
The LD50 value and resistance ratio of each strain to two insecticides.

Table 2 .
Comparison of assembly statistics with the current reference genome.

Table 3 .
Statistics of protein-coding gene prediction.

Table 4 .
Summary of functional annotation results.