Multi-omics analysis identifies symbionts and pathogens of blacklegged ticks (Ixodes scapularis) from a Lyme disease hotspot in southeastern Ontario, Canada

Ticks in the family Ixodidae are important vectors of zoonoses including Lyme disease (LD), which is caused by spirochete bacteria from the Borreliella (Borrelia) burgdorferi sensu lato (Bbsl) complex. The blacklegged tick (Ixodes scapularis) continues to expand across Canada, creating hotspots of elevated LD risk at the leading edge of its expansion range. Current efforts to understand the risk of pathogen transmission associated with I. scapularis in Canada focus primarily on targeted screens, while variation in the tick microbiome remains poorly understood. Using multi-omics consisting of 16S metabarcoding and ribosome-depleted, whole-shotgun RNA transcriptome sequencing, we examined the microbial communities associated with adult I. scapularis (N = 32), sampled from four tissue types (whole tick, salivary glands, midgut, and viscera) and three geographical locations within a LD hotspot near Kingston, Ontario, Canada. The communities consisted of both endosymbiotic and known or potentially pathogenic microbes, including RNA viruses, bacteria, and a Babesia sp. intracellular parasite. We show that β-diversity is significantly higher between individual tick salivary gland and midgut bacterial communities, compared to whole ticks; while linear discriminant analysis (LDA) effect size (LEfSe) determined that the three potentially pathogenic bacteria detected by V4 16S rDNA sequencing also differed among dissected tissues only, including a Borrelia from the Bbsl complex, Borrelia miyamotoi, and Anaplasma phagocytophilum. Importantly, we find co-infection of I. scapularis by multiple microbes, in contrast to diagnostic protocols for LD, which typically focus on infection from a single pathogen of interest (B. burgdorferi sensu stricto). IMPORTANCE A vector of human health concern, blacklegged ticks, Ixodes scapularis, transmit pathogens that cause tick-borne diseases (TBDs), including Lyme disease (LD). Several hotspots of elevated LD risk have emerged across Canada as I. scapularis expands its range. Focusing on a hotspot in southeastern Ontario, we used high-throughput sequencing on whole ticks and dissected salivary glands and midguts. Compared to whole ticks, analysis of salivary glands and midguts revealed greater β-diversity among microbiomes that are less dominated by Rickettsia endosymbiont bacteria and enriched for pathogenic bacteria including a Bbsl- associated Borrelia, Borrelia miyamotoi, and Anaplasma phagocytophilum. We also find evidence of co-infection of I. scapularis in this region by multiple microbes. Overall, our study highlights the challenges and opportunities associated with the surveillance of the microbiome of I. scapularis for pathogen detection using metabarcoding and metatranscriptome approaches.

(Bbsl) complex. The blacklegged tick (Ixodes scapularis) continues to expand across Canada,23 creating hotspots of elevated LD risk at the leading edge of its expansion range. Current efforts 24 to understand the risk of pathogen transmission associated with I. scapularis in Canada focus 25 primarily on targeted screens, while variation in the tick microbiome remains poorly understood. 26 Using multi-omics consisting of 16S metabarcoding and ribosome-depleted, whole-shotgun 27 RNA transcriptome sequencing, we examined the microbial communities associated with adult 28 I. scapularis (N = 32), sampled from four tissue types (whole tick, salivary glands, midgut, and 29 viscera) and three geographical locations within a LD hotspot near Kingston, Ontario,Canada. 30 The communities consisted of both endosymbiotic and known or potentially pathogenic 31 microbes, including RNA viruses, bacteria, and a Babesia sp. intracellular parasite. We show that 32 β-diversity is significantly higher between individual tick salivary gland and midgut bacterial 33 communities, compared to whole ticks; while linear discriminant analysis (LDA) effect size 34 (LEfSe) determined that the three potentially pathogenic bacteria detected by V4 16S rDNA 35 sequencing also differed among dissected tissues only, including a Borrelia from the Bbsl 36 complex, Borrelia miyamotoi, and Anaplasma phagocytophilum. Importantly, we find co-37 infection of I. scapularis by multiple microbes, in contrast to diagnostic protocols for LD, which 38 typically focus on infection from a single pathogen of interest (B. burgdorferi sensu stricto). 39

Introduction 52
One-fifth of emerging human infectious diseases are transmitted by arthropod vectors, 53 forty percent of which are transmitted by Ixodidae ticks, representing an increasingly large threat 54 to human health and welfare (1, 2). Ticks in the genus Ixodes are important vectors of human 55 disease, including LD, tick-borne relapsing fever, anaplasmosis, babesiosis, ehrlichiosis, tick-56 borne encephalitis, and Powassan virus (3-6). In Eastern North America, I. scapularis is the 57 primary vector of LD, contributing to the more than 300,000 estimated cases in the United States 58 of America (USA) each year (7,8). In addition to B. burgdorferi sensu stricto (s.s.), I. scapularis 59 is also a vector for the broader Bbsl complex of spirochete pathogens, including B. mayonii, B. 60 kurtenbachii, B. bissettiae, and B. andersonii (9)(10)(11)(12)(13)(14). ticks, this technique is not often used for tick monitoring in Canada. 68 Experimental dysbiosis of the tick microbiome has revealed complex interactions among 69 the bacterial community that can influence pathogen colonization in the midgut (17-20). It has 70 also been shown that I. scapularis produces antimicrobial factors that can modulate microbe 71 colonization (20-22). Given the complex interactions that occur in the midgut and the salivary 72 glands of I. scapularis, further screening of these tissues from field-collected ticks using HTS 2022-11-09 Microbiome of Ixodes scapularis from Ontario, Canada 6/52 microbiomes in a LD hotspot in Canada, using both targeted sequencing of the V4 region of the 120 16S rDNA gene, and ribosomal RNA-depleted, whole shotgun RNA metatranscriptome 121 sequencing. We examined tissue-specific differences between the salivary gland, midgut, and 122 other internal viscera, as well as whole-tick extractions from three distinct sites representing 123 different land-use histories. Our study demonstrates how HTS methods can help quantify 124 changing health risks associated with range expansion by an arthropod vector of increasing 125 public health concern. 126 Results 127 Bacterial Community inferred from V4 by 16S rDNA sequencing 128 The core bacterial community associated with I. scapularis was represented by 25 129 dominant amplicon sequencing variants (ASVs), each with a minimum total relative abundance 130 of greater than 0.1 % (Figure 1; Supplemental Table S1). Within the core, only 131 A. phagocytophilum (ASV8) and B. miyamotoi (ASV2) were identified at the species level by the 132 DADA2 pipeline. We identified three primary ASVs of Concern (AoCs) from the core 133 microbiome, which included Borrelia sp. (ASV3), B. miyamotoi, and A. phagocytophilum. 134 Among these AoCs, the potentially LD-causing spirochete, Borrelia sp. was the most widely 135 detected across all tissue types and sampling sites in this study, ranging from trace levels up to 136 98 % relative abundance. We also detected the relapsing fever spirochete, B. miyamotoi, at 137 > 99 % relative abundance from the salivary gland and midgut of a single tick collected at the 138 Lemoine Point (LP) site. Aside from the three AoCs detected in the core microbiome, another major feature of the 152 I. scapularis bacterial community was the widespread detection of Rickettsia sp. (ASV1), which 153 was common across all three sites from whole-tick and dissected tissue samples. A total of six 154 ASVs were identified as Rickettsia sp. (Supplemental Table S2  We consider the detection of ASVs from the sequencing libraries to be reliable. The 163 negative control PCR and bead libraries contained markedly fewer sequences following our data 164 processing but removed one salivary gland-based sample (42_sg) (Supplemental Figure S1), 165 noting that an initial inspection of ASVs detected among the three separate batch PCR replicate 166 libraries determined the presence of several additional Rickettsia-associated ASVs almost 167 exclusively from a single PCR batch, including the negative control PCR (Supplemental 168   Table S3 and Supplemental Figure S2). Since these were unexpected results, the entire batch 169 replicate was removed from further analysis, while the two more consistent batch replicates were 170 retained (noting seven of the dissected samples only had a single PCR batch replicate). 171 From these sequences, 305 potentially contaminating ASVs, including taxa typically 172 known from environmental samples (e.g., Sediminibacterium, Pseudomonas, Paraburkholderia,

9/52
Cutibacterium, Chryseobacterium, Corynebacterium) were identified using decontam and 175 removed from the dataset, (Supplemental Table S4). Based on the asymptotes of rarefaction 176 curves, bacterial communities associated with whole ticks generally contained more ASVs than 177 those detected from the various dissected tissues (Supplemental Figure S3). Overall, each library 178 contained 1 to 139 ASVs, with an average of 54. 179 Sequences attributed to ASV7 were detected in 2 out of 28 of the tick samples, both from 180 salivary gland samples from the Queen's University Biological Station (QB) site and could not 181 be assigned beyond Kingdom Bacteria in our pipeline (Supplemental Table S5). However, a 182 placed ASV7 within the Clade X Babesia "sensu stricto" (i.e., classical or true Babesia) 186 (Supplemental Figure S4). The Babesia-associated ASV7 was not included in further analysis. 187 Co-infection of ticks with multiple AoCs were identified in the salivary gland of a tick 188 from the QB site (260_sg). This sample was coinfected with Borrelia sp. and A. 189 phagocytophilum, representing 22 % and 64 % relative abundance, respectively ( Figure 1). 190 In contrast, the midgut of the same individual (260_g) contained 9 % relative abundance of 191 Borrelia sp. and trace levels (< 1 % relative abundance) of A. phagocytophilum. Moreover, the 192 Babesia sp. classified as ASV7 (see above) was detected among other bacterial AoCs, either  We used LEfSe to explore variation in ASV abundance among tissue types and sample 206 sites. The abundance of two AoCs of concern, Borrelia sp. and A. phagocytophilum 207 discriminated between dissected tick tissues and wholes-tick samples, based on LDA scores 208 (Supplemental Figure S7A). The third AoC, B. miyamotoi, discriminated between tissue samples 209 from the LP site only (Supplemental Figure S7B). While only the abundance of the three AoCs 210 discriminated among dissected tick tissues, the LEfSe identified several core ASVs that 211 discriminate among sites for the whole-tick samples (Supplemental Figure S7B). Specifically, 212 Methylobacterium sp., Mycobacterium sp., Curtobacterium sp., Massilia sp., Hymenobacter sp., 213 and Order Rhizobiales were more common at the QB site; Pseudomonas sp., and Shinella were 214 more common at the MP site; and Sphingomonas sp., Frondihabitans sp., and Clavibacter sp. 215 were more common at the LP site.  Shapes indicate tissue type -whole (w) or dissected (d).

12/52
In contrast to weighted UniFrac, the first two axes of the unweighted UniFrac distances 231 explain only 34.1 % of the variation in the data ( Figure 3A). Generally, the unweighted UniFrac 232 ordination separated the bacterial communities by tissue type, with whole-tick samples clustering 233 away from dissected samples, with the exception of a few samples (71_g, 71_sg, Q5_w, Q13_W, 234 and 47_w) found in ellipse peripheries where the ellipses between whole and dissected types 235 overlapped. Overall, we detected 409 ASVs from the ticks and tissues that were sampled, with 236 112 ASVs unique to whole ticks, 152 unique to dissected tissues, and 145 shared between them 237 ( Figure 3B). Among these uniquely identified ASVs, strains of bacteria in the Phylum 238 Verrucomicrobia were more common in whole-body samples from the LP and MP sites, 239 compared to strains in the Phylum Firmicutes, which were more common among dissected 240 tissues collected across all three sampling locations (Supplemental Figure S8).  Figure S9). Furthermore, the average distance among the whole tick libraries was 255 also significantly lower (P < 2.22e -16 ) than the average distance of 0.706 (SD = 0.104) measured 256 from libraries between the groups. Using a permutation test for homogeneity of multivariate 257 dispersions, we also identified a significant difference in variance between whole and dissected 258 sample types. The unweighted UniFrac PCoA was repeated using only the top 500 most 259 relatively abundant ASVs, which was consistent with the previous findings that used ASVs 260 agglomerated at the lowest taxonomic level (Supplemental Figure S10A) but the PCoA inverted 261 Axis 2 and showed slightly more distinction between dissected and whole tick samples 262 (Supplemental Figure S10B). Following the removal of transcripts associated with unwanted phyla (see methods), we 267 further analyzed the remaining 402 transcripts of which, endosymbiont R. buchneri comprised 268 over 75 % (Table 1). Single-stranded negative-sense RNA viruses were common in the I.   Using a multi-omics approach with HTS, we found evidence for endosymbiotic and 300 pathogenic microorganisms of viral, eukaryotic, and prokaryotic origin associated with adult 301 I. scapularis collected from a LD hotspot located in southeastern Ontario. Consistent with other 302 16S rDNA-based sequencing studies of I. scapularis, the most common and relatively abundant 303 bacteria found in our study belong to the genus Rickettsia. In addition, we detected three primary 304

2022-11-09
Microbiome of Ixodes scapularis from Ontario, Canada 18/52 non-Bbsl relapsing fever spirochete B. miyamotoi. Using whole-shotgun RNA sequencing we 306 found that I. scapularis was infected with the three primary AoCs in the preceding sentence, as 307 well as Babesia sp., and R. buchneri, though few pathogen-associated transcripts were captured 308 from four adult ticks that were sequenced. A diversity of RNA viruses, especially from the 309 family Bunyavirales, were identified in relatively high abundances in the metatranscriptome. Our 310 findings highlight the usefulness of combining both non-targeted meta-transcriptomics and tick 311 organ-specific 16S rDNA sequencing to assist in the detection of pathogens and polymicrobial 312 infections involving diverse pathogenic and symbiotic microbes of the I. scapularis microbiome. 313

Potential pathogens 314
In contrast to targeted surveillance, metabarcoding by HTS is a cost-effective method to 315 simultaneously detect diverse Bbsl and non-Bbsl Borrelia in ticks, which can inform risk 316 assessment of TBDs in Canada. Using HTS, our study revealed Borrelia sp. from the Bbsl 317 complex at three sampling sites, and at higher abundance in the dissected midgut and salivary 318 gland tissues compared to whole tick samples. We also detected the non-Bbsl relapsing-fever 319 spirochete, B. miyamotoi in relatively high abundance from the dissected tissues of a single tick 320 and in other sites at a lower frequency than Bbsl-associated ASV. Although B. burgdorferi s.s. Borrelia spp. from whole ticks, but the same sequences, when detected from some individual 333 tick midgut and salivary glands were found in much higher relative abundance compared to 334 whole ticks. Our analysis highlights the value of metabarcoding the V4 16S rDNA from tick 335 salivary glands and midguts to identify potential pathogens, and for discrimination between Bbsl 336 and non-Bbsl Borrelia, which has implications for modelling risk associated of TBDs in Canada. 337 We found another potentially important human pathogen, A. phagocytophilum, at low 338 frequency in this study, mainly associated with ticks collected from the QB site. Such findings 339 are consistent with other PCR-based screening that detected A. phagocytophilum infection 340 frequencies from 0.4 to 8.9 % in I. scapularis from Canada (62-64). Our work also demonstrates 341 the value of metabarcoding by HTS in a phylogenetically-informed analysis to resolve the strain-342 specific distribution of this potential pathogen. Specifically, we find strain Ap-var-1, which 343 differs from the known pathogen strain Ap-ha (57), and although the Ap-var-1 strain is not 344 known to be pathogenic, human health risk assessments may be better informed through greater 345 sampling effort across more sites in the region to capture potentially virulent strains that may be 346 present at lower frequencies within I. scapularis populations. 347 In addition to distinguishing between three bacterial AoCs, we find that V4 16S rDNA 348 sequencing also contained Babesia-associated apicoplast sequences from the dissected salivary 349 gland of two ticks collected from the QB site. Interestingly, the Babesia-associated sequence was 350 co-associated with either Borrelia sp., or Borrelia sp. and A. phagocytophilum. While this 2022-11-09 Microbiome of Ixodes scapularis from Ontario, Canada

20/52
Babesia sp. is not closely related to human pathogenic Babesia microti based on BLAST hits, its 352 pathogenic potential is unknown. It is reasonable to suspect Babesia odocoilei because it has 353 been detected in eastern Canada before (65-70), but we are not able to confirm this since the 354 apicoplast sequence is not available for B. odocoilei. A next step may include using targeted 355 PCR, metabarcoding with Eukaryote-specific primers, or whole metagenomics to further classify 356 this co-associated Babesia sp. detected from the QB site. of extracted DNA from tissue-specific sequencing is more prone to contamination (28). Second, 378 the β-diversity measures of low biomass samples are more prone to inflation due to uneven 379 sampling depth, whereas samples with relatively fewer sequences have greater uncertainty 380 associated with rare ASVs (73-75). To address these issues, we applied a two-step approach to 381 screen contaminating ASVs from the sequencing data, followed by a rarefaction analysis to 382 assess for sufficient capture of bacterial diversity (supplemental Figure S4). Even after these 383 corrective measures, we identified a greater number of unique ASVs associated with bacterial 384 communities from dissected tissue compared to whole-body samples. This does not appear to be 385 an artifact of small sample size or uncertainty associated with rare ASVs because α-diversity in 386 our lower biomass samples (i.e., dissected tissues) was not inflated relative to the whole-body 387 samples in our rarefaction analysis (supplemental Figure S4). The biological significance of 388 higher diversity in tick tissue is worthy of future study. 389

Other microbial associates of interest 390
Although several pathogens were identified in our study, the most abundant ASVs were 391 that are not known to be pathogenic to humans but could influence tick physiology and behavior. 410 In contrast to the metatranscriptome reported by Cross et al. (42), in which they rarely observed 411 BLTPV-1 and BTPV-2 co-occurring in the same I. scapularis, our study detected co-occurrence 412 of both BTPV-1 and BTPV-2 in three out of four of the ticks analyzed, suggesting that both 413 variants can co-exist within the same host. 414

Conclusions 415
Among the female I. scapularis that we surveyed using 16S rDNA sequencing, Bbsl and In addition to the dissected ticks, we extracted DNA from seventeen whole female ticks. 445 Following sterilization as above, these ticks were dried at room temperature, frozen to -80 °C We isolated RNA from one male and three female ticks using the same sterilizing and 458 pulverizing steps as the whole-tick DNA protocol above but with 500 µL TRIzol Reagent 459 (Invitrogen) following the manufacturer's protocol and resuspending purified RNA in 15 μL 460 lab-grade water. 461

Library preparation 462
The V4 region of the 16S rRNA gene was amplified from tick DNA using forward 515F 463

2022-11-09
Microbiome of Ixodes scapularis from Ontario, Canada 25/52 (5'-GACTACHVGGGTWTCTAAT-3') primers (82). The PCRs were completed in triplicate 465 using a two-step approach, with each replicate including a control PCR (template replaced with 466 PCR-grade water). The first-step PCRs were completed in 25 µL volumes using ~ 2 -3 ng of 467 DNA template, 2 U Platinum TM Taq polymerase high-fidelity (Invitrogen), 0.5 µg µL -1 bovine 468 serum albumin (NEB), 3 mM MgSO4, 200 µM dNTPs, and 0.4 µM each 515F and 806R in 1 x 469 High Fidelity PCR Buffer. Using a SimpliAmp TM thermal cycler (Applied Biosystem), the 470 reactions were incubated at 94 °C for 3 min, followed by 20 cycles of 94 °C for 45 s; 53 °C for 471 1 min; and 72 °C for 1 min 30s, and then a final elongation at 72 °C for 10 min. These first-step 472 products were purified using Sera-Mag Magnetic SpeedBeads according to manufacturer 473 directions, and 15 µL of this purified PCR product was used as the template for the second-step 474 PCR for 12 cycles as above in 50 µL volumes. The second-step reactions were cycled 12 times 475 using the above program with the 515F/806R phasing primers (83) to add the appropriate 476 Illumina sequencing adapters, barcodes, and heterogeneity spacer components. 477 Following the second-step PCR, the products were then purified with SPRI beads as 478 out of the three PCR batch replicates were found to contain 31 additional Rickettsia-associated 520 ASVs that were not identified from the other two replicates. This problematic replicate was 521 considered atypical compared to the other two PCR batch replicates and was subsequently 522 removed from the analysis. Inconsistencies between low abundance ASVs across PCR replicate 523 libraries were also removed by retaining only ASVs occurring in the remaining two replicate 524 libraries unless only one library was available (n = 7). Libraries that were generated from any 525 remaining replicate PCRs were combined in silico. 526 Any ASVs assigned at the Phylum level within Kingdom Bacteria were retained, except 527 any ASVs assigned as ''Cyanobacteria/Chloroplast' at the Phylum level, which were removed. 528 Only ASVs with counts greater than four were retained for further analysis. Any sample libraries 529 with 7,500 or fewer ASVs were removed. Rarefaction curves were generated using ranacapa 530 Phylogenetic reconstructions of potential pathogenic Borrelia-and A. phagocytophilum-534 associated ASVs identified from the tick core microbiome were generated using 16S rDNA 535 sequences available from NCBI 16S rRNA reference database (accession: PRJNA3175; 536 downloaded November 16, 2021) (96), and select sequences available through NCBI's GenBank 537 website (97). The Phangorn R package (v2.7.1) was used to generate optimized phylogenies for 538 these ASVs (98). For each phylogeny, a maximum-likelihood tree was generated using a pre-539 optimized distance-based method (i.e., rooted or unrooted) and the best substitution model was 540 identified with Phangorn. The branch confidence of each of the final phylogenetic 541 reconstructions was tested with 1,000 bootstraps. 542 To assist with the taxonomic identification of known pathogens, we obtained 27 sequences 543 of Borrelia and Borreliella 16S rDNA-associated sequences from the NCBI 16S rRNA reference 544 database (96), and an additional sequence from B. burgdorferi isolate 15−0797 (Accession: 545 MH781147.1) that did not contain any ambiguous. Potential Borrelia sequences from the above 546 pipeline were aligned to these reference sequences using a staggered alignment with DECIPHER 547 in R (v2.20.0) (99). Similarly, we used eleven Anaplasma and Ehrlichia sequences from 548 GenBank's 16S rRNA reference database and two additional V4 16S rDNA sequences from 549 strains of A. phagocytophilum that were previously detected in Ontario (57). Of these, the Ap-ha 550 (Accession: HG916766.1) strain is considered pathogenic, while Ap-var 1 (Accession: 551 HG916767.1) is not known to be pathogenic (57). 552 Next, ASVs associated with the entire bacterial community were agglomerated at the 553 lowest level of taxonomic identification. Linear discrimination analysis (LDA) Effect Size 554 (LEfSe) comparing 16S V4 rDNA libraries from 1) whole versus dissected tissue types across all 555 sites, and 2) whole or dissected tissue types from each of the three sampling sites discriminated 2022-11-09 Microbiome of Ixodes scapularis from Ontario, Canada 29/52 separately was undertaken using a LDA score (log10) cut-off of 2.5 with microbiomeMarker 557 (100). The V4 16S rDNA sequences from the entire bacterial community were next aligned with 558 a staggered alignment using DECIPHER (2.14.0) using the profile-to-profile method (99). Using 559 this alignment, a phylogenetic tree was constructed for the entire bacterial community with 560 Phangorn (v2.5.5) (98) using the neighbor-joining tree estimation function (101)  A permutation test of difference in multivariate homogeneity of group dispersions 572 (variances) between tissue and whole tick libraries based on the unweighted UniFrac PCoA (see 573 above) was implemented using vegan (v2.5-7). Non-parametric Wilcoxon rank sum tests (108) 574 were used to identify significant differences between mean unweighted UniFrac distances 575 calculated from within and across libraries representing whole ticks and dissected tissues. To 576 investigate how rare taxa influenced the results, the statistical analysis was repeated with the 500 577 most abundant ASVs (rather than the agglomerated dataset described above) following 2022-11-09 Microbiome of Ixodes scapularis from Ontario, Canada

Metatranscriptome analysis 580
Paired-end sequence data were trimmed for quality and adapter removal of Illumina-581 specific adapter sequences (Supplemental Table S6 filtered sequence data were carried into the de novo assembly. 591 The metatranscriptome was assembled de novo using Trinity (v2.8.4) in default mode (114) 592 and then clustered at 90 % sequence identity threshold using CD-HIT-EST (v4.6.8) (115). 593 Secondary clustering generated 6,868 assembled transcripts (N50 = 2,063 bp; L50 = 320 bp) but 594 left a high proportion (82.8 %) of unassembled sequences. Count tables were generated by 595 aligning the filtered sequence data against the reference transcripts using Salmon (v0.14.1) (116). 596 Successfully aligned sequences represented 29.1 % of all paired-end reads. Prior to 597 normalization, the raw count data were filtered for low abundance transcripts (i.e., fewer than 598 five aligned reads in at least one sample). The count data were then normalized with the upper 599 quartile scaling approach and transformed using the voom function implemented in edgeR (117) 600 and limma (118, 119), respectively. The log2 counts-per-million (CPM) expression values were 2022-11-09 Microbiome of Ixodes scapularis from Ontario, Canada

31/52
For taxonomic assignment to each of the reference transcripts, we used BLASTx (120) to 603 search against GenBank's non-redundant (nr) protein database (97). Similarly, we also used 604 BLASTn (120) to search against GenBank's nucleotide (nt) database (97) and the R. buchneri 605 genome assembly REISMNv1 (Accession: GCF_000696365.1) (121). An e-value retention 606 threshold of 1e -5 was used for all BLAST-based searches. Taxonomy could be assigned to 2,206 607 (32.1 %) reference transcripts by priority using matches from the R. buchneri genome with the 608 lowest e-value, over matches against any residual I. scapularis non-redundant proteins, then 609 followed by the lowest e-value from either the nr protein or nucleotide reference databases. 610 Taxonomic information was assigned based on the blast-generated taxonIDs using TaxonKit 611 GenBank's Sequence Read Archive (SRA) representing the raw sequence data analyzed in this 618 study. The authors have made these data available without restriction. 619 The authors have also made available, without restriction, the details related to the programs and 620 algorithms used to analyze these sequencing data, including suitable documentation regarding 621 their use, as a reproducible analysis pipeline through GitHub. Please see https://github.com/ 622 damselflywingz/tick_microbiome (archived version https://doi.org/10.5061/dryad.fqz612jw9) 32/52 and the above methods and materials for the relevant details. This analysis pipeline also includes 624 the computer code created to interpret data and generate the results presented in this study. 625