Taxonomic survey of Anadenobolus monilicornis gut microbiota via shotgun nanopore sequencing

Millipedes are among the many soil-inhabiting organisms that contribute to litter decomposition and nutrient recycling in terrestrial ecosystems. This is thanks in part to the microbial diversity contained in their gut compartments. However, millipedes and their gut microbiota are understudied compared to other arthropods. For this reason, we undertook a metagenomic analysis of the gut of Anadenobolus monilicornis. We collected specimens of A. monilicornis from different municipalities of Puerto Rico and starved them for varying amounts of time. Once the DNA from their guts was extracted and sequenced using the MinION nanopore sequencer, we analyzed and compiled the resulting data using programs such as Phylosift and MEGAN6 and the web-based MG-RAST. From our two best samples, we obtained a total of 87,110 and 99,749 reads, respectively. After comparing the data analyses and gene annotation done for both samples, we found that the bacterial phyla Proteobacteria, Bacteroidetes and Firmicutes were consistently well represented; one of our samples, however, had much more Chlamydiae representation than the other. Sampled eukaryote phyla include Arthropoda, Chordata and Streptophyta. We would need a greater sample size to better determine differences in microbial diversity between millipede populations across the island; despite our small sample size, however, we were able to broadly reveal the diversity within the microenvironment of the A. monilicornis gut.


Introduction
Millipedes are a group of arthropods belonging to the class Diplopoda and the subphylum Myriapoda. With around 12,000 described species worldwide, they are a diverse group found in a variety of habitats from humid rainforests to xeric deserts [1][2][3]. Millipedes are one of many soil- [...] which aid in digestive processes [11]. Dilution plating techniques have shown that some millipede species harbor an abundant diversity of proteobacteria and actinobacteria [12].

There have been a small number of microbial community surveys of the gut of millipedes. In some species, the most dominant bacteria were found to belong to the Enterobacteriaceae family; in addition, ascomycetes were the most common yeast strains found [13]. A desert-dwelling millipede species possesses gut bacteria that can degrade cellulose, contributing to nutrient cycling in deserts [1]. In addition, certain species from the millipede orders Julida, Spirobolida, and Spirostreptida harbor an association between methanogenic archaea and ciliate protozoa in their hindguts, contributing to methane production [14]. The diversity of bacteria and other microorganisms that occur in millipede guts might be of interest to field ecologists and microbiologists alike, as the interactions between these organisms affect both soil nutrient recycling and organic matter decomposition [15,16]. A full genetic or metagenomic approach to these problems, however, has yet to take off.

In this study, we focus on the microbiota that inhabits the digestive tract of Anadenobolus monilicornis. A. monilicornis is a species native to the Caribbean which has also been introduced to Florida (U.S.A.), where it is treated as a pest [17,18] [...] by Oxford Nanopore Technologies (ONT). Our objective is to begin to develop protocols for shotgun sequencing of millipede guts via ONT, and to develop an initial fingerprint of the microbial diversity found within A. monilicornis using ONT as a tool to elucidate this complex system.

[...] The gut extraction and DNA extraction work were done in the Symbiosis laboratory at the University of Puerto Rico, Mayagüez Campus. Following workstation and lab material sterilization with 10% bleach, the head and the last two or three segments of the abdomen of the specimens were cut and removed with a scalpel; the abdomen was cut to facilitate gut extraction.

The guts were removed and placed in 2mL tissue disruption tubes, where they were liquefied by manually shaking the tubes. [...] For all samples, the mixtures were moved to a QIAamp Mini Spin Column and centrifuged for one minute at 15,000 rpm. The spin column was then placed in a clean 2mL collection tube, while the previous tube and filtrate were discarded. 500µL of AW1 Buffer was added to the spin column before centrifuging again for one minute at 15,000 rpm; the spin column was placed in another 2mL collection tube, and the previous tube discarded. 500µL of AW2 Buffer was added to the spin column before centrifuging. The spin column was then placed in a new collection tube, centrifuged again for two minutes, and later placed in a clean 1.5mL microcentrifuge tube. 50µL of nuclease-free water was added directly into the spin column, which was then left for one minute at room temperature and later centrifuged for one minute.

This last step was repeated once to increase yield. After this step, the Oxford Nanopore Technologies (ONT) 1D PCR barcoding genomic DNA (SQK-LSK108) procedure for version R9 chemistry was followed, with some minor alterations.

A master mix of 14.14µL of Fragmentase buffer and 2.2µL of 10X NEBNext® dsDNA Fragmentase® (NEB cat. No. M0348s) was prepared first. In new tubes, we combined 32µL of each sample with 8µL of the master mix. The new tubes were vortexed for two seconds and spun down; they were then placed on a thermocycler for five minutes at 37℃ followed by approximately five minutes at 4℃. To heat-kill the Fragmentase, 5µL of EDTA was added and the tubes were placed on a thermocycler for 15 minutes at 65℃ followed by 10 minutes at 5℃. We aimed to produce DNA fragments of 5,000-30,000 bp. DNA quality was checked by mixing 2µL of each sample with 3µL of loading dye and running the mixture on a 1X gel at 66V for 30 minutes.
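The volumes in the fragmentation step can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it takes the four volumes quoted above and assumes that the master mix is dispensed in equal 8µL aliquots; the function name and the dictionary keys are hypothetical and not part of any published protocol.

```python
# Volumes quoted in the text, in microlitres.
BUFFER_UL = 14.14          # Fragmentase buffer in the master mix
ENZYME_UL = 2.2            # dsDNA Fragmentase in the master mix
MIX_PER_TUBE_UL = 8.0      # master mix added to each reaction tube
SAMPLE_PER_TUBE_UL = 32.0  # DNA sample added to each reaction tube


def master_mix_summary(buffer_ul, enzyme_ul, mix_per_tube_ul, sample_per_tube_ul):
    """Summarise how far a master mix stretches (hypothetical helper).

    Assumes the mix is homogeneous, so every aliquot carries buffer and
    enzyme in the same proportion as the bulk mix.
    """
    total_mix = buffer_ul + enzyme_ul
    tubes_supported = int(total_mix // mix_per_tube_ul)
    enzyme_fraction = enzyme_ul / total_mix
    return {
        "total_mix_ul": total_mix,
        "tubes_supported": tubes_supported,
        "leftover_ul": total_mix - tubes_supported * mix_per_tube_ul,
        "buffer_per_tube_ul": mix_per_tube_ul * (1.0 - enzyme_fraction),
        "enzyme_per_tube_ul": mix_per_tube_ul * enzyme_fraction,
        "reaction_volume_ul": mix_per_tube_ul + sample_per_tube_ul,
    }


if __name__ == "__main__":
    summary = master_mix_summary(
        BUFFER_UL, ENZYME_UL, MIX_PER_TUBE_UL, SAMPLE_PER_TUBE_UL
    )
    for key, value in summary.items():
        print(f"{key}: {value:.2f}")
```

With the quoted volumes, the 16.34µL master mix covers two 8µL aliquots with about 0.34µL to spare, and each 40µL reaction receives roughly 6.92µL of buffer and 1.08µL of enzyme.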

Leftover enzymes were removed using Agencourt® Ampure® XP beads: 50µL of each sample was added to 90µL of Ampure XP beads and mixed 10 times by pipetting. The mixture was left at room temperature for five minutes, then placed on a magnetic rack for two minutes.

The cleared solution was then aspirated out. The process was repeated but with 200µL of 70% ethanol followed by aspiration. Finally, 48µL of nuclease-free water was added, and the eluate was aspirated into new 1.5 mL tubes and carried forward in the protocol.

Quality filtered reads and summary statistics
We were able to de-multiplex the two millipede gut samples via albacore, to which we will be [...]

Overall, the Phred quality scores were decent for nanopore data, with the majority of the reads having a Phred quality score greater than 9, i.e. 80% base call accuracy or better (Fig 1). However, the read length distribution was much shorter than expected, and we had few reads that were approximately 10kbp in length (Fig 1). The result of this was that we were able to [...] MEGAN6 produced a summary taxonomic tree of the phyla sampled showing the distribution of reads across phyla (Fig 2). Finally, the taxon accumulation curve created via MEGAN6 (Fig 3) starts to plateau around 20 phyla for the Mayagüez sample, and around 15 phyla for the Rincón sample, indicating that we will most likely discover 15 or more phyla in total with continued sampling.

[...] total of 187 reads for Phylosift (Fig 4), 673 reads for MG-RAST (Fig 5) and 356 reads for MEGAN6 (Table 3). The Rincón sample had Bacteroidetes and Proteobacteria as the most well represented phyla, with a total of 15 and 10 reads for Phylosift (Fig 4), 147 and 204 reads for MG-RAST (Fig 5), and 24 and 224 reads for MEGAN6 (Table 2), respectively. Phylosift indicated that Bacteria represented 87% of the sampled reads for the Mayagüez sample, and 96% for the Rincón sample (Fig 4). Chlamydiae was the most abundant phylum sampled for the Mayagüez sample (187 total reads), while Bacteroidetes was the most abundant phylum sampled for the Rincón sample (15 total reads) (Fig 3).
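The Phred threshold quoted in the quality summary above can be translated into per-base accuracy with the standard Phred definition, Q = -10 log10(P_error). The sketch below is a generic illustration of that formula, not part of the albacore output; the function names are hypothetical. Under this definition a score of 9 corresponds to roughly 87% accuracy, while 80% accuracy corresponds to a score of about 7, so reads above Q9 comfortably satisfy the "80% or better" statement. Per-read scores reported by a basecaller are themselves estimates, so the conversion should be read as approximate.

```python
import math


def phred_to_accuracy(q):
    """Per-base call accuracy implied by a Phred score q: 1 - 10^(-q/10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy):
    """Phred score implied by a per-base accuracy strictly between 0 and 1."""
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    # Print the accuracy implied by a few representative scores.
    for q in (7, 9, 10, 20):
        print(f"Q{q}: {phred_to_accuracy(q):.1%} accuracy")
    print(f"80% accuracy is about Q{accuracy_to_phred(0.80):.1f}")
```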
According to Phylosift, the two samples had roughly the same number of reads for the protist phyla Alveolata and Stramenopiles (Fig 6). The MG-RAST analysis showed a majority of [...] (Table 3). MG-RAST analysis showed that most of the annotated metabolic reads belonged to core cellular metabolism, followed by genetic and environmental metabolic pathways (Fig 7). The metabolic pathways, created via iPath3, can be seen in detail in Figures 8 and 9.

[...] It is surprising that we did not get many annotated Nematoda reads; with Phylosift, for example, we were unable to obtain any Nematoda reads (Table 3). We expected to find more, since we visually identified nematodes inside the extracted guts before sequencing them. We also expected a larger number of reads for the fungal phylum Ascomycota; according to the work of [...] to Chordata (Fig 2, Fig 5), and MG-RAST annotation for metabolism returned reads related to human diseases (Fig 7).

[...] sequencing the gut of a millipede using nanopore sequencing. We hope to be able to continue