MRGM: a mouse reference gut microbiome reveals a large functional discrepancy for gut bacteria of the same genus between mice and humans

The gut microbiome is associated with human diseases and interacts with dietary components and drugs. In vivo mouse models may be effective for studying diet and drug effects on the gut microbiome. We constructed a mouse reference gut microbiome (MRGM, https://www.mbiomenet.org/MRGM/) that includes newly-assembled genomes from 878 metagenomes. Leveraging samples with ultra-deep metagenomic sequencing (>130 million read pairs), we demonstrated quality improvement in assembled genomes for mouse gut microbes as sequencing depth increased. MRGM provides a catalog of 46,267 non-redundant genomes with ≥70% completeness and ≤5% contamination comprising 1,689 representative bacterial species and 15.2 million non-redundant proteins. Importantly, MRGM significantly improved the taxonomic classification rate of sequencing reads from mouse fecal samples compared to previous databases. Using MRGM, we determined that reliable low-abundance taxa profiles of the mouse gut microbiome require sequencing >10 million reads. Despite the high overall functional similarity of the mouse and human gut microbiomes, only ~10% of MRGM species are shared with the human gut microbiome. Although ~80% of MRGM genera are present in the human gut microbiome, ~70% of the shared genera share <40% of the core gene content of the respective genus with their human counterparts. These results suggest that although metabolic processes of the human gut microbiome largely occur in the mouse gut microbiome, functional translations between them according to genus-level taxonomic commonality require caution.

Key Points
MRGM provides 46,267 genomes comprising 1,689 bacterial species of the mouse gut microbiome.
Despite the high overlap of genera, the functional discrepancy between mouse and human gut microbiota is large.
Lineage-specific markers underestimate the completeness of assembled genomes for uncharacterized taxa.


INTRODUCTION
During the past several decades, numerous studies have provided accumulated evidence of an association between the gut microbiome and human diseases (1). These findings inspired the idea of microbiome medicine, where the gut microbiome can be leveraged as a biomarker and therapeutic target to improve disease conditions (2)(3)(4). As the human reference genome has opened the era of genome medicine, the human reference gut microbiome will allow microbiome medicine to become feasible (5)(6)(7). Recently, a number of reference gut microbiomes specific for humans have been published (8)(9)(10), and the latest of these microbiomes provides ~232k non-redundant genomes comprising 5,414 prokaryotic species (10). These new catalogs exhibited substantially improved taxonomic and functional classification of metagenomic sequence reads compared to those of the previously used standard reference catalog. Mice can provide in vivo models for human gut microbiota research (11), and based on this, establishing comprehensive catalogs of mouse gut microbial genomes will also facilitate our understanding of the human gut microbiome.
To date, only two large-scale culture-based studies, miBC and mGMB, have reported 76 and 126 isolated species, respectively, from the mouse gut environment (12,13). Furthermore, the collection of metagenome-assembled genomes (MAGs) for the mouse gut microbiome lags behind that of the human gut microbiome. De novo genome assembly from gut metagenome sequencing samples obtained from 184 mice cataloged ~2.6 million non-redundant protein-coding genes for 541 microbial species (14). A more recently published reference gut microbiome for mice, iMGMC, cataloged 18,203 MAGs possessing medium quality (completeness ≥ 50% and contamination ≤ 10%), including those from co-assembly (all-in-one) (15). The co-assembly approach allows for the retrieval of low-abundance genomes by complementing sequence reads from different samples. However, this method is susceptible to mismatching of genomic fragments from different strains of the prevalent species (16).
Therefore, strain-level analysis requires MAGs derived only from a single-sample assembly.
Here, we present the mouse reference gut microbiome (MRGM) that provides 46,267 non-redundant genomes with a completeness of ≥ 70% and a contamination of ≤ 5% and is comprised of 1,689 representative prokaryotic species and ~15.2 million encoded proteins. In addition to the consolidated isolated genomes and MAGs available from the public repository, we deposited novel MAGs by conducting de novo assembly using public whole metagenomic sequencing (WMS) data from 838 mouse fecal or cecal samples and newly-generated WMS data from 40 mouse fecal samples. To improve the usability of the database, we provide a web server (www.mbiomenet.org/MRGM/) where users can browse the cataloged genomes with all relevant information.
Taxonomic and functional discrepancies in the gut microbiome between mice and humans will likely cause pitfalls in human gut microbiome research that is based on mouse models (11). Using MRGM, we performed taxonomic and functional comparisons of the gut microbiome between mice and humans. The information obtained from this analysis will provide useful guidelines for translating metagenomic research from mouse models into the human gut microbiome. Consistent with previous reports (14), the overall functional capacity of the human gut microbiome is largely present in the mouse gut microbiome; however, only ~10% of bacterial species from the mouse gut are shared in humans. Information from mouse gut microbiome research is often transferred to the human gut microbiome via shared genera between these microbiomes. Therefore, we evaluated the taxonomic and functional overlap between mouse and human gut microbiomes at the genus level. We observed that although ~80% of mouse gut bacterial genera are shared in the human gut microbiome, ~70% of the shared genera have <40% overlap of core gene content for the respective genus. These results suggest difficulties regarding the study of the human gut microbiome when using the taxonomic profiles of the mouse gut microbiome.

[...] that estimated the quality of genome bins using CheckM (25) and selected the best result.

Public MAGs and isolated genomes for mouse gut microbiome
We collected publicly available MAGs and isolated genomes for the mouse gut microbiome from MMGC (26) (as of February 2021) and from iMGMC (15) (only sample-specific MAGs were taken). We also collected isolated genomes from miBC (12), mGMB (13), and PATRIC (27).

Genome quality assessment
We assessed the completeness and contamination of genome bins using the lineage-specific workflow of CheckM v1.1.3 (25), which estimates both values from lineage-specific marker gene sets. Genomes were further assessed using GUNC v1.0.1 (28), a software program that detects genome chimerism and reports a clade separation score (CSS). We filtered out genomes exhibiting CSS > 0.45, as this is the default threshold of the software.
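As an illustration only, the quality filter described here can be sketched as a simple predicate. The record type, field names, and example values below are hypothetical; the thresholds are the ones stated in the text (completeness ≥ 70%, contamination ≤ 5%, CSS ≤ 0.45), and the sketch assumes the CheckM and GUNC values have already been parsed from their respective reports.

```python
from dataclasses import dataclass


@dataclass
class GenomeQuality:
    """Hypothetical record holding already-parsed quality metrics."""
    name: str
    completeness: float   # CheckM completeness, in percent
    contamination: float  # CheckM contamination, in percent
    css: float            # GUNC clade separation score, between 0 and 1


def passes_quality(g, min_completeness=70.0, max_contamination=5.0, max_css=0.45):
    """Keep a genome only if it satisfies all three thresholds."""
    return (
        g.completeness >= min_completeness
        and g.contamination <= max_contamination
        and g.css <= max_css
    )


# Made-up example values, not data from the study.
genomes = [
    GenomeQuality("bin_a", 92.1, 1.3, 0.10),
    GenomeQuality("bin_b", 68.0, 0.5, 0.05),  # fails completeness
    GenomeQuality("bin_c", 85.4, 7.2, 0.20),  # fails contamination
    GenomeQuality("bin_d", 99.0, 0.8, 0.60),  # flagged as chimeric by CSS
]
kept = [g.name for g in genomes if passes_quality(g)]
print(kept)
```

Keeping the thresholds as keyword arguments makes it easy to compare the 70% cutoff with the conventional 90% cutoff on the same set of genomes.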

Generation of genomic species clusters
We clustered genomes into species-level genome bins (SGBs) using a two-step iterative procedure that included fast preliminary clustering followed by more accurate secondary clustering. Preliminary clustering was performed by average-linkage hierarchical clustering at a cutoff of 0.2 on the Mash v2.2.2 (29) distance, which was calculated for all pairs of genomes using a sketch size of 10,000. Mash can calculate all pairwise distances between genomes relatively quickly; however, its accuracy is reduced when the coverage of the genome is low. We compensated for the low accuracy of the initial clusters through refined secondary clustering using the ANImf of dRep (16). ANImf was calculated for every pair of genomes within each initial cluster. To avoid overestimation of average nucleotide identity (ANI) by local alignment, a minimum coverage threshold was applied for each pair. The coverage cutoff of genome A and genome B was determined at min (0.8, [...] Genomes were clustered by ≥ 95% ANI, as this is equivalent to ANI among the same bacterial species (30). In each cluster, we chose as the representative the genome with the highest genome intactness score according to the equation [...]. Next, using the selected representative genomes, we repeated the preceding two-step clustering iteratively until no further genomes were clustered. When counting the conspecific genomes for each species, we clustered genomes by average-linkage hierarchical clustering at a cutoff of 0.001 using the Mash distance (Mash ANI 99.9%). We also selected only one genome per species per sample.
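The two-step idea can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the pipeline used in the study: the distances and ANI values are invented, the helper names are hypothetical, the secondary step reuses average linkage on 1 − ANI/100 as a stand-in for dRep's ANImf clustering, and the coverage threshold and representative selection are omitted.

```python
def average_linkage_clusters(dist, cutoff):
    """Naive average-linkage agglomeration on a square distance matrix.

    Clusters are merged while the smallest average inter-cluster
    distance is <= cutoff, which is equivalent to cutting the
    average-linkage dendrogram at that height.
    """
    clusters = [[i] for i in range(len(dist))]

    def avg(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > cutoff:
            break
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


def refine_by_ani(members, ani, cutoff=0.05):
    """Secondary clustering inside one preliminary cluster (ANI >= 95%)."""
    sub = [
        [0.0 if a == b else 1.0 - ani[frozenset((a, b))] / 100.0 for b in members]
        for a in members
    ]
    return [[members[i] for i in c] for c in average_linkage_clusters(sub, cutoff)]


# Invented Mash distances for four genomes (indices 0..3).
mash = [
    [0.00, 0.03, 0.15, 0.40],
    [0.03, 0.00, 0.16, 0.40],
    [0.15, 0.16, 0.00, 0.40],
    [0.40, 0.40, 0.40, 0.00],
]
# Invented ANI values (percent) for pairs inside the first preliminary cluster.
ani = {
    frozenset((0, 1)): 97.5,
    frozenset((0, 2)): 88.0,
    frozenset((1, 2)): 87.5,
}

primary = average_linkage_clusters(mash, cutoff=0.2)
species = [c for members in primary for c in refine_by_ani(members, ani)]
print(primary)
print(species)
```

The cheap Mash step only limits which pairs need the expensive alignment-based ANI; the species boundaries themselves come from the secondary step.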

Taxonomic annotation and phylogenetic tree construction
We conducted taxonomic annotation for species-representative genomes based on the Genome Taxonomy Database (GTDB) R95 (31). We used GTDB-Tk v1.4.1 (32) to classify query genomes to GTDB taxa based on its reference tree using 120 bacterial and 122 archaeal marker genes. GTDB-Tk aligned the marker genes and generated multiple sequence alignments of these genes for each species-representative genome. We inferred a phylogenetic tree using IQ-TREE v.2.0.3 (33) based on multiple sequence alignment of the concatenated sequences of 120 bacterial marker genes. For visualization of the phylogenetic tree, we used iTOL v5 (34).

Identifying 16S rRNA sequences from reference gut bacterial genomes
We predicted the 16S rRNA sequences from the genomes using barrnap v0.9 (35). We relaxed the e-value threshold from the default of 1e-06 to 1e-05. As highly conserved rRNA genes are difficult to predict from MAGs that are generated from short-read sequences, we used all non-redundant genomes rather than only representative genomes. We cataloged the 16S rRNA sequences from representative genomes by default. If the representative genome possessed no 16S rRNA sequence, we cataloged this sequence from the longest genome of the species cluster.
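The selection rule for the cataloged 16S rRNA sequence can be sketched as below. The dictionary layout and names are hypothetical, and the sketch assumes that "longest genome" means the longest conspecific genome in which a 16S rRNA sequence was actually predicted; the text does not spell this out.

```python
def pick_16s(representative, members):
    """Return (genome_name, sequence) for a species cluster, or None.

    The representative genome is preferred; otherwise the longest member
    genome that carries a predicted 16S rRNA sequence is used (assumption).
    """
    if representative.get("rrna16s"):
        return representative["name"], representative["rrna16s"]
    candidates = [m for m in members if m.get("rrna16s")]
    if not candidates:
        return None
    longest = max(candidates, key=lambda m: m["length"])
    return longest["name"], longest["rrna16s"]


# Made-up cluster: the representative lacks a 16S prediction.
rep = {"name": "rep", "length": 2_900_000, "rrna16s": None}
members = [
    rep,
    {"name": "mag_1", "length": 2_500_000, "rrna16s": "ACGTACGT"},
    {"name": "mag_2", "length": 2_700_000, "rrna16s": "ACGTTCGT"},
    {"name": "mag_3", "length": 3_100_000, "rrna16s": None},
]
choice = pick_16s(rep, members)
print(choice)
```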

Assessment of sequencing depth effect on MAG assembly
To test whether MAG assembly is improved by increased sequencing depth, we generated simulated datasets at different sequencing depths from 10 WMS samples subjected to ultra-deep sequencing (>130 million read pairs; read pair length of 300 bp). For each of the 10 samples, 0.5, 2.5, 5, 10, 20, 40, 80, and 125 million read pairs were randomly selected. Next, 80 simulated datasets (10 samples × 8 depths) were assembled using the same in-house pipeline. We tested whether the quality of the conspecific genomes was improved as the sequencing depth was increased. We performed average-linkage hierarchical clustering at a cutoff of 0.1 using the Mash distance (Mash ANI 90%). Thereafter, if the MAGs from the same sample but at different depths were clustered, the MAGs were considered to be conspecific genomes. We compared the quality of the conspecific MAGs from adjacent sequencing depths.
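A minimal sketch of the simulation design follows, assuming reads are addressed by pair index and drawn without replacement with a fixed seed. The function names, sample labels, and seed handling are illustrative assumptions; the text states only that read pairs were randomly selected, and real data would be subsampled from FASTQ files at the scale of millions of pairs.

```python
import random

# Depths in millions of read pairs, as listed in the text.
DEPTHS_MILLION = [0.5, 2.5, 5, 10, 20, 40, 80, 125]


def subsample_read_pairs(n_total_pairs, n_keep, seed):
    """Pick n_keep distinct read-pair indices reproducibly."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total_pairs), n_keep))


def build_design(sample_ids, depths=DEPTHS_MILLION):
    """Enumerate every (sample, depth) combination to be assembled."""
    return [(s, d) for s in sample_ids for d in depths]


samples = [f"sample_{i:02d}" for i in range(1, 11)]  # hypothetical labels
design = build_design(samples)
print(len(design))  # 80

# Toy-scale draw so the example runs instantly.
toy = subsample_read_pairs(n_total_pairs=1000, n_keep=50, seed=1)
```

Selecting pair indices rather than individual reads keeps mates together, which paired-end assembly requires.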
We also tested whether deep sequencing aided in the assembly of MAGs for low-abundance taxa. All 878 WMS samples used for MRGM construction were divided into two groups that included >10 Gbp and <10 Gbp WMS sample groups. We measured the relative abundance of taxa whose MAGs were assembled specifically from each group of samples and did not exist in iMGMC or MMGC (i.e., newly assembled in MRGM).

Evaluation of the taxonomic classification rates of the metagenomic sequencing reads
We compared taxonomic classification rates according to different reference gut microbial genome databases using Kraken2 v2.0.8-beta (43). The standard Kraken2 database was downloaded using "kraken2-build --standard". In addition to the standard Kraken2 database that is based on RefSeq prokaryotic genomes, we generated a custom Kraken2 database for iMGMC (15) [...] For the assessment of taxonomic classification rates, we used WMS data obtained from PRJEB37572 (44) and PRJNA730805 (45) that were not used for the construction of any of the genome catalogs. PRJEB37572 and PRJNA730805 contained 86 samples of single-end sequencing and 30 samples of paired-end sequencing, respectively. Prior to taxonomic classification, we also pre-processed the WMS data using Trimmomatic and Bowtie2. For paired-end sequencing samples, we ran Kraken2 with the "--paired" option. The taxonomic classification rate was calculated as the percentage of reads that were classified.
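For illustration, the classification rate can be computed from Kraken2's standard per-read output, in which the first tab-separated field is "C" for a classified read and "U" for an unclassified one. The function name and the example lines are hypothetical; this is a sketch of the calculation, not the script used in the study.

```python
def classification_rate(lines):
    """Percentage of reads flagged as classified ('C') in per-read output."""
    classified = 0
    total = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        total += 1
        if line.split("\t", 1)[0] == "C":
            classified += 1
    return 100.0 * classified / total if total else 0.0


# Made-up per-read records in the Kraken2 standard output layout.
example = [
    "C\tread1\t562\t150\t562:116",
    "U\tread2\t0\t150\t0:116",
    "C\tread3\t816\t150\t816:116",
    "C\tread4\t1239\t150\t1239:116",
]
rate = classification_rate(example)
print(rate)  # 75.0
```

With paired-end input, Kraken2 reports one record per read pair, so the same calculation then gives the percentage of classified pairs.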

Assessment of the sequencing depth effect on taxonomic profiling
Taxonomic features at the domain, phylum, class, order, family, genus, and species levels were stratified into eight groups according to mean relative abundance (ranging from 1e-7 to 1 with every ten-fold increase) using 10 WMS samples possessing the largest sequencing depth. For each of the 10 WMS samples, simulated datasets were generated for 10 different depths: 0.5, 1, 2.5, 5, 10, 20, 40, 60, 80, and 125 million read pairs. Taxonomic profiles for the 100 simulated datasets were generated using the custom Kraken2 database based on MRGM. As the group for mean relative abundance of <1e-7 included only three taxonomic features, we combined it with the group for mean relative abundance of <1e-6. We then calculated the Pearson correlation coefficient (PCC) and Spearman correlation coefficient (SCC) between the taxonomic profiles of relative abundance at different sequencing depths for each level of taxonomic features.
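The stratification and correlation steps can be sketched as follows. This is a simplified illustration: it groups taxa by their abundance in a single deep profile rather than by the mean across 10 samples, the function names and the toy profiles are invented, and only the PCC is shown.

```python
import math


def abundance_group(rel_abundance):
    """Return exponent k such that the taxon falls in the '<1e k' group.

    Groups run from <1e-6 (which also absorbs <1e-7) up to <1.
    """
    for k in range(-6, 1):
        if rel_abundance < 10.0 ** k:
            return k
    return 0  # a relative abundance of exactly 1 joins the top group


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def per_group_pcc(deep, shallow, min_taxa=3):
    """PCC between deep and shallow profiles within each abundance group."""
    groups = {}
    for taxon, a in deep.items():
        groups.setdefault(abundance_group(a), []).append(taxon)
    result = {}
    for k, taxa in groups.items():
        if len(taxa) >= min_taxa:
            result[k] = pearson(
                [deep[t] for t in taxa],
                [shallow.get(t, 0.0) for t in taxa],
            )
    return result


# Invented profiles: abundant taxa agree, rare taxa are mostly missed.
deep = {"t1": 0.5, "t2": 0.3, "t3": 0.15, "t4": 3e-5, "t5": 5e-5, "t6": 8e-5}
shallow = {"t1": 0.5, "t2": 0.3, "t3": 0.15, "t4": 0.0, "t5": 1e-4, "t6": 0.0}
pcc = per_group_pcc(deep, shallow)
print(pcc)
```

Computing the correlation per group, rather than over the whole profile, prevents the few dominant taxa from masking poor agreement among rare taxa.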

Comparison of core genes for the same bacterial genus between humans and mice
Protein clusters at 50% similarity from MRGM and HRGM were used to compare core genes for the same bacterial genus between humans and mice. Based on the GTDB taxonomic annotation, MRGM contains 278 genera. We conducted a comparison of the 220 genera that existed in both MRGM and HRGM. Genes that were conserved among ≥ 90% of non-redundant genomes of each genus were defined as 'genus core' genes. To compare core genes for the same bacterial genus between humans and mice, we grouped core genes of humans and mice together according to 50% sequence identity using linclust with parameter '--min-seq-id 0.5'. For each shared genus, we calculated the proportion of genus core genes shared between humans and mice.
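A minimal sketch of the genus core definition and the shared proportion is given below. It assumes each genome has already been reduced to a set of protein-cluster identifiers that are comparable between mouse and human (in the study this comes from linclust at 50% identity), and it uses the mouse core gene set as the denominator; the function names and toy sets are hypothetical.

```python
def genus_core(genomes, min_fraction=0.9):
    """Protein clusters present in at least min_fraction of the genomes.

    genomes: list of sets of protein-cluster IDs, one set per genome.
    """
    counts = {}
    for clusters in genomes:
        for c in clusters:
            counts[c] = counts.get(c, 0) + 1
    n = len(genomes)
    return {c for c, k in counts.items() if k >= min_fraction * n}


def shared_core_fraction(mouse_genomes, human_genomes):
    """Fraction of mouse genus core genes also in the human genus core."""
    mouse_core = genus_core(mouse_genomes)
    human_core = genus_core(human_genomes)
    if not mouse_core:
        return 0.0
    return len(mouse_core & human_core) / len(mouse_core)


# Invented cluster IDs for one genus.
mouse = [{"A", "B", "C", "D"}, {"A", "B", "C", "E"}]
human = [{"A", "X", "Y"}, {"A", "B", "X"}]
fraction = shared_core_fraction(mouse, human)
print(fraction)
```

In this toy case the mouse core is {A, B, C} and the human core is {A, X}, so one of three mouse core genes is shared.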

Cataloging genomes and proteins for 1,689 prokaryotic species in the mouse gut
Our in-house pipeline for cataloging the microbial genomes of the mouse gut is presented in Figure 1a. We combined pre-assembled genomes obtained from public databases and newly assembled genomes into MRGM. To assemble the novel genomes of the mouse gut microbiome, we collected public WMS data for 838 fecal or cecal samples that have not yet been used for metagenome assembly. Additionally, we generated in-house WMS data from [...], and this is equivalent to the medium quality (MQ) according to the minimum information about a metagenome-assembled genome (MIMAG) (46). Next, we combined the newly-assembled genomes with MAGs and isolated genomes from several public databases that included iMGMC (15) [...] Table 2).
We dereplicated 58,684 genomes iteratively to 1,689 species clusters of genomes (Figure 1b, Supplementary Table 3). We selected the genome with the highest quality in each species cluster (species-level genome bin, SGB) as its representative. Each representative genome was taxonomically annotated using GTDB-Tk (32). We observed only bacterial genomes and no archaeal genomes among the 1,689 species, and these were comprised primarily of Firmicutes_A (63.5%), Bacteroidota (14.5%), and Firmicutes (11.0%).
Identification of 16S rRNA sequences for each species will allow functional profiling via direct use of the corresponding species genomes and their gene content with amplicon-based metagenome analysis. We thus predicted the 16S rRNA sequence of representative species using the barrnap of Prokka (47). Highly conserved genomic regions such as 16S rRNA genes are typically not efficiently assembled from short-read sequencing data (48). To increase the chances of recovering 16S rRNA sequences, we utilized not only representative genomes but also other conspecific genomes for each species. Nevertheless, we identified 16S rRNA sequences for only 790 (46.8%) out of 1,689 species.
Next, we surveyed the prevalence of genera among the 878 mouse gut microbiomes used for MAG assembly in this study. The three major phyla (Firmicutes_A, Bacteroidota, and Firmicutes) contain the majority of genera that exhibit widely different prevalence levels across mouse gut microbiomes (Figure 1c). We identified 12 genera that were present in all 878 samples (Figure 1d) [...] Table 4). We performed functional annotation of the proteins using eggNOG-mapper (39). Notably, as the protein similarity decreased, the functional annotation rate decreased (Figure 1e). This may be due to the increased proportion of proteins that are specific for mouse gut microbiota and that cannot be annotated by known functions for orthologous groups such as eggNOG (51).

Sequencing depth correlates with MAG quality and genome assembly efficacy for low-abundance microbial taxa in mouse gut
We previously demonstrated the positive effect of sequencing depth on MAG assembly for human gut microbes (10). Microbiota complexity (i.e., the number of species) may affect the efficiency of MAG assembly, and it is much lower in the mouse gut than it is in the human gut. To confirm the relationship between sequencing depth and MAG quality, we generated simulated datasets for different sequencing depths using 10 WMS samples subjected to ultra-deep sequencing (>130 million read pairs). For each of the 10 samples, 0.5, 2.5, 5, 10, 20, 40, 80, and 125 million read pairs were randomly selected. We assembled MAGs for 80 simulated datasets (10 samples × 8 depths) and then compared the quality of conspecific MAGs in two different simulated samples at adjacent sequencing depths. We observed that MAGs from mouse gut metagenomes with greater sequencing depth possessed significantly higher quality in regard to completeness, N50, and contamination (Figure 2a-c).
Improved de novo genome assembly may enable the reconstruction of MAGs for low-abundance taxa. To test this hypothesis, we divided the 878 WMS samples that were used for MRGM construction into two groups based on sequencing depth: <10 Gbp and >10 Gbp. After filtering out MAGs that also existed in iMGMC or MMGC, we obtained two groups of MAGs that were specific for different ranges of sequencing depth. We then used Kraken2 to estimate the relative abundance of taxa that were specific for each group in 10 WMS samples possessing the largest sequencing depths (>80 Gbp or >260 million read pairs) of PRJNA603829 (https://www.ebi.ac.uk/ena/browser/view/PRJNA603829), which were not included in MRGM. We observed a lower range of abundance for taxa that were specifically assembled from WMS samples with higher sequencing depth (>10 Gbp) (Figure 2d). These results suggest that deep sequencing may aid in MAG assembly of low-abundance bacterial taxa in the mouse gut.

CheckM with lineage-specific markers underestimates MAG completeness
A previous catalog, iMGMC, provided mouse gut microbial genomes with MQ. We expected a higher quality of MAGs from the mouse gut microbiome due to its lower complexity compared to that of the human gut microbiome. Thus, we initially used the higher-level [...] (Figure 3a). Furthermore, genomes for the phyla Cyanobacteria and Firmicutes_B exhibited phylum-wide exclusion at a completeness cutoff of ≥ 90% (Figure 3b). To test our hypothesis, we measured completeness using SCGs for all bacterial reference genomes instead of lineage-specific genomes. Notably, genome completeness for the five orders that were predominantly <90% was significantly increased by CheckM evaluation based on bacterial marker genes (Figure 3c). For example, the majority of genomes for the TANB77 order exhibited completeness between 70% and 80% based on lineage-specific markers, whereas this value was between 90% and 100% based on the bacterial markers. In contrast, genomes for all other orders possessed significantly decreased completeness based on bacterial markers instead of lineage-specific markers. These results suggest that CheckM completeness estimation based on lineage-specific markers works properly for relatively well-established taxa groups but not for uncharacterized ones. We also found that the ratio of species clusters with only a single member genome (singleton) gradually increased as the genome completeness decreased, and genomes with completeness of 50% were comprised predominantly of singletons (Figure 3d). Therefore, for MRGM we used 70% as the completeness cutoff, both to salvage genomes for the five orders that would be excluded by the conventional criterion for high-quality genomes and to avoid too many singletons that are likely to be due to misassembled MAGs (Figure 3e).

[...] test). These results suggest that the expanded microbial genome catalog specific for the mouse gut may improve the efficacy of taxonomic profiling of mouse gut metagenomes.

Reliable estimation of low-abundance bacterial taxa of the mouse gut requires deep sequencing
A previous study demonstrated a high correlation between overall taxonomic profiles of the human gut microbiome obtained by shallow sequencing (0.5-2 million reads) and those obtained by ultra-deep sequencing (2.5 billion reads) (54). Nevertheless, the correlation may not be equally high for low-abundance taxa profiles. Therefore, we sought a minimum sequencing depth to obtain reliable profiles for low-abundance taxa in the mouse gut microbiome. First, we stratified taxonomic features at the domain, phylum, class, order, family, genus, and species level according to mean relative abundance (ranging from 1e-7 to 1 with every tenfold increase) using 10 WMS samples with the largest sequencing depth (>80 Gbp or >260 million read pairs) of PRJNA603829 (https://www.ebi.ac.uk/ena/browser/view/PRJNA603829) that were not used for MRGM (Figure 4a). As the group <1e-7 included only three taxonomic features, we merged it with the group <1e-6. Taxa possessing a relative abundance of <1e-6 accounted for less than 5%, whereas those with <1e-5 comprised 27.44% of the mouse gut microbiome (Figure 4b). Therefore, we chose a sequencing depth that could obtain a high correlation of abundance profiles for taxa <1e-5 with those obtained by ultra-deep sequencing. To examine the correlation between taxonomic profiles at different sequencing depths, we generated simulated datasets for 10 different depths (0.5, 1, 2.5, 5, 10, 20, 40, 60, 80, and 125 million read pairs) from each of the 10 WMS samples. Correlations between taxonomic profiles for the 100 simulated datasets were calculated using the Pearson correlation coefficient (PCC) (Figure 4c). Notably, a shallow sequencing depth, 0.5 million reads, could not achieve a PCC > 0.8 for profiles of taxa with a relative abundance of <1e-4, and these accounted for >62% of the mouse gut microbiome (Figure 4d).
Our analyses suggest that we must sequence >10 million read pairs to achieve a PCC > 0.9 for profiles for taxa, including those possessing a relative abundance of <1e-5 that account for >27% of the mouse gut microbiome. We observed similar results with a slightly lower overall correlation based on the Spearman correlation coefficient (SCC) (Figure 4e).

Functional capacity of the mouse gut microbiome is largely shared in the human gut microbiome
Notably, although HRGM contains bacterial genomes present in the gut environment, it exhibited an unexpectedly low classification rate for sequencing reads from mouse gut metagenomes (Figure 3f). This suggests that bacterial taxa in the gut environment are distinct between humans and mice. A previous study demonstrated that the majority of the KEGG orthology (KO) terms supported by the mouse metagenome catalog are also present in the human metagenome catalog (14), and we could reproduce similar degrees of similarity between MRGM and HRGM in regard to the total KO, Gene Ontology (GO), and carbohydrate-active enzymes database (CAZy) (42) terms (Figure 5a). Therefore, despite the taxonomic discrepancy in the gut microbiome, the overall functional capacity is largely shared between mice and humans.
Next, we compared the functional annotation rates of gut microbial proteins between MRGM and HRGM. We observed that functional annotation rates were generally lower for mouse gut microbial proteins than they were for humans (Figure 5b). In particular, the GO annotation rate for mouse gut microbial proteins was less than 3% at all similarity levels. Notably, KO annotation rates for gut microbial proteins specific for mice were substantially lower than were those for proteins that were conserved in humans (Figure 5c). These results demonstrate challenges regarding the functional annotation of mouse gut microbial proteins that have no orthologous proteins.

Only ~10% of mouse gut bacterial species are present in the human gut
We next evaluated the taxonomic overlap between MRGM and HRGM. To perform a taxonomic comparison of genomes of the same quality, human gut microbial genomes from HRGM were further filtered for CheckM completeness ≥ 70% and GUNC CSS < 0.45. We also excluded archaeal genomes of HRGM, as MRGM contains no archaeal genomes. [...] Table 5). The majority of the shared bacterial species belonged to the phyla Proteobacteria, Firmicutes, and Bacteroidota (Figure 6c).

Gut microbes of the same genus typically share a minor portion of core gene content between mouse and human
Conversely, we observed high overall taxonomic similarity at the genus level and at higher taxonomic ranks, for which most of the MRGM genomes had HRGM genomes in the same taxonomic groups (Figure 6d). For example, 220 of 278 (79.1%) MRGM genera possessed genomes for the same genus in HRGM (Supplementary Table 6).
In particular, the MRGM phylum Firmicutes_A, which shares few species with HRGM, exhibited numerous shared genera (Figure 6c). Nevertheless, bacteria for the same taxonomic groups in the same body site could be dissimilar in their core gene content, and in turn, they could carry different functional repertoires. Therefore, we first identified 'genus core' genes, defined on protein groups at 50% similarity as those conserved among ≥ 90% of non-redundant genomes for each genus, and we then measured the proportion of MRGM core genes shared with HRGM for each genus (Materials and Methods). Notably, the majority of the genera for the three major phyla of MRGM (Firmicutes_A, Bacteroidota, and Firmicutes) possessed <40% of core genes that overlapped with their HRGM counterparts (Figure 6e). Among all 220 shared genera of MRGM, 155 (70.5%) and 119 (54.1%) genera shared <40% and <30% of core genes, respectively (Supplementary Table 7). These results indicate that many intestinal commensal bacteria of the same genus may play distinct roles in the mouse and human gut. The results also suggest that mouse gut microbiome research may not be easily translated into human microbiome research by genus-level taxonomic commonality.

DISCUSSION
In the present study, we constructed a reference gut microbiome for mouse, MRGM, which provides an expanded catalog of genomes and proteins by MAG assembly from 878 new samples with deep metagenomic sequencing. We demonstrated a significantly improved classification rate of metagenomic sequencing reads by MRGM compared to those of the standard database of the prokaryotic genome and of previously published databases of mouse gut microbial genomes. These results indicate that gut microbial dark matter was effectively uncovered through the additional MAG assembly in the present study.
Our work is distinct from earlier studies on the construction of the mouse gut microbiome catalog in several aspects. First, we found that the current database of lineage-specific markers underestimated the completeness of MAGs for many of the relatively uncharacterized taxa. Thus, we cataloged MAGs using a threshold of 70% for CheckM completeness. Genomes for unculturable bacteria are relatively uncharacterized, and thus, the majority of the genomes with completeness between 70% and 90% may represent unculturable bacterial taxa. Given that the majority of intestinal commensal species are uncultivable, this new criterion for MAG completeness will provide more opportunities for exploring gut microbial dark matter. Second, we found that reliable profiles for low-abundance taxa require deep sequencing rather than shallow sequencing. Therefore, we recommend sequencing >10 million read pairs for taxonomic profiling based on