Variability and Bias in Microbiome Metagenomic Sequencing: an Interlaboratory Study Comparing Experimental Protocols

Background
Several studies have documented the significant impact of methodological choices in microbiome analyses. The myriad methodological options available complicate the replication of results and generally limit the comparability of findings between independent studies that use differing techniques and measurement pipelines. Here we describe the Mosaic Standards Challenge (MSC), an international interlaboratory study designed to assess the impact of methodological variables on the results. The MSC did not prescribe methods but rather asked participating labs to analyze 7 shared reference samples (5x human stool samples and 2x mock communities) using their standard laboratory methods. To capture the array of methodological variables, each participating lab completed a metadata reporting sheet that included 100 different questions regarding the details of their protocol. The goal of this study was to survey the methodological landscape for microbiome metagenomic sequencing (MGS) analyses and the impact of methodological decisions on metagenomic sequencing results.

This study was the result of a collaborative effort that included academic, commercial, and government labs. In addition to highlighting the impact of different methodological decisions on MGS result comparability, this work also provides insights for consideration in future microbiome measurement study design.

Introduction
Over the last decade, advances in DNA sequencing technology (Next-Generation Sequencing or NGS) have led to its widespread adoption by the scientific community for myriad applications. One such application, known as metagenomic sequencing (MGS), has transformed how we measure and characterize complex microbial communities, or microbiomes. MGS has emerged as an important and powerful tool as we seek to comprehend the roles of microbes inside complex and dynamic communities that are capable of both maintaining and harming human and environmental health. MGS measurements are able to 'see' whole classes of microorganisms present in a microbiome sample (e.g., all bacteria by 16S rRNA gene amplicon sequencing (16S), or all dsDNA by whole-genome shotgun sequencing (WGS)); MGS can also assign a relative abundance to each microorganism in complex samples. [1][2][3][4] Because of these advantages, MGS is being increasingly adopted across diverse application spaces including infectious disease diagnostics, [5][6][7][8][9][10][11][12] epidemiological investigations, [13][14][15] food safety, [16] and biothreat surveillance. [9][17][18][19] The results of MGS measurements have been used to diagnose infectious diseases that were missed by conventional methods. [20,21] As such, regulatory agencies are actively developing new guidance and policies regarding the use of MGS in the clinic and in other regulated spaces.
While MGS measurements hold great promise in monitoring and understanding microbial communities, the current impact is often hampered by a lack of reproducibility and comparability, particularly between different research centers. [22][23][24] MGS measurement results are the product of complex workflows incorporating multiple distinct steps and involving a multitude of methodological choices (e.g., sample collection and storage, DNA extraction and purification, NGS library preparation either for WGS or 16S, DNA sequencing platform, data cleanup and processing, bioinformatic analysis, interpretation). Throughout this workflow, measurement bias (deviation from ground truth) and measurement noise (experimental variability) are potentially introduced with each step and will depend on the particular methodological choices made. [25] It is widely recognized that the interlaboratory reproducibility of MGS microbiome measurements is poor, and there have been numerous efforts aimed at benchmarking the analytical performance of MGS measurements in terms of sensitivity, specificity, precision, reproducibility, etc. [26][27][28][29][30][31][32][33] These challenges are well-documented, and the community has long recognized the need for studies to prioritize and investigate the sources of variability and bias in the experimental workflow [27] and the need for standardized materials and methods to improve the comparability and scope of MGS measurement results.

Designing the studies to identify sources of variability and bias as outlined above comes with its own set of challenges, including: sufficient numbers and diversity of reference samples to help power the study; testing of a wide range of variables; a lack of consistent data analysis; and cost and coordination. While the task may seem daunting, several groups have taken up the call to begin to address these challenges.
In recognition of the complexity of the workflow, some groups have broken the MGS workflow into more manageable sections with most of the focus being directed at characterizing the effect of data processing and analysis either using in silico datasets [32,33]

Briefly, the study consisted of three components: reference material selection and production, broad participation from the microbiome community including metadata reporting and MGS data uploads, and common analysis pipelines applied to the raw sequencing data alongside the methodological metadata from each participating laboratory. The timeline and overall workflow of the MSC are shown in Figure 1.

Material Production
The reference samples selected and distributed in this study included 5 human stool samples and 2 DNA mixtures (mock DNA communities). The five stool samples were selected from a pool of potential donors based on the dissimilarity of their microbiome composition (Figure 2 [...]

[...] were typically a log higher with >10^6 reads (Figure SI-2). Significant variation in read number was observed both between participating labs and individual samples (Figure SI-2).

[...] (Figure 4a and 4b) demonstrate that the biological variability (i.e., stool sample ID) was the major factor influencing the overall ordination of the data, as expected. The impact of methodological variability can be seen via the dispersal of datasets within each stool sample. It is noteworthy that the outputs from the 16S datasets and the WGS datasets were so dissimilar that separate PCoA analysis plots were required. From the PCoA plot of the 16S data (Figure 4a), we observed that one of the participating labs made an apparent transposition in the labeling of samples 3, 4, and 5. Based on this apparent error, we excluded all the data (stool samples 1-5) from this lab for the remainder of the analyses described in this manuscript.

[...] sample were expected to be more reliable because the effects of sample composition on each taxon's relative abundance could cancel out. One ratio that has been of interest in the field is the ratio of phyla Firmicutes:Bacteroidetes; therefore, we chose this ratio to demonstrate the utility of using ratios of taxa to compare data between samples [39][40][41][42]. Thus, this ratio was utilized and included in our results purely for its bioinformatic utility and is not intended to serve as an indicator of gut health or dysbiosis. The Firmicutes:Bacteroidetes ratio was calculated for each Mosaic stool sample and compared among the individual laboratory results (Figure 5).
As expected, since each laboratory used its own MGS protocols (e.g., methodological choices for DNA extraction, library preparation, and sequencing), the Firmicutes:Bacteroidetes ratio varied substantially both between stool samples within each lab and between labs.
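The reasoning behind using taxon ratios can be illustrated with a minimal Python sketch. It is not the analysis code used in the study; the function names and the read counts are hypothetical and chosen only to show that a ratio of two phyla is unchanged when a third taxon is over- or under-represented, whereas each raw relative abundance shifts.

```python
def relative_abundance(counts):
    """Convert a dict of taxon -> read count into taxon -> fraction of total reads."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def phylum_ratio(counts, numerator="Firmicutes", denominator="Bacteroidetes"):
    """Return the numerator:denominator ratio from a dict of taxon -> abundance."""
    if counts.get(denominator, 0) == 0:
        raise ValueError("denominator phylum has zero abundance")
    return counts.get(numerator, 0) / counts[denominator]


# Hypothetical read counts for one sample (illustrative values only).
sample = {"Firmicutes": 600, "Bacteroidetes": 300, "Proteobacteria": 100}
# Same sample, but with a third phylum strongly over-represented.
inflated = dict(sample, Proteobacteria=1100)

ratio_original = phylum_ratio(sample)
ratio_inflated = phylum_ratio(inflated)
firmicutes_original = relative_abundance(sample)["Firmicutes"]
firmicutes_inflated = relative_abundance(inflated)["Firmicutes"]

print(ratio_original, ratio_inflated)            # the ratio is unchanged
print(firmicutes_original, firmicutes_inflated)  # the relative abundance is not
```

In this toy case the Firmicutes relative abundance halves when Proteobacteria reads are inflated, while the Firmicutes:Bacteroidetes ratio stays the same, which is the compositional effect the ratio is meant to remove.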

Of note, data submission was anonymous, so multiple submissions from the same research center would appear as distinct labs.

Amplicon vs. Shotgun sequencing
One goal of the MSC was to determine how the selection of different methodological parameters during MGS would lead to observed differences in the taxonomic profiles and relative abundances. The highest-level methodological choice was between 16S MGS or WGS MGS. Indeed, the Firmicutes:Bacteroidetes ratio was affected by the type of sequencing performed, with 16S MGS analyses reporting significantly higher Firmicutes:Bacteroidetes ratios (Figure 6a). While the majority of the 16S MGS datasets indicated that Firmicutes were present at a higher relative abundance than Bacteroidetes, WGS data found the inverse, with Bacteroidetes being present at a higher relative abundance than Firmicutes. The magnitude of this effect was quantified by averaging the results from all labs reporting each methodological parameter (e.g., 16S or WGS for sequencing strategy) divided by the average result overall and plotted as a fold change on a log scale (Figure 6b). The dependence of the Firmicutes:Bacteroidetes ratio on analysis strategy that was observed in this dataset could explain recent reports that question the reliability of the Firmicutes:Bacteroidetes ratio as a diagnostic indicator of gut health [47]. This dependence was consistent across all five stool samples (Figure SI-3), and an analysis of each sample's alpha diversity also yielded a similar stratification with respect to the methodological choice of 16S or WGS analysis (Figure SI-4).
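The fold-change calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's pipeline: the record layout, the field names, the use of arithmetic means, and the choice of log base 2 are all assumptions (the text specifies only "a log scale"), and the ratios are made-up values.

```python
import math
from collections import defaultdict


def parameter_effects(records, parameter, value_key="ratio"):
    """Return {choice: log2(mean value for that choice / overall mean value)}.

    records   -- list of dicts, one per lab submission (hypothetical layout)
    parameter -- metadata field to group by, e.g. "strategy"
    """
    overall_mean = sum(r[value_key] for r in records) / len(records)
    groups = defaultdict(list)
    for r in records:
        groups[r[parameter]].append(r[value_key])
    return {
        choice: math.log2((sum(values) / len(values)) / overall_mean)
        for choice, values in groups.items()
    }


# Hypothetical Firmicutes:Bacteroidetes ratios for one stool sample.
submissions = [
    {"lab": "A", "strategy": "16S", "ratio": 3.0},
    {"lab": "B", "strategy": "16S", "ratio": 5.0},
    {"lab": "C", "strategy": "WGS", "ratio": 0.5},
    {"lab": "D", "strategy": "WGS", "ratio": 1.5},
]

effects = parameter_effects(submissions, "strategy")
for choice, log_fold_change in sorted(effects.items()):
    print(f"{choice}: log2 fold change = {log_fold_change:+.2f}")
```

A positive value means labs using that choice reported, on average, a higher ratio than the overall average; a negative value means a lower one. Groups containing a single lab (n=1) still produce a number here, so any real use would need to carry the group size and a confidence interval alongside the effect.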

Other metadata parameters
When submitting results, participating labs were asked to complete a standardized metadata reporting sheet that included 100 different questions regarding the details of their protocol. Some questions were generally applicable, like "what sequencing instrument did you use," while others were more nuanced, like "what was the PCR primer set used." As such, some fields were required, and others were optional. Because of the large impact generated by the 16S vs. WGS methodological variable (Figure 6) and the hierarchical nature of other methodological choices (e.g., 'What was the target gene amplicon'), we chose to analyze each data set separately. The effect on the Firmicutes:Bacteroidetes ratio in the 16S MGS results was quantified for each subsequent methodological choice (Figure 7) in a similar manner to that employed in Figure 6b. While there were many methodological variables that appeared to have a significant impact on the results (Figure 7; similar analysis for other stool samples is included in Figure SI-5), many of these were only reported by a single lab (n=1). Of the 30 labs submitting 16S MGS data, there were 14 methodological differences in their protocols. Of the 14 labs submitting WGS data, there were 9 methodological differences in their protocols (Figure SI-6). Not all methodological variables had a significant impact on the result. Methodological variables that were observed to have a significant impact on the 16S MGS results for 2 or more stool samples (parameter effect and 99% confidence interval) included the manufacturer of the DNA extraction kit and the target gene for amplification (Figure 7 and SI-5).
Methodological variables that were observed to have a significant impact on the WGS results (parameter effect) for 2 or more stool samples included the DNA extraction protocol, the manufacturer of the DNA extraction kit, and the library kit for shotgun sequencing. In addition to their impact on the parameter effect as described above, some methodological variables were observed to have a significant impact on the robustness of the measurement (observed as a lack of variability when other parameters are varied). For example, when asked whether a homogenizer or shaking apparatus was used, those labs that reported "Yes" displayed much greater robustness than those reporting "No" (Figure 7 and SI-5 and SI-6).

'Spike-in' organisms
An additional attribute of the fecal materials used for this interlaboratory study was the inclusion of two exogenous organisms to serve as whole-cell internal controls (i.e., spike-ins).

Since these organisms were added during the bulk homogenization step, their abundance should be constant across all the stool sample aliquots. As such, it was expected that the ratio of A. fischeri to L. xyli would be constant for each particular methodology (e.g., within a lab). Surprisingly, L. xyli was not identified in any of the submitted 16S datasets and was only observed at a low abundance (approximately 0.001 %) by WGS analysis. When the A. fischeri:L. xyli ratio (by WGS) was plotted for each participating laboratory (Figure SI-7), significant variability between samples was observed. These data were unexpected and could have resulted from poor database representation of L. xyli in the commercially available bioinformatic pipeline used, inefficient DNA extraction, or low or inconsistent distribution during material manufacture, among other possible explanations.

Genomic DNA mixtures
Another control included in the interlaboratory study was a set of mixtures of purified microbial genomic DNA. These were included alongside the stool samples in the Mosaic Kit to serve as parallel processing controls and comprised two different mixtures, one equigenomic between taxa (Mix A) and one with ten-fold dilutions of the various taxa (Mix B). These genomic DNA mixtures were validated for genome copy number using ddPCR (droplet digital PCR) and served as 'ground truth' for the MGS measurements. For comparison to the MGS measurements, genome copy number (as measured by ddPCR) was scaled using the assembled genomes of the individual strains (i.e., by rRNA copy number or genome size) to yield ground truth values for comparison to 16S or WGS results, respectively. As with the Firmicutes:Bacteroidetes ratio described above, ratios of individual taxa were used to characterize the DNA mock communities and remove the compositional dependence of the raw relative abundance assignments. Since these analyses included 16S sequencing results, we focused on strains that were unique at the genus level, yielding 6 distinct ratios within each sample. The independent determination of actual DNA concentration (ddPCR) was compared to the results of MGS analyses (Figure 8).
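The scaling step described above can be sketched as follows. This is a minimal illustration, assuming that the expected 16S signal is proportional to genome copies times 16S rRNA gene copies per genome and that the expected WGS signal is proportional to genome copies times genome size; the strain names and all numeric values are hypothetical and are not taken from the Mosaic mixtures.

```python
# Per-genome scaling factor assumed for each sequencing strategy.
SCALING_FIELD = {"16S": "rrna_copies", "WGS": "genome_size_bp"}


def expected_signal(strain, method):
    """Expected sequencing signal (arbitrary units) for one strain."""
    return strain["genome_copies"] * strain[SCALING_FIELD[method]]


def expected_ratio(strain_a, strain_b, method):
    """Ground-truth ratio of strain_a to strain_b for the given method."""
    return expected_signal(strain_a, method) / expected_signal(strain_b, method)


# Hypothetical strains at equal genome copy number, as in an equigenomic mixture.
strain_x = {"genome_copies": 1_000_000, "rrna_copies": 7, "genome_size_bp": 5_000_000}
strain_y = {"genome_copies": 1_000_000, "rrna_copies": 2, "genome_size_bp": 2_500_000}

ratio_16s = expected_ratio(strain_x, strain_y, "16S")
ratio_wgs = expected_ratio(strain_x, strain_y, "WGS")
print(f"expected 16S ratio: {ratio_16s}, expected WGS ratio: {ratio_wgs}")
```

Even with equal genome copies, the two strategies have different expected ratios in this toy case, which is why a separate ground-truth value is needed for 16S and for WGS comparisons.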

While there was some agreement among participating laboratories ( [...]

The design and implementation of this project can be broken into three major areas: (1) reference material selection and production, (2) capturing metadata and MGS raw data, and (3) comparing results between participating laboratories.

Reference Material Production
One of the first decisions was the identification of reference material(s) to include. There are two primary types of materials that have been used for this type of study: (1) biologically derived microbiome samples and (2) mock communities. Both types of materials were included in the current investigation because they are useful in different ways for comparing between diverse analytical workflows.

For biologically derived microbiome reference materials, a natural community (e.g., sludge, soil, fecal material) is collected, homogenized, and aliquoted. Previous interlaboratory studies have used these homogenized real-world materials [31,39]; however, the number of units needed and the associated costs of a large-scale study are often prohibitive. Further, while biologically-derived materials represent the complexity and diversity of real-world samples, they currently lack ground-truth value assignments (e.g., actual taxonomic abundances) due to a lack of unbiased analytical methods (e.g., DNA extraction, PCR amplification) and the inherent ambiguity associated with microbial taxonomy that hinders our ability to define clear measurands (e.g., Escherichia coli vs. Shigella, or the recent reclassification of Lactobacillus into 23 novel genera).
[48] The addition of allochthonous bacteria ("spike-ins") at consistent abundances into biologically-derived materials can provide some ground truth values to facilitate the assessment of MGS measurements.

Nevertheless, these biologically-derived materials remain useful for comparing methods and assessing measurement precision within individual laboratories and across different laboratories. In the current study, five stool samples were selected based on their dissimilarity from one another among a constellation of potential stool donors (Figure 2), with the intention of representing the variability of naturally-occurring samples. Preliminary in-house analysis of individual aliquots demonstrated (Figure 3) that the material collection and preparation resulted in samples with reliable between-aliquot homogeneity, even given the inherently inhomogeneous starting point of multiple donations of human stool.

Mock community reference materials are laboratory-prepared mixtures of defined constituents (typically DNA from individually cultured bacteria; sometimes mixtures of whole cells) at specified amounts. Thus, these materials are useful as 'ground truth' for analysis workflows, allowing quantitative assessment of analytical performance (e.g., accuracy, bias, precision, etc.). However, these mock community materials are inherently non-biomimetic of actual microbiome samples (e.g., feces, soil, etc.), namely due to their low complexity and the absence of a matrix effect, which can limit their utility for assessing analysis workflows.

[...] only 44 sets of raw sequencing data and metadata were submitted (Table 1), limiting the statistical power of the resulting analyses. Nevertheless, the unused units remain currently available from The BioCollective, allowing interested researchers to analyze with their own methods using the same samples that have been characterized and reported on here.
Alongside the raw sequencing data submitted, participating laboratories filled out a metadata questionnaire (available in supplemental file 1) with ~100 discrete questions about the methods employed, most of which allowed selection from drop-down options describing the most common methodological choices. However, it must be noted that even these in-depth options were not sufficient to encompass all experimental possibilities, and many metadata selections represented 'Other' or 'Internal Method' options. And, of course, the number of potential methodologies continues to expand as new techniques are developed or made commercially available. It was also apparent within the submitted metadata that the observed methodological choices were not randomly distributed. There was no effort made in this investigation to encourage exploration of a diverse set of methodologies, and groups tended to cluster around common methods. The resulting metadata reflect the most commonly employed methods during the timeframe of this study (Figure 1). For instance, nearly half of participants analyzing samples by 16S reported using the same DNA extraction kit (there were ~15 other pre-identified options, as well as 'In-house' and 'other' possibilities), and only 2 labs (~4%) used non-Illumina sequencing platforms.

Comparing Results Between Laboratories
To help assess the impact of methodological choices in the context of compositionally-sensitive MGS measurements, we focused here on ratios between phyla (e.g., the Firmicutes:Bacteroidetes ratio: Figure 5) instead of the raw relative abundances [46]. By using this strategy to remove the compositional dependence of MGS results, common statistical tools (e.g., mean, standard deviation, confidence intervals) could be directly applied. However, it must be noted that the Firmicutes:Bacteroidetes ratio only reveals the impact of particular methodological choices on the tested phyla (Firmicutes and Bacteroidetes). Thus, a methodological choice that only impacted Proteobacteria, as well as one that affected Firmicutes and Bacteroidetes similarly, would not be noted herein. Nevertheless, significant variability in the Firmicutes:Bacteroidetes ratio was observed (Figure 5) both between samples (presumably due to real differences between the samples) and between participating laboratories (presumably due to differences in measurement methodology).

When comparing between methodologies, the most basic experimental choice is between 16S and WGS, and this choice had further implications for how subsequent steps were performed (e.g., PCR conditions, library prep, sequencing depth, bioinformatic analysis). Thus, we first compared the Firmicutes:Bacteroidetes ratio between analysis methods (Figure 5). In this case, it turned out that the most basic choice of how to analyze samples had a statistically significant effect (Figure 6). Analysis of each of the stool samples individually (Figure [...] confirmed the significant impact of analysis strategy on observed results. Practically, this raises real concerns about the comparability of results between laboratories whose analyses differ between 16S and WGS analysis.
More generally, researchers should use utmost caution when trying to compare between data sets collected using divergent experimental methods.

Within the data collected for the MSC, the significant effect observed for the choice of analysis strategy had the specific implication of further limiting statistical power (e.g., of the 44 participating labs, 30 reported 16S results and 14 reported WGS results). Nevertheless, the observed effects of other methodological choices could be similarly assessed for 16S (Figure 7) or WGS (Figure SI-4) results. Interestingly, while the statistical power was limited in this study, some methodologies still appeared to have either large effect sizes or large impacts on variability/precision. While it is tempting to draw firm conclusions from the current investigation, caution is warranted due to the limited sample sizes. Instead, it is hoped that this investigation will help guide further investigations.

'Spike-in' organisms
During production of the stool samples, two exogenous, 'spike-in', whole-cell bacterial strains were included, A. fischeri and L. xyli. Both strains are typically absent in human stool. With the addition of 10^8 cells/mL, each organism was expected to comprise approximately 1 % of the total stool relative abundance, providing sufficient signal for identification without significantly affecting the overall sample profile. Unfortunately, while this expectation proved accurate for A. fischeri, L. xyli was not identified in any of the 16S MGS results and was only observed at a very low relative abundance by WGS (Figure SI-7). This absence or low-level detection could be the result of a number of sources, including lack of representation in the databases, bias in the DNA extraction of L. xyli, or the amount of L. xyli added to the samples. However, multiple coauthors were individually able to reliably detect L. xyli using alternate bioinformatic pipelines, so it is likely that its limited detection in this dataset reflects a shortcoming in the reference database used [data not shown and manuscript in preparation]. This explanation is also supported by the observation that for WGS analyses, the variability of the ratio of spike-in relative abundances between samples was somewhat improved among the labs with the deepest sequencing results (Figure SI-2). It is worth noting that all raw fastq data submitted through the Mosaic Standards Challenge have been archived and made publicly available for the exploration of alternate bioinformatic methods.

The inability to reliably detect L. xyli within the framework of this project impacts our ability to accurately and confidently use A. fischeri as well, since observing a constant ratio between the two spike-in organisms is fundamental to trusting their utility (Figure SI-7).
Nevertheless, key considerations were identified for future experimental design and implementation of internal, spike-in controls. First, the strain should normally be absent in the sample, but still identifiable by the analysis/database used. This can be tricky because databases often focus on the organisms commonly encountered in each type of sample, and because the users of bioinformatic pipelines may not have easy access to the underlying reference databases at the time of analysis. Second, spike-in abundance should be sufficiently high that it can withstand potential losses in the processing and still be identified, while not significantly compromising the fraction of sample reads allocated to the organisms native to each sample. This is in turn complicated by the dependence of the observed relative abundance of any spike-in organism on the MGS methods to be employed and their potential for bias with respect to each spike-in organism. And third, the inclusion of additional spike-in organisms (e.g., 3-4 spike-ins total) should be considered when MGS workflows have not been identified and tested a priori. This provides redundancy to accommodate wide ranges of MGS methodologies and biases. In this study, the inclusion of additional organisms could have avoided the problematic absence of L. xyli in the reference database.

DNA Mock Communities
The DNA mixtures provided the ground-truth component in this study. Here, measurement bias was observed as a disagreement between the actual ratios (black bars show the 99 % confidence interval) and observed ratios (red and blue points) in Figure 8. This bias depends on both the particular taxa analyzed and the methods employed (16S vs. WGS is broken out here). Interestingly, even where there was consensus between participating labs (i.e., a narrow boxplot indicating strong consensus), substantial bias was still observed (low accuracy). The consensus between participating labs is particularly apparent in the WGS analysis of the equigenomic DNA mixtures (upper right panel, Figure 8) [...]

A total of 5 donors were selected from a donor pool maintained by TBC. Figure 2 shows a Bray-Curtis PCoA ordination plot of the entire donor pool, including the 5 donors selected, based on their gut microbiome composition. The 5 donors were selected based on the dissimilarities of their microbiome composition (Figure 2).
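For readers unfamiliar with the dissimilarity measure underlying the ordination in Figure 2, the Bray-Curtis calculation can be sketched as below. This is a minimal illustration and not the code used for donor selection: the donor names and profiles are hypothetical, and the PCoA step itself (an eigendecomposition of the resulting distance matrix) is not shown.

```python
from itertools import combinations


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two dicts of taxon -> abundance.

    Returns 0.0 for identical profiles and 1.0 for profiles sharing no taxa.
    """
    taxa = set(profile_a) | set(profile_b)
    difference = sum(abs(profile_a.get(t, 0) - profile_b.get(t, 0)) for t in taxa)
    total = sum(profile_a.get(t, 0) + profile_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both profiles are empty")
    return difference / total


def pairwise_dissimilarity(profiles):
    """Return {(name_a, name_b): dissimilarity} for every pair of profiles."""
    return {
        (a, b): bray_curtis(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical phylum-level relative abundances for three donors.
donors = {
    "donor_1": {"Firmicutes": 0.5, "Bacteroidetes": 0.5},
    "donor_2": {"Firmicutes": 0.5, "Proteobacteria": 0.5},
    "donor_3": {"Actinobacteria": 1.0},
}

distances = pairwise_dissimilarity(donors)
for pair, value in distances.items():
    print(pair, value)
```

Selecting donors that are far apart in such a matrix is one simple way to favor dissimilar microbiome compositions, which is the stated intent of the donor selection.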

Sample Collection and Processing
All stool samples were collected in accordance with TBC's Institutional Review Board protocol and have been de-identified. The donors were provided with collection kits, and samples were returned to TBC via overnight shipping for processing. Upon receipt, the samples were aseptically transferred to a zip-top bag for dispensing. The samples were stored at -80 °C in 30 g aliquots until further processing. Multiple bowel movements were collected and pooled from each donor. Material from each donor was processed individually (to avoid cross-contamination) and inside a biological safety cabinet. Using a Ninja blender, 150 g of fecal material was combined with 150 g to 300 g of dry ice and homogenized into a fine powder. The blender was loosely covered with a sterile lab tissue and placed in a -20 °C freezer overnight to allow the remaining dry ice to sublime. For each sample, before the addition of OMNIgene Stabilizing Solution (OGS), 50 g of neat powder was set aside and stored at -80 °C. Approximately 90 g of stool powder was added to 750 mL of OGS. The solution was covered and left to stir overnight at room temperature. The following morning, 1 mL aliquots were prepared and stored at -80 °C.

Addition of Spike-In Bacteria and Aliquoting of Samples
Spike-in bacteria, Aliivibrio fischeri (formerly known as Vibrio fischeri, Gram negative) and Leifsonia xyli (Gram positive), were grown to an approximate density of 10^8 CFU/mL and 10^9 CFU/mL, respectively. Cell concentration was confirmed via plate count and optical density. The spike-in bacteria were concentrated by centrifugation, resuspended, and added to each stool solution 1 hour prior to aliquoting to ensure thorough homogenization. Working in a biological safety cabinet, the solution was aliquoted using wide-bore pipette tips into 800 to 850 aliquots. Final concentration of stool after addition of the spike-in was 100 mg/mL and final concentration of each spike-in organism was 10^8 CFU/mL. Samples were stored at -80 °C until distribution.

Sample QC
To assess the homogeneity of the stool samples, ten aliquots from each donor pool were subjected to 16S rRNA amplicon sequencing and shotgun metagenomic sequencing. All sample processing, DNA extraction, library preparation, and sequencing steps were conducted at CosmosID (Germantown, MD) using proprietary protocols. For the 16S sequence data, reads were demultiplexed using split_libraries.py with default filtering parameters. 16S rRNA gene sequences were then sorted based on sample ID using the QIIME script extract_seqs_by_sample_id.py. Bacterial operational taxonomic units were selected using the pick_open_reference_otus.py workflow. 16S rRNA taxonomy was defined by ≥ 97 % similarity to reference sequences using the core_diversity_analyses.py script. Alpha diversity, alpha rarefaction curves, and taxonomy assignments were determined using the core_diversity.py workflow. Data were rarefied to 100,000 sequences per sample to minimize the effect of disparate sequence number on the results. Alpha diversity metrics were computed from the average of 100 iterations from the alpha collated results. Microbiome features were quantified from metagenome data using existing [MetaPhlAn2, HUMAnN2, etc.] and in-house pipelines to identify strain-level taxonomic markers for all samples.
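The rarefaction step mentioned above (subsampling every sample to the same number of sequences) can be illustrated with a short Python sketch. It is not the QIIME implementation; the function name, the fixed seed, and the toy counts are assumptions, and the depth of 100 used here stands in for the 100,000 sequences per sample reported in the text.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample a dict of taxon -> read count to exactly `depth` reads.

    Reads are drawn without replacement, so no taxon can exceed its
    original count. Raises ValueError if the sample is too shallow.
    """
    if sum(counts.values()) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Expand counts into one entry per read, then draw a random subset.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


# Hypothetical read counts for one aliquot.
aliquot = {"taxon_a": 700, "taxon_b": 250, "taxon_c": 50}
rarefied = rarefy(aliquot, depth=100)
print(rarefied, sum(rarefied.values()))
```

Repeating the draw with different seeds and averaging a diversity metric over the repeats mirrors the idea of computing alpha diversity from the average of many iterations, although the exact procedure used in the study is defined by the QIIME scripts named above.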

DNA Mixtures
Mixtures of purified genomic DNA from thirteen ATCC-derived strains were prepared in 1X TE buffer at a final concentration of ≈100 ng/µL.