PT  - JOURNAL ARTICLE
AU  - Samuel M. Nicholls
AU  - Joshua C. Quick
AU  - Shuiquan Tang
AU  - Nicholas J. Loman
TI  - Ultra-deep, long-read nanopore sequencing of mock microbial community standards
AID  - 10.1101/487033
DP  - 2018 Dec 04
TA  - bioRxiv
PG  - 487033
4099  - http://biorxiv.org/content/early/2018/12/04/487033.short
4100  - http://biorxiv.org/content/early/2018/12/04/487033.full
AB  - Background: Long sequencing reads are information-rich, aiding de novo assembly and reference mapping, and consequently have great potential for the study of microbial communities. However, the best approaches for analysis of long-read metagenomic data are unknown. Additionally, rigorous evaluation of bioinformatics tools is hindered by a lack of long-read data from validated samples with known composition. Methods: We sequenced two commercially available mock communities containing ten microbial species (ZymoBIOMICS Microbial Community Standards) with Oxford Nanopore GridION and PromethION. Isolates from the same mock community were sequenced individually with Illumina HiSeq. Data: We generated 14 and 16 Gbp from GridION flowcells and 146 and 148 Gbp from PromethION flowcells for the even and log communities, respectively. Read length N50 was 5.3 Kbp and 5.2 Kbp for the even and log communities, respectively. Basecalls and corresponding signal data are made available (4.2 TB in total). Results: Alignment to Illumina-sequenced isolates demonstrated the expected microbial species at anticipated abundances, with the limit of detection for the lowest abundance species below 50 cells (GridION). De novo assembly of metagenomes recovered long contiguous sequences without the need for pre-processing techniques such as binning. Conclusions: We present ultra-deep, long-read nanopore datasets from a well-defined mock community. These datasets will be useful for those developing bioinformatics methods for long-read metagenomics and for the validation and comparison of current laboratory and software pipelines.