MinION-based long-read sequencing and assembly extends the Caenorhabditis elegans reference genome

  Terrance P. Snutch1

  1 Michael Smith Laboratories and Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4;
  2 Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4;
  3 UC Santa Cruz Genomics Institute and Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA;
  4 Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z3
  • 5 These authors contributed equally to this work.

  • Corresponding author: snutch{at}msl.ubc.ca
  • Abstract

    Advances in long-read single-molecule sequencing have opened new possibilities for ‘benchtop’ whole-genome sequencing. The Oxford Nanopore Technologies MinION is a portable device that uses nanopore technology to sequence DNA molecules directly. MinION single-molecule long sequence reads are well suited for de novo assembly of complex genomes because they facilitate the construction of highly contiguous physical genome maps, obviating the need for labor-intensive physical genome mapping. Long sequence reads can also be used to delineate complex chromosomal rearrangements, such as those that occur in tumor cells, that can confound analysis using short reads. Here, we assessed the feasibility of using MinION long-read-derived sequences for (1) the de novo assembly of a large complex genome, and (2) the elucidation of complex rearrangements. The genomes of two Caenorhabditis elegans strains, a wild-type strain and a strain containing two complex rearrangements, were sequenced with MinION. Up to 42-fold coverage was obtained from a single flow cell, and the best pooled data assembly produced a highly contiguous wild-type C. elegans genome containing 48 contigs (N50 contig length = 3.99 Mb) covering >99% of the 100,286,401-base reference genome. Further, the MinION-derived genome assembly expanded the C. elegans reference genome by >2 Mb owing to a more accurate determination of repetitive sequence elements, and it also assembled the complete genomes of two co-extracted bacteria. MinION long-read sequence data also facilitated the elucidation of complex rearrangements in a mutagenized strain. The sequence accuracy of the MinION long-read contigs (∼98%) was improved by polishing the final genome assembly with Illumina-derived sequence data, which raised it to 99.8% nucleotide accuracy when compared to the reference assembly.
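The abstract summarizes assembly quality with two standard metrics, N50 contig length and fold coverage. As a reading aid, the following is a minimal sketch of how such metrics are commonly computed. It is not the authors' pipeline. The function names `n50` and `fold_coverage` and the toy contig lengths are assumptions made for illustration; only the reference genome size (100,286,401 bases) and the 42-fold figure are taken from the abstract.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L of the contig at which, walking from the longest
    contig downward, the running total first reaches at least half of the
    summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def fold_coverage(total_sequenced_bases, genome_size):
    """Return average fold coverage: sequenced bases divided by genome size."""
    return total_sequenced_bases / genome_size


# Reference genome size stated in the abstract.
REFERENCE_SIZE = 100_286_401

# Hypothetical contig lengths, for illustration only (not data from the paper).
toy_contigs = [40, 30, 20, 10]

# Total is 100, half is 50; 40 alone falls short, 40 + 30 = 70 reaches it.
print(n50(toy_contigs))  # 30

# Number of sequenced bases that would correspond to 42-fold coverage.
bases_for_42x = 42 * REFERENCE_SIZE
print(fold_coverage(bases_for_42x, REFERENCE_SIZE))  # 42.0
```

Under these definitions, the reported N50 of 3.99 Mb means that contigs of at least 3.99 Mb together account for at least half of the assembled sequence.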

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.221184.117.

    • Freely available online through the Genome Research Open Access option.

    • Received January 30, 2017.
    • Accepted December 19, 2017.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
