TY  - JOUR
T1  - LINKS: Scaffolding genome assemblies with kilobase-long nanopore reads
JF  - bioRxiv
DO  - 10.1101/016519
SP  - 016519
AU  - René L. Warren
AU  - Benjamin P. Vandervalk
AU  - Steven J.M. Jones
AU  - Inanç Birol
Y1  - 2015/01/01
UR  - http://biorxiv.org/content/early/2015/03/17/016519.abstract
N2  - Motivation: Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the inability of short reads to capture sufficient genomic information to resolve those problematic regions. Established and emerging long read technologies show great promise in this regard, but their current associated higher error rates typically require computational base correction and/or additional bioinformatics preprocessing before they could be of value. We present LINKS, the Long Interval Nucleotide K-mer Scaffolder algorithm, a solution that makes use of the information in error-rich long reads, without the need for read alignment or base correction. We show how the contiguity of an ABySS E. coli K-12 genome assembly could be increased over five-fold by the use of beta-released Oxford Nanopore Ltd. (ONT) long reads and how LINKS leverages long-range information in S. cerevisiae W303 ONT reads to yield an assembly with less than half the errors of competing applications. Re-scaffolding the colossal white spruce assembly draft (PG29, 20 Gbp) and how LINKS scales to larger genomes are also presented. We expect LINKS to have broad utility in harnessing the potential of long reads in connecting high-quality sequences of small and large genome assembly drafts. Availability: http://www.bcgsc.ca/bioinformatics/software/links Contact: rwarren{at}bcgsc.ca
ER  - 