High Accuracy Base Calls in Nanopore Sequencing

Philippe Faucon; Robert Trevino; Parithi Balachandran; Kylie Standage-Beier; Xiao Wang

doi:10.1101/126680

Abstract

Nanopore sequencing has made it possible to sequence long stretches of DNA, enabling the resolution of repeating segments and of paired SNPs separated by long distances. Unfortunately, significant error rates (>15%), introduced through systematic and random noise, inhibit downstream analysis. We propose a novel method that uses unsupervised learning to correct biologically amplified reads before downstream analysis proceeds. We also demonstrate that our method performs comparably to existing techniques without limiting the detection of repeats or the length of the input sequence.