PT - JOURNAL ARTICLE
AU - Bahar Alipanahi
AU - Leena Salmela
AU - Simon J. Puglisi
AU - Martin Muggli
AU - Christina Boucher
TI - Disentangled Long-Read De Bruijn Graphs via Optical Maps
AID - 10.1101/094235
DP - 2016 Jan 01
TA - bioRxiv
PG - 094235
4099 - http://biorxiv.org/content/early/2016/12/14/094235.short
4100 - http://biorxiv.org/content/early/2016/12/14/094235.full
AB - Pacific Biosciences (PacBio), the main third-generation sequencing technology, can produce scalable, high-throughput, unprecedented sequencing results through long reads with uniform coverage. Although these long reads have been shown to increase the quality of draft genomes in repetitive regions, fundamental computational challenges remain in overcoming their high error rate and assembling them efficiently. In this paper we show that the de Bruijn graph built on the long reads can be efficiently and substantially disentangled using optical mapping data as auxiliary information. Fundamental to our approach is the use of the positional de Bruijn graph and a succinct data structure for constructing and traversing this graph. Our experimental results show that over 97.7% of directed cycles are removed from the resulting positional de Bruijn graph compared with its non-positional counterpart. Our results thus indicate that disentangling the de Bruijn graph using positional information is a promising direction for developing a simple and efficient assembly algorithm for long reads.
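
The abstract's central claim, that positional information removes directed cycles from a de Bruijn graph, can be illustrated with a toy sketch. This is not the authors' implementation: it works on a single error-free sequence rather than on long reads, it takes each position directly from the sequence instead of from an optical-map alignment, and it uses a plain Python set instead of a succinct data structure. The function names and the `bin_size` tolerance parameter are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict


def de_bruijn_edges(seq, k, positional=False, bin_size=1):
    """Return the edge set of a de Bruijn graph built from one sequence.

    Nodes are (k-1)-mers. In positional mode (an assumption of this sketch)
    each node is the pair ((k-1)-mer, position // bin_size), so two copies
    of a repeat merge only if they fall into the same position bin.
    """
    edges = set()
    for i in range(len(seq) - k + 1):
        left, right = seq[i:i + k - 1], seq[i + 1:i + k]
        if positional:
            left = (left, i // bin_size)
            right = (right, (i + 1) // bin_size)
        edges.add((left, right))
    return edges


def has_directed_cycle(edges):
    """Detect a directed cycle with an iterative three-colour DFS."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE
    for start in list(adj):
        if color[start] != WHITE:
            continue
        color[start] = GREY
        stack = [(start, iter(adj[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if color[nxt] == GREY:  # back edge closes a cycle
                    return True
                if color[nxt] == WHITE:
                    color[nxt] = GREY
                    stack.append((nxt, iter(adj[nxt])))
                    break
            else:
                color[node] = BLACK  # all neighbours explored
                stack.pop()
    return False


# "ACG" occurs at offsets 0 and 4, so the plain graph loops back on itself.
seq = "ACGTACGA"
plain = de_bruijn_edges(seq, k=4)
positional = de_bruijn_edges(seq, k=4, positional=True)
coarse = de_bruijn_edges(seq, k=4, positional=True, bin_size=100)
print(has_directed_cycle(plain), has_directed_cycle(positional),
      has_directed_cycle(coarse))
```

In this toy case the repeated 3-mer `ACG` closes a cycle in the plain graph, while the position-tagged graph is a simple path. With a bin wider than the whole sequence the two copies merge again and the cycle returns, which suggests why the accuracy of the positional information matters.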