%0 Journal Article
%A Jiao Chen
%A Yingchao Zhao
%A Yanni Sun
%T De novo haplotype reconstruction in viral quasispecies using paired-end read guided path finding
%D 2018
%R 10.1101/254987
%J bioRxiv
%P 254987
%X Motivation: RNA virus populations contain closely related but different viral strains infecting an individual host. Because selection acts on clouds of mutants rather than single sequences, these viruses can escape host immune responses or develop drug resistance. Reconstruction of the viral haplotypes is a fundamental step to characterize the virus population, predict viral phenotypes, and ultimately provide important information for clinical treatment and prevention. Advances in next-generation sequencing technologies open up new opportunities to assemble full-length haplotypes. However, error-prone short reads, high similarity between related strains, and an unknown number of haplotypes pose computational challenges for reference-free haplotype reconstruction. There remains substantial room to improve the performance of existing haplotype assembly tools. Results: In this work, we developed PEHaplo, a de novo haplotype reconstruction tool for viral quasispecies data, which contain a group of related but different viral strains. PEHaplo employs paired-end reads to distinguish highly similar strains. We applied it to both simulated and real quasispecies data, and the results were benchmarked against several recently published haplotype reconstruction tools. The comparison shows that PEHaplo outperforms the benchmarked tools on a comprehensive set of metrics. Availability: The source code and documentation of PEHaplo are available at https://github.com/chjiao/PEHaplo. Contact: yannisun{at}msu.edu
%U https://www.biorxiv.org/content/biorxiv/early/2018/01/28/254987.full.pdf