Abstract
Motivation Genome assembly is increasingly performed on long, uncorrected reads. Assembly quality may be degraded due to unfiltered chimeric reads; also, the storage of all read overlaps can take up to terabytes of disk space.
Results We introduce two tools: yacrd, which performs chimera removal and read scrubbing, and fpa, which filters out spurious overlaps. We show that yacrd results in higher-quality assemblies and is one hundred times faster than the best available alternative.
Availability https://github.com/natir/yacrd and https://github.com/natir/fpa
Contact pierre.marijon{at}inria.fr
Supplementary information Supplementary data are available online.
Footnotes
Major revision: we added many datasets (more than 60) to the results, and two figures illustrating how our tools work.
https://github.com/natir/yacrd-and-fpa-upstream-tools-for-lr-genome-assembly