MBG: Minimizer-based sparse de Bruijn Graph construction

M Rautiainen, T Marschall - Bioinformatics, 2021 - academic.oup.com
Motivation
De Bruijn graphs can be constructed efficiently from short reads and have been used for many purposes. Traditionally, the error rates of long-read sequencing technologies have been too high for de Bruijn graph-based methods. Recently, HiFi reads have combined long read lengths with low error rates, which makes it possible to build de Bruijn graphs from HiFi reads.
Results
We have implemented MBG, a tool for building sparse de Bruijn graphs from HiFi reads. MBG outperforms existing tools for building dense de Bruijn graphs and can build a graph from whole human genome HiFi reads at 50× coverage in four hours on a single core. MBG also assembles the bacterial E. coli genome into a single contig in 8 s.
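To illustrate the general idea behind a minimizer-based sparse graph, the following is a minimal sketch and not MBG's implementation. It assumes a simple windowed minimizer scheme with lexicographic ordering; the function names `window_minimizers` and `sparse_edges` and the parameters `k` and `w` are hypothetical and chosen only for this example. Practical tools usually order k-mers by a hash value rather than lexicographically, and the details of how MBG selects minimizers are not stated in this abstract.

```python
def window_minimizers(seq, k, w):
    """Return sorted (position, k-mer) pairs for every k-mer that is the
    smallest k-mer in at least one window of w consecutive k-mers.

    Assumption for this sketch: k-mers are compared lexicographically and
    ties are broken by the leftmost position.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (kmers[i], i))
        selected.add(best)
    return [(i, kmers[i]) for i in sorted(selected)]


def sparse_edges(seq, k, w):
    """Connect consecutive minimizers of one sequence.

    A dense de Bruijn graph would have a node for every k-mer; here only
    the selected minimizers become nodes, and an edge joins each minimizer
    to the next one along the sequence.
    """
    mins = window_minimizers(seq, k, w)
    return [(a[1], b[1]) for a, b in zip(mins, mins[1:])]


if __name__ == "__main__":
    read = "ACGTTGCATG"
    for position, kmer in window_minimizers(read, k=3, w=3):
        print(position, kmer)
    print(sparse_edges(read, k=3, w=3))
```

Because every window of `w` consecutive k-mers contributes at least one selected k-mer, two reads that share a long enough substring select the same minimizers within it, which is what lets a graph built only from minimizers still connect overlapping reads.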
Availability and implementation
Package manager: https://anaconda.org/bioconda/mbg and source code: https://github.com/maickrau/MBG.
Supplementary information
Supplementary data are available at Bioinformatics online.
Oxford University Press