RT Journal Article
SR Electronic
T1 Accelerating long-read analysis on modern CPUs
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.07.21.453294
DO 10.1101/2021.07.21.453294
A1 Saurabh Kalikar
A1 Chirag Jain
A1 Vasimuddin Md
A1 Sanchit Misra
YR 2021
UL http://biorxiv.org/content/early/2021/07/23/2021.07.21.453294.abstract
AB Long read sequencing is now routinely used at scale for genomics and transcriptomics applications. Mapping of long reads or a draft genome assembly to a reference sequence is often one of the most time consuming steps in these applications. Here, we present techniques to accelerate minimap2, a widely used software for mapping. We present multiple optimizations using SIMD parallelization, efficient cache utilization and a learned index data structure to accelerate its three main computational modules, i.e., seeding, chaining and pairwise sequence alignment. These result in reduction of end-to-end mapping time of minimap2 by up to 3.5× while maintaining identical output. Competing Interest Statement: SK, VM and SM are employees of Intel Corporation.