PT - JOURNAL ARTICLE
AU - Chirag Jain
AU - Daniel Gibney
AU - Sharma V. Thankachan
TI - Co-linear chaining with overlaps and gap costs
AID - 10.1101/2021.02.03.429492
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.02.03.429492
4099 - http://biorxiv.org/content/early/2021/02/03/2021.02.03.429492.short
4100 - http://biorxiv.org/content/early/2021/02/03/2021.02.03.429492.full
AB - Motivation: Co-linear chaining has proven to be a powerful technique for finding approximately optimal alignments and approximating edit distance. It is used as an intermediate step in numerous mapping tools that follow seed-and-extend strategy. Despite this popularity, subquadratic time algorithms for the case where chains support anchor overlaps and gap costs are not currently known. Moreover, a theoretical connection between co-linear chaining cost and edit distance remains unknown. Results: We present algorithms to solve the co-linear chaining problem with anchor overlaps and gap costs in Õ(n) time, where n denotes the count of anchors. We establish the first theoretical connection between co-linear chaining cost and edit distance. Specifically, we prove that for a fixed set of anchors under a carefully designed chaining cost function, the optimal ‘anchored’ edit distance equals the optimal co-linear chaining cost. Finally, we demonstrate experimentally that optimal co-linear chaining cost under the proposed cost function can be computed significantly faster than edit distance, and achieves high correlation with edit distance for closely as well as distantly related sequences. Implementation: https://github.com/at-cg/ChainX Contact: chirag{at}iisc.ac.in, daniel.j.gibney{at}gmail.com, sharma.thankachan{at}ucf.edu Competing Interest Statement: The authors have declared no competing interest.