RT Journal Article
SR Electronic
T1 LRTK: A unified and versatile toolkit for analyzing linked-read sequencing data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.08.10.503458
DO 10.1101/2022.08.10.503458
A1 Yang, Chao
A1 Zhang, Zhenmiao
A1 Liao, Herui
A1 Zhang, Lu
YR 2022
UL http://biorxiv.org/content/early/2022/08/13/2022.08.10.503458.abstract
AB Summary: Linked-read sequencing technologies, which offer reads with both high base quality and long-range DNA connectedness, have shown great success in genomic studies. The mainstream platforms include 10x Genomics linked-read (10x), Single Tube Long Fragment Read (stLFR) and Transposase Enzyme-Linked Long-read Sequencing (TELL-Seq). Existing data analysis pipelines, e.g., Long Ranger, were developed to process sequencing data from particular platforms and so cannot fully utilize the unique characteristics of other platforms; as a result, users have few tools to choose from for downstream analysis. To address these limitations, we present Linked-Read ToolKit (LRTK), a unified and versatile toolkit for processing linked-read sequencing data from different platforms. LRTK provides flexible functions for data simulation, format conversion, data preprocessing, barcode-aware read alignment, variant calling and phasing. It also supports multi-sample batch processing and generates an HTML report with key statistics and plots. We applied LRTK to the linked-read data of NA24385 obtained from all three platforms; the results showed that LRTK improved the structural variation recall rate for 10x linked-reads and increased the phase block N50 for 10x and stLFR linked-reads. Availability: Source code is available at https://github.com/ericcombiolab/LRTK. LRTK and its dependencies can be installed with Anaconda. Contact: ericluzhang{at}hkbu.edu.hk. Supplementary information: Supplementary data are available at Bioinformatics online. Competing Interest Statement: The authors have declared no competing interest.