RT Journal Article
SR Electronic
T1 Evaluation on Detection of Structural Variants by Low-Coverage Long-Read Sequencing
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 092544
DO 10.1101/092544
A1 Li Fang
A1 Jiang Hu
A1 Depeng Wang
A1 Kai Wang
YR 2016
UL http://biorxiv.org/content/early/2016/12/17/092544.abstract
AB Structural variants (SVs) in the human genome are implicated in a variety of human diseases. Long-read sequencing (such as that from PacBio) delivers much longer read lengths than short-read sequencing (such as that from Illumina) and may greatly improve SV detection. However, because long-read sequencing is relatively expensive, users often face questions such as what coverage is needed and how to make the best use of aligners and SV callers. Here, we evaluated the SV calling performance of three SV calling algorithms (PBHoney-Tails, PBHoney-Spots and Sniffles) at different PacBio coverages on two personal genomes, NA12878 and HX1. Our results showed that, at 10X coverage, 76% to 84% of deletions and 80% to 92% of insertions in the gold standard set can be detected by PBHoney-Spots. Combining PBHoney-Spots and Sniffles greatly increased sensitivity, especially at lower coverages such as 6X. We further evaluated the Mendelian errors on an Ashkenazi Jewish trio dataset with low-coverage whole-genome PacBio sequencing. In addition, to automate SV calling, we developed a computational pipeline called NextSV, which integrates PBHoney and Sniffles and generates the union (high sensitivity) or intersection (high specificity) call sets. Our results provide useful guidelines for SV identification from low-coverage whole-genome PacBio data, and we expect that NextSV will facilitate the analysis of SVs in long-read sequencing data.
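
The union and intersection call sets mentioned in the abstract can be illustrated with a minimal sketch. This is not the NextSV implementation: the abstract does not say how calls from two callers are matched, so the 50% reciprocal-overlap rule, the `SVCall` type, and all function names below are assumptions made for illustration. The matching rule only suits interval-like calls such as deletions; insertions would need a different criterion, for example breakpoint distance.

```python
from typing import List, NamedTuple


class SVCall(NamedTuple):
    """Hypothetical minimal SV record; not a NextSV data structure."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    svtype: str  # e.g. "DEL"


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls cannot match."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def matches(call: SVCall, others: List[SVCall], min_ro: float = 0.5) -> bool:
    """True if `call` matches any call in `others` (assumed 50% threshold)."""
    return any(reciprocal_overlap(call, o) >= min_ro for o in others)


def intersection(a: List[SVCall], b: List[SVCall], min_ro: float = 0.5) -> List[SVCall]:
    """Calls from `a` supported by `b`: fewer calls, higher specificity."""
    return [c for c in a if matches(c, b, min_ro)]


def union(a: List[SVCall], b: List[SVCall], min_ro: float = 0.5) -> List[SVCall]:
    """All calls from `a` plus calls unique to `b`: higher sensitivity."""
    return list(a) + [c for c in b if not matches(c, a, min_ro)]


# Made-up example calls from two hypothetical callers.
caller_a = [SVCall("chr1", 1000, 2000, "DEL"), SVCall("chr2", 500, 900, "DEL")]
caller_b = [SVCall("chr1", 1100, 2050, "DEL"), SVCall("chr3", 10, 400, "DEL")]

print(len(union(caller_a, caller_b)), "calls in union")
print(len(intersection(caller_a, caller_b)), "calls in intersection")
```

In this sketch the union keeps the coordinates reported by the first caller when two calls match; a real pipeline would need an explicit policy for which breakpoints to report.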