TY  - JOUR
T1  - OMSV enables accurate and comprehensive identification of large structural variations from nanochannel-based single-molecule optical maps
JF  - bioRxiv
DO  - 10.1101/143040
SP  - 143040
AU  - Le Li
AU  - Tsz-Piu Kwok
AU  - Alden King-Yung Leung
AU  - Yvonne Y. Y. Lai
AU  - Iris K. Pang
AU  - Grace Tin-Yun Chung
AU  - Angel C. Y. Mak
AU  - Annie Poon
AU  - Catherine Chu
AU  - Menglu Li
AU  - Jacob J. K. Wu
AU  - Ernest T. Lam
AU  - Han Cao
AU  - Chin Lin
AU  - Justin Sibert
AU  - Siu-Ming Yiu
AU  - Ming Xiao
AU  - Kwok-Wai Lo
AU  - Pui-Yan Kwok
AU  - Ting-Fung Chan
AU  - Kevin Y. Yip
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/05/27/143040.abstract
N2  - Human genomes contain structural variations (SVs) that are associated with various phenotypic variations and diseases. SV detection by sequencing is incomplete due to limited read length. Nanochannel-based optical mapping (OM) allows direct observation of SVs up to hundreds of kilo-bases in size on individual DNA molecules, making it a promising alternative technology for identifying large SVs. SV detection from optical maps is non-trivial due to complex types of error present in OM data, and no existing methods can simultaneously handle all these complex errors and the wide spectrum of SV types. Here we present a novel method, OMSV, for accurate and comprehensive identification of SVs from optical maps. OMSV detects both homozygous and heterozygous SVs, SVs of various types and sizes, and SVs with and without creating/destroying restriction sites. In an extensive series of tests based on real and simulated data, OMSV achieved both high sensitivity and specificity, with clear performance gains over the latest existing method. Applying OMSV to a human cell line, we identified hundreds of SVs >2kbp, with 65% of them missed by sequencing-based callers. Independent experimental validations confirmed the high accuracy of these SVs. We also demonstrate how OMSV can incorporate sequencing data to determine precise SV break points and novel sequences in the SVs not contained in the reference. We provide OMSV as open-source software to facilitate systematic studies of large SVs.
ER  -