TY  - JOUR
T1  - SQUID: Transcriptomic Structural Variation Detection from RNA-seq
JF  - bioRxiv
DO  - 10.1101/162776
SP  - 162776
AU  - Cong Ma
AU  - Mingfu Shao
AU  - Carl Kingsford
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/07/12/162776.abstract
N2  - Transcripts are frequently modified by structural variations, which lead to either a fused transcript of two genes (known as a fusion gene) or an insertion of intergenic sequence into a transcript. These modifications are termed transcriptomic structural variants (TSVs), and they can drastically change the downstream translation product. Detecting TSVs, especially in cancer tumor sequencing where they are known to occur frequently, is an important and challenging computational problem. The problem is made even more challenging because often only RNA-seq measurements are available from the sample. We introduce SQUID, a novel algorithm and its implementation, to accurately and comprehensively predict both fusion-gene and non-fusion-gene TSVs from RNA-seq alignments. SQUID takes the unique approach of attempting to reconstruct an underlying genome sequence that best explains the observed RNA-seq reads. By unifying both concordant and discordant read alignments into one model, SQUID achieves high sensitivity with many fewer false positives than other approaches. We detect TSVs on TCGA tumor samples using SQUID, and observe that breast cancer samples are more likely to contain a large number of TSVs than several other cancer types. We further find that non-fusion-gene TSVs are more likely to be intra-chromosomal than fusion-gene TSVs, and that in the intra-chromosomal case their breakpoint separation distance tends to be larger than that of fusion-gene TSVs. We also identify several novel TSVs involving tumor suppressor genes, which may lead to loss of function of the corresponding genes and play a role in tumorigenesis.
ER  - 