TY  - JOUR
T1  - Multi-sample Full-length Transcriptome Analysis of 22 Breast Cancer Clinical Specimens with Long-Read Sequencing
JF  - bioRxiv
DO  - 10.1101/2020.07.15.199851
SP  - 2020.07.15.199851
AU  - Shinichi Namba
AU  - Toshihide Ueno
AU  - Shinya Kojima
AU  - Yosuke Tanaka
AU  - Satoshi Inoue
AU  - Fumishi Kishigami
AU  - Noriko Maeda
AU  - Tomoko Ogawa
AU  - Shoichi Hazama
AU  - Yuichi Shiraishi
AU  - Hiroyuki Mano
AU  - Masahito Kawazu
Y1  - 2020/01/01
UR  - http://biorxiv.org/content/early/2020/07/16/2020.07.15.199851.abstract
N2  - Although transcriptome alteration is considered one of the essential drivers of carcinogenesis, conventional short-read RNAseq technology has prevented researchers from directly exploring full-length transcripts, limiting them to individual splice sites. We developed a pipeline for Multi-Sample long-read Transcriptome Assembly, MuSTA, and showed through simulations that it enables construction of a transcriptome from the transcripts expressed in target samples and more accurate evaluation of transcript usage. We applied it to 22 breast cancer clinical specimens to acquire a cohort-wide full-length transcriptome from long-read RNAseq data. By comparing isoform existence and expression between estrogen receptor-positive and triple-negative subtypes, we obtained a comprehensive set of subtype-specific isoforms and differentially used isoforms, which consisted of both known and unannotated isoforms. We also found that the exon-intron structure of fusion transcripts tends to depend on their genomic regions, and found three-piece fusion transcripts that were transcribed from complex structural rearrangements. For example, a three-piece fusion transcript resulted in aberrant expression of an endogenous retroviral gene, ERVFRD-1, which is normally expressed exclusively in placenta and is thought to protect the fetus from maternal rejection, and whose expression was increased in several TCGA samples with ERVFRD-1 fusions.
Our analyses of real clinical specimens and simulated data provide direct evidence that full-length transcript sequencing in multiple samples can add to our understanding of cancer biology and genomics in general. Competing Interest Statement: The authors have declared no competing interest.
ER  - 