Summary
Colorectal cancer (CRC) is the second leading cause of cancer death worldwide. In recent years, short-read single-cell RNA sequencing (scRNA-seq) has been instrumental in deciphering tumor cell heterogeneity. However, these studies quantify expression only at the gene level and miss alterations in transcript structure, which arise from alternative end processing or splicing and are frequently observed in cancer. In this study, we integrated short- and long-read scRNA-seq of CRC patient samples to build the first isoform-resolution CRC transcriptomic atlas. First, we identified 394 dysregulated transcript structures in tumor epithelial cells, including 299 resulting from various combinations of multiple splicing events. Second, we characterized genes and isoforms associated with epithelial lineages and subpopulations that exhibit distinct prognoses. Finally, we built an algorithm that integrates novel peptides derived from the predicted ORFs of recurrent tumor-specific transcripts with mass spectrometry data, identifying a panel of recurring neoepitopes that may aid the development of neoantigen-based cancer vaccines.
Competing Interest Statement
The authors have declared no competing interest.