TY - JOUR
T1 - Splice junction-centric approach to identify translated noncanonical isoforms in the human proteome
JF - bioRxiv
DO - 10.1101/372995
SP - 372995
AU - Edward Lau
AU - Yu Han
AU - Damon R. Williams
AU - Rajani Shrestha
AU - Joseph C. Wu
AU - Maggie P. Y. Lam
Y1 - 2019/01/01
UR - http://biorxiv.org/content/early/2019/04/08/372995.abstract
N2 - RNA sequencing has led to the discovery of many transcript isoforms created by alternative splicing, but the translational status and functional significance of most alternative splicing events remain unknown. Here we applied a splice junction-centric approach to survey the landscape of protein alternative isoform expression in the human proteome. We focused on alternative splice events where pairs of splice junctions corresponding to included and excluded exons with appreciable read counts are translated together into selective protein sequence databases. Using this approach, we constructed tissue-specific FASTA databases from ENCODE RNA sequencing data, then reanalyzed splice junction peptides in existing mass spectrometry datasets across 10 human tissues (heart, lung, liver, pancreas, ovary, testis, colon, prostate, adrenal gland, and esophagus) as well as generated data on human induced pluripotent stem cell directed cardiac differentiation. Our analysis identified 1,108 non-canonical isoforms from human tissues, including 253 novel splice junction peptides in 212 genes that are not documented in the comprehensive Uniprot TrEMBL or Ensembl RefSeq databases. On a proteome scale, non-canonical isoforms differ from canonical sequences preferentially at sequences with heightened protein disorder, suggesting a functional consequence of alternative splicing on the proteome is the regulation of intrinsically disordered regions.
We further observed examples where isoform-specific regions intersect with important cardiac protein phosphorylation sites. Our results reveal previously unidentified protein isoforms and may avail efforts to elucidate the functions of splicing events and expand the pool of observable biomarkers in profiling studies. Abbreviations: A3SS, alternative 3-prime splice site; A5SS, alternative 5-prime splice site; FDR, false discovery rate; IDR, intrinsically disordered regions; iPSC, induced pluripotent stem cells; MXE, mutually exclusive exons; PSI, percent spliced in; PTC, premature termination codon; PTM, post-translational modifications; SE, skipped exon; RI, retained intron; TMT, tandem mass tags
ER -