Abstract
Genome-, transcriptome- and proteome-wide measurements provide valuable insights into how biological systems are regulated. However, even fundamental questions, such as which human proteins exist, where they are expressed and in what quantities, are not fully answered. We therefore generated a systematic, quantitative and deep proteome and transcriptome abundance atlas of 29 paired healthy human tissues from the Human Protein Atlas project, representing human genes by 17,615 transcripts and 13,664 proteins. The analysis revealed that few proteins show truly tissue-specific expression, that mRNA and protein quantities differ vastly within and across tissues, and that protein expression levels are often more stable across tissues than transcript levels. In addition, only ~2% of all exome and ~7% of all mRNA variants could be confidently detected at the protein level, showing that proteogenomics remains challenging, requires rigorous validation using synthetic peptides and needs more sophisticated computational methods. This resource has many potential uses, ranging from the study of gene/protein expression regulation to the evaluation of protein biomarker specificity.
Abbreviations
- aTIS: Alternative translation initiation sites
- BH: Benjamini-Hochberg
- CIA: Coinertia analysis
- CID: Collision-induced dissociation
- ETD: Electron-transfer dissociation
- EThcD: Electron-transfer/Higher-energy collision dissociation
- FDR: False discovery rate
- FPKM: Fragments per kilobase of transcript per million mapped reads
- GPCR: G-protein-coupled receptors
- HCD: Higher-energy collision dissociation
- LC-MS/MS: Liquid chromatography tandem mass spectrometry
- lncRNA: Long non-coding RNA
- MS: Mass spectrometry
- PTR: Protein-to-mRNA
- SAAV: Single amino acid variant
- SNV: Single nucleotide variant
- TF: Transcription factor
- uORF: Upstream open-reading frame