Abstract
Intratumor heterogeneity (ITH) is associated with tumor progression, relapse, immunoevasion, and drug resistance. Existing algorithms for measuring ITH are each limited to a single molecular level. We proposed a set of algorithms for measuring ITH at the genome (somatic copy number alterations (CNAs) and mutations), mRNA, microRNA (miRNA), long non-coding RNA (lncRNA), protein, and epigenome levels. These algorithms share a common underlying concept: information entropy. By analyzing 33 TCGA cancer types, we demonstrated that these ITH measures had the typical properties of ITH, namely significant correlations with unfavorable prognosis, tumor progression, genomic instability, suppression of antitumor immunity, and drug resistance. Furthermore, we showed that correlations between ITH measures at the same molecular level were stronger than those between measures at different molecular levels. The mRNA ITH showed stronger correlations with the miRNA, lncRNA, and epigenome ITH than with the genome ITH, supporting the regulation of mRNA by miRNA, lncRNA, and DNA methylation. The protein ITH displayed stronger correlations with the transcriptome-level ITH than with the genome-level ITH, supporting the central dogma of molecular biology. Finally, we integrated the seven ITH measures into a single ITH measure, which displayed more prominent properties of ITH than any of the ITH measures at a single molecular level. This analysis of multi-level ITH provides novel insights into tumor biology and has potential value in pan-cancer clinical practice.
Competing Interest Statement
The authors have declared no competing interest.
List of Abbreviations
- CN: copy number
- CNA: copy number alteration
- DDR: DNA damage repair
- DFI: disease-free interval
- DFS: disease-free survival
- DSS: disease-specific survival
- FA: Fanconi anemia
- GDC: Genomic Data Commons
- GI: gastrointestinal cancer
- HRD: homologous recombination deficiency
- ITH: intratumor heterogeneity
- IE: information entropy
- KM: Kaplan–Meier
- lncRNA: long non-coding RNA
- MAF: mutant allele fraction
- miRNA: microRNA
- OS: overall survival
- PFI: progression-free interval
- ssGSEA: single-sample gene-set enrichment analysis
- TCGA: The Cancer Genome Atlas
- TMB: tumor mutation burden
- ACC: adrenocortical carcinoma
- BLCA: bladder urothelial carcinoma
- BRCA: breast invasive carcinoma
- CESC: cervical squamous cell carcinoma and endocervical adenocarcinoma
- CHOL: cholangiocarcinoma
- COAD: colon adenocarcinoma
- DLBC: lymphoid neoplasm diffuse large B-cell lymphoma
- ESCA: esophageal carcinoma
- GBM: glioblastoma multiforme
- HNSC: head and neck squamous cell carcinoma
- KICH: kidney chromophobe
- KIRC: kidney renal clear cell carcinoma
- KIRP: kidney renal papillary cell carcinoma
- LAML: acute myeloid leukemia
- LGG: brain lower grade glioma
- LIHC: liver hepatocellular carcinoma
- LUAD: lung adenocarcinoma
- LUSC: lung squamous cell carcinoma
- MESO: mesothelioma
- OV: ovarian serous cystadenocarcinoma
- PAAD: pancreatic adenocarcinoma
- PCPG: pheochromocytoma and paraganglioma
- PRAD: prostate adenocarcinoma
- READ: rectum adenocarcinoma
- SARC: sarcoma
- SKCM: skin cutaneous melanoma
- STAD: stomach adenocarcinoma
- TGCT: testicular germ cell tumors
- THCA: thyroid carcinoma
- THYM: thymoma
- UCEC: uterine corpus endometrial carcinoma
- UCS: uterine carcinosarcoma
- UVM: uveal melanoma