Abstract
Motivation Studying the interaction or co-expression of the proteins or markers in the tumor microenvironment (TME) of cancer subjects can be crucial in the assessment of risks, such as death or recurrence. In the conventional approach, the cells need to be declared positive or negative for a marker based on its intensity. For multiple markers, manual thresholds are required for each marker, which can become cumbersome. The performance of the subsequent analysis relies heavily on this step and thus suffers from subjectivity and lacks robustness.
Results We present a new method where different marker intensities are viewed as dependent random variables, and the mutual information (MI) between them is considered to be a metric of co-expression. Estimation of the joint density, as required in the traditional form of MI, becomes increasingly challenging as the number of markers increases. We consider an alternative formulation of MI which is conceptually similar but has an efficient estimation technique for which we develop a new generalization. With the proposed method, we analyzed a lung cancer dataset finding the co-expression of the markers, HLA-DR and CK to be associated with survival. We also analyzed a triple negative breast cancer dataset finding the co-expression of the immuno-regulatory proteins, PD1, PD-L1, Lag3 and IDO, to be associated with disease recurrence. We demonstrated the robustness of our method through different simulation studies.
Availability The associated R package can be found here, https://github.com/sealx017/MIAMI.
Contact souvik.seal{at}cuanschutz.edu
Supplementary information The Supplementary Material is attached.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Few typos are fixed.