Abstract
Publicly available high-throughput molecular data can enable biomarker identification and evaluation by meta-analysis. However, the underlying distribution of a continuous biomarker, and the potential confounding factors associated with outcome, inevitably vary between cohorts, and this variation is often ignored. The survivALL R package (https://CRAN.R-project.org/package=survivALL) allows researchers to generate visual and numerical comparisons of all possible points-of-separation, enabling quantitative biomarkers to be reliably evaluated within and across datasets, independent of compositional variation. Here, we demonstrate survivALL’s ability to robustly and reproducibly determine an applicable level of gene expression for patient prognostic classification, in datasets of similar and dissimilar compositions. We believe survivALL represents a significant improvement over existing methodologies for stratifying patients and determining quantitative biomarker cut-points in public and novel datasets.