Abstract
A major challenge in defining clinically applicable, homogeneous molecular tumor subtypes is assigning biological and/or clinical interpretations to etiological (intrinsic) subtypes. The conventional approach involves at least three steps: first, identify subtypes by unsupervised clustering of patient tumors on their molecular (etiological) profiles; second, associate the subtypes with clinical or phenotypic information (covariates) to attach biological meaning to the resulting subtypes; and third, identify clinically relevant biomarkers associated with the subtypes. Here, we report the implementation of a tool, phenotype mapping (phenMap), which combines these three steps to define functional subtypes with associated phenotypic information and molecular signatures. phenMap models meta (unobserved) variables as a function of covariates to expose any underlying clustering structure within the data and to discover associations between subtypes and phenotypes. By analyzing published breast cancer gene expression data, we demonstrate how this tool can identify functional subtypes that improve on existing etiological subtypes.