Abstract
Unbiased assays such as shotgun proteomics and RNA-seq provide high-resolution molecular characterization of tumors. These assays measure molecules with highly varied distributions, making interpretation and hypothesis testing challenging. Samples with the most extreme measurements for a molecule can reveal the most interesting biological insights, yet are often excluded from analysis. Furthermore, rare disease subtypes are, by definition, underrepresented in cancer cohorts. To provide a strategy for identifying molecules aberrantly enriched in small sample cohorts, we present BlackSheep, a package for non-parametric description and differential analysis of genome-wide data, available at https://github.com/ruggleslab/blackSheep. BlackSheep complements other differential expression analysis methods, which may be underpowered when analyzing small subgroups within a larger cohort.