Abstract
The objective of many high-throughput “omics” studies is to obtain a relatively low-dimensional set of observables, or signature, for sample classification purposes (diagnosis, prognosis, stratification). We propose DNetPRO (Discriminant Analysis with Network PROcessing), a supervised signature identification method based on a bottom-up combinatorial approach that exploits the discriminant power of all variable pairs. The algorithm scales easily, allowing efficient computation even for large numbers of observables (10^4 to 10^5). We show applications to real high-throughput genomic datasets in which our method outperforms existing results, or achieves comparable performance with a smaller number of selected variables. Moreover, the linearity of DNetPRO allows a clearer interpretation of the obtained signatures than nonlinear classification models do.
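To make the idea of scoring all variable pairs concrete, the following is a minimal illustrative sketch and not the DNetPRO implementation. It assumes a nearest-centroid classifier (a simple linear rule) evaluated by leave-one-out accuracy as the pairwise discriminant score. The abstract does not specify the classifier or the subsequent network processing step, so the function names `loo_accuracy` and `score_all_pairs` and the toy data are hypothetical.

```python
from itertools import combinations


def loo_accuracy(points, labels):
    """Leave-one-out accuracy of a nearest-centroid rule on 2-D points.

    Assumption: nearest-centroid stands in for whatever discriminant
    the method actually uses on each pair of variables.
    """
    correct = 0
    for i, (p, y) in enumerate(zip(points, labels)):
        # Accumulate per-class sums and counts, skipping the held-out sample.
        sums, counts = {}, {}
        for j, (q, z) in enumerate(zip(points, labels)):
            if j == i:
                continue
            sx, sy = sums.get(z, (0.0, 0.0))
            sums[z] = (sx + q[0], sy + q[1])
            counts[z] = counts.get(z, 0) + 1

        def dist2(c):
            # Squared Euclidean distance from p to the centroid of class c.
            cx, cy = sums[c][0] / counts[c], sums[c][1] / counts[c]
            return (p[0] - cx) ** 2 + (p[1] - cy) ** 2

        predicted = min(sums, key=dist2)
        if predicted == y:
            correct += 1
    return correct / len(points)


def score_all_pairs(samples, labels):
    """Score every pair of variables and return pairs ranked best first."""
    n_vars = len(samples[0])
    scores = {}
    for a, b in combinations(range(n_vars), 2):
        points = [(s[a], s[b]) for s in samples]
        scores[(a, b)] = loo_accuracy(points, labels)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: 6 samples, 3 variables, 2 classes.
samples = [
    [0, 1, 5], [1, 0, 0], [0, 0, 3],  # class 0
    [1, 2, 0], [2, 1, 4], [2, 2, 2],  # class 1
]
labels = [0, 0, 0, 1, 1, 1]

ranked = score_all_pairs(samples, labels)
for pair, score in ranked:
    print(pair, round(score, 3))
```

The number of pairs grows as N(N-1)/2, so for N = 10^4 observables there are roughly 5 x 10^7 independent pair evaluations. Because each pair is scored independently, a loop of this kind parallelizes naturally, although this sketch makes no attempt to do so.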