Abstract
We developed a statistical method, BIOSEA, that identifies molecules capable of reproducing a desired cellular phenotype by scanning a large compound collection for biological similarity. The method leverages highly incomplete and noisy compound bioactivity signatures from historical high-throughput screening campaigns. Applied in a phenotypic screening workflow, it yielded novel nanomolar inhibitors of cell division that reproduce the mode of action of reference natural products. In a drug discovery setting, our biological hit expansion protocol revealed new inhibitors of the NKCC1 co-transporter for autism spectrum disorders. Furthermore, we demonstrate BIOSEA's ability to predict novel targets for old compounds: we report new activities for the drugs nimedipine, fluspirilene and pimozide that are relevant to compound repurposing and to rationalizing drug side effects. Our results highlight the opportunities of reusing public bioactivity data for prospective drug discovery applications in which the target or mode of action is not known.