PT - JOURNAL ARTICLE
AU - Moret, Nienke
AU - Clark, Nicholas A.
AU - Hafner, Marc
AU - Wang, Yuan
AU - Lounkine, Eugen
AU - Medvedovic, Mario
AU - Wang, Jinhua
AU - Gray, Nathanael
AU - Jenkins, Jeremy
AU - Sorger, Peter K.
TI - Cheminformatics tools for analyzing and designing optimized small molecule libraries
AID - 10.1101/358978
DP - 2018 Jan 01
TA - bioRxiv
PG - 358978
4099 - http://biorxiv.org/content/early/2018/06/29/358978.short
4100 - http://biorxiv.org/content/early/2018/06/29/358978.full
AB - Libraries of highly annotated small molecules have many uses in chemical genetics, drug discovery and drug repurposing. Many such libraries have become available, but few data-driven approaches exist to compare these libraries and design new ones. In this paper, we describe such an approach that makes use of data on binding selectivity, target coverage and induced cellular phenotypes as well as chemical structure and stage of clinical development. We implement the approach as R software and a Web-accessible tool (http://www.smallmoleculesuite.org) that uses incomplete and often confounded public data in combination with user preferences to score and create libraries. Analysis of six kinase inhibitor libraries using our approach reveals dramatic differences among them, leading us to design a new LSP-OptimalKinase library that outperforms all previous collections in terms of target coverage and compact size. We also assemble a mechanism of action library that optimally covers 1852 targets of the liganded genome. Using our tools, individual research groups and companies can quickly analyze private compound collections and public libraries can be progressively improved using the latest data.