RT Journal Article
SR Electronic
T1 Improving Deconvolution Methods in Biology through Open Innovation Competitions: An Application to the Connectivity Map
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.01.10.897363
DO 10.1101/2020.01.10.897363
A1 Andrea Blasco
A1 Ted Natoli
A1 Michael G. Endres
A1 Rinat A. Sergeev
A1 Steven Randazzo
A1 Jin H. Paik
A1 N. J. Maximilian Macaluso
A1 Rajiv Narayan
A1 Xiaodong Lu
A1 David Peck
A1 Karim R. Lakhani
A1 Aravind Subramanian
YR 2020
UL http://biorxiv.org/content/early/2020/10/09/2020.01.10.897363.abstract
AB Do machine learning methods improve standard deconvolution techniques for gene expression data? This paper uses a unique new dataset combined with an open innovation competition to evaluate a wide range of gene-expression deconvolution approaches developed by 294 competitors from 20 countries. The objective of the competition was to separate the expression of individual genes from composite measures of gene pairs. Outcomes were evaluated using direct measurements of single genes from the same samples. Results indicate that the winning algorithm based on random forest regression outperformed the other methods in terms of accuracy and reproducibility. More traditional Gaussian-mixture methods performed well and tended to be faster. The best deep learning approach yielded outcomes slightly inferior to the above methods. We anticipate researchers in the field will find the dataset and algorithms developed in this study to be a powerful research tool for benchmarking their deconvolution methods and a useful resource for multiple applications. Competing Interest Statement: The authors have declared no competing interest.