RT Journal Article SR Electronic T1 Phenotype prediction from genome-wide genotyping data: a crowdsourcing experiment JF bioRxiv FD Cold Spring Harbor Laboratory SP 2020.08.25.265900 DO 10.1101/2020.08.25.265900 A1 Olivier Naret A1 David AA Baranger A1 Sharada Prasanna Mohanty A1 Bastian Greshake Tzovaras A1 Marcel Salathé A1 Jacques Fellay A1 with the openSNP and crowdAI community YR 2020 UL http://biorxiv.org/content/early/2020/08/25/2020.08.25.265900.abstract AB Background The increasing statistical power of genome-wide association studies is fostering the development of precision medicine through genomic predictions of complex traits. Nevertheless, it has been shown that the results remain relatively modest. A reason might be the nature of the methods typically used to construct genomic predictions. Recent machine learning techniques have properties that could help to capture the architecture of complex traits better and improve genomic prediction accuracy. Methods We relied on crowd-sourcing to efficiently compare multiple genomic prediction methods. This represents an innovative approach in the genomic field because of the privacy concerns linked to human genetic data. There are two crowd-sourcing elements building our study. First, we constructed a dataset from openSNP (opensnp.org), an open repository where people voluntarily share their genotyping data and phenotypic information in an effort to participate in open science. To leverage this resource we release the ‘openSNP Cohort Maker’, a tool that builds a homogeneous and up-to-date cohort based on the data available on opensnp.org. Second, we organized an open online challenge on the CrowdAI platform (crowdai.org) aiming at predicting height from genome-wide genotyping data. Results The ‘openSNP Height Prediction’ challenge lasted for three months. A total of 138 challengers contributed to 1275 submissions. 
The winner computed a polygenic risk score using the publicly available summary statistics of the GIANT study to achieve the best result (r² = 0.53 versus r² = 0.49 for the second-best). Conclusion We report here the first crowd-sourced challenge on publicly available genome-wide genotyping data. We also deliver the ‘openSNP Cohort Maker’ that will allow people to make use of the data available on opensnp.org. Competing Interest Statement The authors have declared no competing interest.