PT - JOURNAL ARTICLE
AU - Brett K. Beaulieu-Jones
AU - Zhiwei Steven Wu
AU - Chris Williams
AU - Casey S. Greene
TI - Privacy-preserving generative deep neural networks support clinical data sharing
AID - 10.1101/159756
DP - 2017 Jan 01
TA - bioRxiv
PG - 159756
4099 - http://biorxiv.org/content/early/2017/07/05/159756.short
4100 - http://biorxiv.org/content/early/2017/07/05/159756.full
AB - Though it is widely recognized that data sharing enables faster scientific progress, the sensible need to protect participant privacy hampers this practice in medicine. We train deep neural networks that generate synthetic subjects closely resembling study participants. Using the SPRINT trial as an example, we show that machine-learning models built from simulated participants generalize to the original dataset. We incorporate differential privacy, which offers strong guarantees on the likelihood that a subject could be identified as a member of the trial. Investigators who have compiled a dataset can use our method to provide a freely accessible public version that enables other scientists to perform discovery-oriented analyses. Generated data can be released alongside analytical code to enable fully reproducible workflows, even when privacy is a concern. By addressing data sharing challenges, deep neural networks can facilitate the rigorous and reproducible investigation of clinical datasets. One Sentence Summary: Deep neural networks can generate shareable biomedical data to allow reanalysis while preserving the privacy of study participants.