RT Journal Article
SR Electronic
T1 Privacy-preserving generative deep neural networks support clinical data sharing
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 159756
DO 10.1101/159756
A1 Brett K. Beaulieu-Jones
A1 Zhiwei Steven Wu
A1 Chris Williams
A1 Casey S. Greene
YR 2017
UL http://biorxiv.org/content/early/2017/07/05/159756.abstract
AB Though it is widely recognized that data sharing enables faster scientific progress, the sensible need to protect participant privacy hampers this practice in medicine. We train deep neural networks that generate synthetic subjects closely resembling study participants. Using the SPRINT trial as an example, we show that machine-learning models built from simulated participants generalize to the original dataset. We incorporate differential privacy, which offers strong guarantees on the likelihood that a subject could be identified as a member of the trial. Investigators who have compiled a dataset can use our method to provide a freely accessible public version that enables other scientists to perform discovery-oriented analyses. Generated data can be released alongside analytical code to enable fully reproducible workflows, even when privacy is a concern. By addressing data sharing challenges, deep neural networks can facilitate the rigorous and reproducible investigation of clinical datasets. One Sentence Summary: Deep neural networks can generate shareable biomedical data to allow reanalysis while preserving the privacy of study participants.