RT Journal Article
SR Electronic
T1 Text Mining of Disease-lifestyle Associations to Explain Comorbidities in Electronic Health Registries
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 168211
DO 10.1101/168211
A1 Lars Juhl Jensen
YR 2017
UL http://biorxiv.org/content/early/2017/08/10/168211.abstract
AB Mining of electronic health registries can reveal vast numbers of disease correlations (from hereon referred to as comorbidities for simplicity). However, the underlying causes can be hard to identify, in part because health registries usually do not record important lifestyle factors such as diet, substance consumption, and physical activity. To address this challenge, I developed a text-mining approach that uses dictionaries of diseases and lifestyle factors for named entity recognition and subsequently for co-occurrence extraction of disease–lifestyle associations from Medline. I show that this approach is able to extract many correct associations and provide proof-of-concept that these can provide plausible explanations for comorbidities observed in Swedish and Danish health registry data.
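
The dictionary-based named entity recognition and co-occurrence extraction described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the toy dictionaries, the function names (`compile_dictionary`, `tag`, `cooccurrences`), the choice of the sentence as the co-occurrence window, and the use of raw counts instead of a weighted score are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy dictionaries mapping surface forms to canonical names.
# Real dictionaries would be far larger and come from curated resources.
DISEASES = {
    "type 2 diabetes": "type 2 diabetes",
    "t2d": "type 2 diabetes",
    "lung cancer": "lung cancer",
}
LIFESTYLE = {
    "smoking": "smoking",
    "tobacco": "smoking",
    "physical activity": "physical activity",
    "exercise": "physical activity",
}


def compile_dictionary(dictionary):
    """Build one case-insensitive pattern that matches any name in the dictionary."""
    # Longest names first, so a longer name is preferred over a shorter overlapping one.
    names = sorted(dictionary, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def tag(text, pattern, dictionary):
    """Return the set of canonical entities whose names occur in the text."""
    return {dictionary[m.group(1).lower()] for m in pattern.finditer(text)}


def cooccurrences(abstracts):
    """Count sentence-level disease-lifestyle co-occurrences (assumed window)."""
    disease_pattern = compile_dictionary(DISEASES)
    lifestyle_pattern = compile_dictionary(LIFESTYLE)
    counts = Counter()
    for abstract in abstracts:
        # Naive sentence splitter: break after ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            diseases = tag(sentence, disease_pattern, DISEASES)
            factors = tag(sentence, lifestyle_pattern, LIFESTYLE)
            # Every disease-factor pair in the same sentence counts once.
            counts.update(product(diseases, factors))
    return counts
```

Calling `cooccurrences` on a list of abstract strings returns a `Counter` keyed by `(disease, lifestyle factor)` pairs. A production pipeline would typically also need synonym handling, ambiguity filtering, and a scoring scheme that corrects for how often each entity is mentioned overall; none of those are shown here.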