New Results

Learning from Longitudinal Data in Electronic Health Record and Genetic Data to Improve Cardiovascular Event Prediction

Juan Zhao, QiPing Feng, Patrick Wu, Roxana Lupu, Russel A Wilke, Quinn S Wells, Joshua Denny, Wei-Qi Wei
doi: https://doi.org/10.1101/366682
Juan Zhao
Vanderbilt University Medical Center
QiPing Feng
Vanderbilt University Medical Center
Patrick Wu
Vanderbilt University Medical Center
Roxana Lupu
University of South Dakota Sanford School of Medicine
Russel A Wilke
University of South Dakota Sanford School of Medicine
Quinn S Wells
Vanderbilt University Medical Center
Joshua Denny
Vanderbilt University Medical Center
Wei-Qi Wei
Vanderbilt University Medical Center
  • For correspondence: wei-qi.wei@vanderbilt.edu

Abstract

Background: Current approaches to predicting cardiovascular disease (CVD) rely on conventional risk factors and cross-sectional data. In this study, we asked whether: i) machine learning and deep learning models using longitudinal EHR information can improve the prediction of 10-year CVD risk, and ii) incorporating genetic data can add value to the prediction.

Methods: We conducted two experiments. In the first experiment, we modeled longitudinal EHR data with aggregated features and temporal features. We applied logistic regression (LR), random forests (RF), gradient boosting trees (GBT), convolutional neural networks (CNN), and recurrent neural networks with long short-term memory (LSTM) units. In the second experiment, we proposed a late-fusion framework to incorporate genetic features.

Results: Our study cohort included 109,490 individuals (9,824 cases and 99,666 controls) from Vanderbilt University Medical Center (VUMC) de-identified EHRs. The American College of Cardiology/American Heart Association (ACC/AHA) Pooled Cohort Risk Equations had an area under the receiver operating characteristic curve (AUROC) of 0.732 and an area under the precision-recall curve (AUPRC) of 0.187. LSTM, CNN, and GBT with temporal features achieved the best results, with AUROCs of 0.789, 0.790, and 0.791 and AUPRCs of 0.282, 0.280, and 0.285, respectively. The late-fusion approach achieved a significant improvement in prediction performance.

Conclusions: Machine learning and deep learning with longitudinal features improved 10-year CVD risk prediction. Incorporating genetic features further enhanced 10-year CVD prediction performance, underscoring the importance of integrating relevant genetic data whenever available in the context of routine care.
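To make the two reported metrics and the idea of late fusion concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the fusion was performed, so the weighted average in `late_fusion` is an assumed, hypothetical fusion rule, and the function names and toy scores are invented for illustration only. AUROC is computed as the probability that a randomly chosen case is scored above a randomly chosen control, and AUPRC is approximated by average precision.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (case, control) pairs ranked correctly; ties count as 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each case (step-wise AUPRC)."""
    n_cases = sum(labels)
    if n_cases == 0:
        raise ValueError("need at least one case")
    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_cases


def late_fusion(ehr_scores: Sequence[float],
                genetic_scores: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Hypothetical fusion rule: weighted average of two models' risk scores."""
    return [weight * e + (1.0 - weight) * g
            for e, g in zip(ehr_scores, genetic_scores)]


# Toy data, invented for illustration (1 = case, 0 = control); not study data.
labels = [1, 0, 1, 0, 0, 1]
ehr_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7]      # scores from an EHR-only model
genetic_scores = [0.6, 0.1, 0.8, 0.4, 0.3, 0.5]  # scores from a genetics-only model
fused_scores = late_fusion(ehr_scores, genetic_scores)

print("EHR only:", auroc(labels, ehr_scores), average_precision(labels, ehr_scores))
print("Fused:   ", auroc(labels, fused_scores), average_precision(labels, fused_scores))
```

In late fusion, each data source is modeled separately and only the model outputs are combined, so the genetic model can be trained on the subset of individuals with genotype data. Because cases make up roughly 9% of the cohort (9,824 of 109,490), AUPRC is a useful complement to AUROC: its chance-level baseline is about the case prevalence, whereas AUROC's is 0.5.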

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted July 11, 2018.
Subject Area

  • Epidemiology