New Results

Scaling computational genomics to millions of individuals with GPUs

Amaro Taylor-weiner, Francois Aguet, Nicholas Haradhvala, Sager Gosai, Shankara Anand, Jaegil Kim, Kristin Ardlie, Eliezer Van Allen, Gad Getz
doi: https://doi.org/10.1101/470138
Author affiliation (all authors): The Broad Institute
For correspondence: gadgetz@broadinstitute.org

Abstract

Current genomics methods were designed to handle tens to thousands of samples, but will soon need to scale to millions to keep pace with data and hypothesis generation in biomedical science. Moreover, the cost of processing these growing datasets will become prohibitive unless the computational efficiency and scalability of methods improve. Here, we show that recently developed machine-learning libraries (TensorFlow and PyTorch) facilitate the implementation of genomics methods for GPUs and significantly accelerate computations. To demonstrate this, we re-implemented methods for two commonly performed computational genomics tasks: QTL mapping and Bayesian non-negative matrix factorization. Our implementations ran >200 times faster than current CPU-based versions, and these analyses are ~5-10 fold cheaper on GPUs because of the vastly shorter runtimes. We anticipate that the accessibility of these libraries and the improvements in runtime will lead to a transition to GPU-based implementations for a wide range of computational genomics methods.
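
To illustrate why tasks such as QTL mapping suit GPU tensor libraries, the sketch below computes the association between every variant and every phenotype as a single matrix product. It is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a plain Pearson-correlation test without covariates, uses NumPy so that it runs on a CPU, and all function names and the simulated data are hypothetical. The same array expressions could be written with PyTorch or TensorFlow tensors to run on a GPU.

```python
import numpy as np


def standardize(m):
    """Center each row and scale it to unit L2 norm."""
    m = m - m.mean(axis=1, keepdims=True)
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def qtl_correlations(genotypes, phenotypes):
    """Pearson r for every (variant, phenotype) pair via one matrix product.

    genotypes:  array of shape (n_variants, n_samples)
    phenotypes: array of shape (n_phenotypes, n_samples)
    returns:    array of shape (n_variants, n_phenotypes)
    """
    return standardize(genotypes) @ standardize(phenotypes).T


def t_statistics(r, n_samples):
    """Convert correlations to t-statistics with n_samples - 2 degrees of freedom."""
    dof = n_samples - 2
    return r * np.sqrt(dof / (1.0 - r ** 2))


# Simulated toy data (hypothetical): 50 variants, 10 phenotypes, 200 samples.
rng = np.random.default_rng(0)
n_samples = 200
genotypes = rng.integers(0, 3, size=(50, n_samples)).astype(float)
phenotypes = rng.normal(size=(10, n_samples))

# Plant one true association: variant 3 influences phenotype 0.
phenotypes[0] += 1.0 * genotypes[3]

r = qtl_correlations(genotypes, phenotypes)
t = t_statistics(r, n_samples)
top_variant = int(np.argmax(np.abs(t[:, 0])))
print("strongest variant for phenotype 0:", top_variant)
```

Because all variant-phenotype pairs are handled by one dense matrix multiplication rather than a loop over pairs, the work maps directly onto the operations that GPUs execute efficiently.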

Footnotes

  • Addition of an analysis of 1 million single cells and small changes to language.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 09, 2019.