New Results

Capacity-approaching DNA storage

Yaniv Erlich, Dina Zielinski
doi: https://doi.org/10.1101/074237
Yaniv Erlich
1 New York Genome Center, New York, NY 10013, USA
2 Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY, USA
3 Center for Computational Biology and Bioinformatics (C2B2), Department of Systems Biology, Columbia University, New York, NY, USA
  • For correspondence: yaniv@cs.columbia.edu
Dina Zielinski
1 New York Genome Center, New York, NY 10013, USA

Abstract

Humanity produces data at exponential rates, creating a growing demand for better storage devices. DNA molecules are an attractive medium for storing digital information because of their durability and high information density. Recent studies have made large strides in developing DNA storage schemes by exploiting the advent of massively parallel synthesis of DNA oligos and the high throughput of sequencing platforms. However, most of these experiments reported small gaps and errors in the retrieved information. Here, we report a strategy to store and retrieve DNA information that is robust and approaches the theoretical maximum of information that can be stored per nucleotide. The success of our strategy lies in the careful adaptation of recent developments in coding theory to the domain-specific constraints of DNA storage. To test our strategy, we stored an entire computer operating system, a movie, a gift card, and other computer files, with a total of 2.14×10^6 bytes, in DNA oligos. We were able to fully retrieve the information without a single error, even with a sequencing throughput on the scale of a single tile of an Illumina sequencing flow cell. To further stress our strategy, we created a deep copy of the data by PCR-amplifying the oligo pool in a total of nine successive reactions, reflecting one complete path of an exponential process to copy the file 218×10^12 times. We perfectly retrieved the original data with only five million reads. Taken together, our approach opens the possibility of highly reliable DNA-based storage that approaches the information capacity of DNA molecules and enables virtually unlimited data retrieval.
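
As a rough illustration of what a per-nucleotide information limit means: with four bases, a nucleotide can carry at most log2(4) = 2 bits. The sketch below uses a naive fixed mapping of bit pairs to bases to show that ceiling and the arithmetic for the 2.14×10^6 bytes mentioned in the abstract. It is not the coding scheme reported here; the base ordering and the names `bytes_to_dna`, `dna_to_bytes`, and `min_nucleotides` are assumptions made for illustration, and a practical scheme that respects the constraints of DNA storage will need more nucleotides than this bound.

```python
import math

# Illustrative only: a naive 2-bits-per-nucleotide mapping, not the paper's method.
# Assumed (arbitrary) ordering: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the sequence length must be a multiple of four."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def min_nucleotides(n_bytes: int, bits_per_nt: float = 2.0) -> int:
    """Lower bound on the nucleotides needed at a given information density."""
    return math.ceil(n_bytes * 8 / bits_per_nt)


if __name__ == "__main__":
    print(bytes_to_dna(b"Hi"))         # CAGACGGC
    print(min_nucleotides(2_140_000))  # 8560000
```

At the 2 bits per nucleotide ceiling, 2.14×10^6 bytes would need at least 8.56×10^6 nucleotides of payload, before any addressing or redundancy is added.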

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted September 09, 2016.
Subject Area

  • Synthetic Biology