bioRxiv
New Results

Multidimensional Data Organization and Random Access in Large-Scale DNA Storage Systems

Xin Song, Shalin Shah, John Reif
doi: https://doi.org/10.1101/743369
Xin Song
†Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA
‡Department of Biomedical Engineering, Duke University, Durham, NC 27708, USA
§Department of Computer Science, Duke University, Durham, NC 27708, USA
  • For correspondence: xin.song@duke.edu
Shalin Shah
†Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA
§Department of Computer Science, Duke University, Durham, NC 27708, USA
John Reif
†Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA
§Department of Computer Science, Duke University, Durham, NC 27708, USA

Abstract

With impressive density and coding capacity, DNA offers a promising solution for building long-lasting archival data storage systems. In recent implementations, data retrieval such as random access typically relies on a large library of non-interacting PCR primers. While several algorithms automate the primer design process, the capacity and scalability of DNA-based storage systems are still fundamentally limited by the availability of experimentally validated orthogonal primers. In this work, we combine the nested and semi-nested PCR techniques to virtually enforce multidimensional data organization in large DNA storage systems. The strategy effectively pushes the limit of DNA storage capacity and reduces the number of primers needed for efficient random access from a very large address space. Specifically, our design requires k * n unique primers to index n^k data entries, where k specifies the number of dimensions and n indicates the number of data entries stored in each dimension. We strategically leverage forward/reverse primer pairs from the same or different address layers to virtually specify and maintain data retrievals in the form of rows, columns, tables, and blocks with respect to the original storage pool. This architecture enables various random-access patterns that could be tailored to preserve the underlying data structures and relations (e.g., files and folders) within the storage content. With just one or two rounds of PCR, specific data subsets or an individual datum from the large multidimensional storage can be selectively enriched for simple extraction by gel electrophoresis or readout via sequencing.
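
The primer arithmetic in the abstract can be illustrated with a small Python sketch. This is a toy model of the addressing scheme only, not the authors' PCR protocol: the primer labels (`L0_P2` and so on), the function names, and the use of `None` as a wildcard for an unspecified address layer are all assumptions made here for illustration. It shows that k * n primers cover n^k addresses, and that fixing fewer than k layers selects a whole subset (loosely, a table or a row) rather than a single entry.

```python
from itertools import product

def primer_count(n: int, k: int) -> int:
    """Unique primers needed: n per address layer, k layers."""
    return k * n

def address_space(n: int, k: int) -> int:
    """Distinct entries addressable with k layers of n primers each."""
    return n ** k

def primers_for_address(address):
    """Hypothetical primer labels, one per layer, for a full k-tuple address."""
    return [f"L{layer}_P{index}" for layer, index in enumerate(address)]

def select(pool, pattern):
    """Return addresses matching the pattern; None leaves a layer unspecified."""
    return [
        addr for addr in pool
        if all(p is None or p == a for p, a in zip(pattern, addr))
    ]

n, k = 10, 3
pool = list(product(range(n), repeat=k))   # every k-dimensional address

print(primer_count(n, k), "primers index", address_space(n, k), "entries")
print(len(select(pool, (2, None, None))), "entries share the first-layer address 2")
print(len(select(pool, (2, 5, None))), "entries share the first two layers (2, 5)")
print(select(pool, (2, 5, 7)), "uses primers", primers_for_address((2, 5, 7)))
```

With n = 10 and k = 3, the sketch reports 30 primers for 1000 entries, whereas a flat scheme with one dedicated primer per entry would need on the order of 1000.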

Figure 1
Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted August 22, 2019.

Subject Area

  • Synthetic Biology