RT Journal Article
SR Electronic
T1 Storing and analyzing a genome on a blockchain
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.03.03.975334
DO 10.1101/2020.03.03.975334
A1 Gamze Gursoy
A1 Charlotte Brannon
A1 Sarah Wagner
A1 Mark Gerstein
YR 2020
UL http://biorxiv.org/content/early/2020/03/04/2020.03.03.975334.abstract
AB The genomic characterization of individuals promises to be immensely useful for medical research. Moreover, sequencing, analysis, and interpretation of patients’ genomes is projected to be a staple of healthcare in the future. A critical barrier to expanding personal genome sequencing is the ability to store genomic data securely and with high integrity. While cloud storage offers solutions to access such data from any place and device, the security, data integrity, and robustness vulnerabilities such as single-point-of-failure losses have not yet been addressed. Here, we developed novel tools for decentralized storage, access, and analysis of genome sequencing data on private blockchain networks. Storing and analyzing large-scale data on a blockchain can be challenging because of the slow transaction speed and limitations on querying data stored on-chain. Hence, current genomic blockchain applications only log links to the data. We overcome this challenge by implementing data compression techniques and nested database indexing. Our tools provide open-source blockchain-based storage and access tools for advanced genomic analyses such as variant calling.