RT Journal Article
SR Electronic
T1 The variant call format provides efficient and robust storage of GWAS summary statistics
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2020.05.29.115824
DO 10.1101/2020.05.29.115824
A1 Lyon, Matthew
A1 Andrews, Shea J
A1 Elsworth, Ben
A1 Gaunt, Tom R
A1 Hemani, Gibran
A1 Marcora, Edoardo
YR 2020
UL http://biorxiv.org/content/early/2020/05/30/2020.05.29.115824.abstract
AB Genome-wide association study (GWAS) summary statistics are a fundamental resource for a variety of research applications [1–6]. Yet despite their widespread utility, no common storage format has been widely adopted, hindering tool development and data sharing, analysis and integration. Existing tabular formats [7,8] often store information about genetic variants and their associations ambiguously or incompletely, and they lack essential metadata, increasing the possibility of errors in data interpretation and post-GWAS analyses. Additionally, data in these formats are typically not indexed, so the whole file must be read to retrieve any record, which is computationally inefficient. To address these issues, we propose an adaptation of the variant call format [9] (GWAS-VCF) and have produced a suite of open-source tools for using this format in downstream analyses. Simulation studies show that GWAS-VCF is 9-46x faster than tabular alternatives when extracting variant(s) by genomic position. Our results demonstrate that GWAS-VCF provides a robust and performant solution for sharing, analysis and integration of GWAS data. We provide open access to over 10,000 complete GWAS summary datasets converted to this format (available from: https://gwas.mrcieu.ac.uk). Competing Interest Statement: TRG receives funding from GlaxoSmithKline and Biogen for unrelated research.