%0 Journal Article
%A Claudio Alberti
%A Tom Paridaens
%A Jan Voges
%A Daniel Naro
%A Junaid J. Ahmad
%A Massimo Ravasi
%A Daniele Renzi
%A Giorgio Zoia
%A Idoia Ochoa
%A Marco Mattavelli
%A Jaime Delgado
%A Mikel Hernaez
%T An introduction to MPEG-G, the new ISO standard for genomic information representation
%D 2018
%R 10.1101/426353
%J bioRxiv
%P 426353
%X The MPEG-G standardization project is the largest coordinated international effort to specify a compressed data format that enables large scale genomic data processing, transport and sharing. It is the first ISO/IEC standard that addresses the problems and limitations of current genomic data formats towards a truly efficient and economical handling of genomic information. It provides the means to implement leading-edge compression technology achieving more than 10x improvement over the BAM format. The standard also provides a set of currently-needed functionalities, such as selective access, application programming interfaces to the compressed data, support of data protection mechanisms, and support for streaming applications. Furthermore, ISO/IEC is also engaged in supporting the maintenance of the standard to guarantee the perenniality of applications using MPEG-G. Finally, interoperability and integration with existing genomic information processing pipelines is enabled by supporting conversion from/to the FASTQ/SAM/BAM file formats. In this paper we review the MPEG-G standard in more detail, as well as the main advantages and functionalities offered by it.
%U https://www.biorxiv.org/content/biorxiv/early/2018/09/27/426353.full.pdf