Genome graphs and the evolution of genome inference

Erik Garrison²

¹Genomics Institute, CBSE, 501C Engineering 2, University of California Santa Cruz, Santa Cruz, California 95064, USA
²Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom

Corresponding author: benedict{at}soe.ucsc.edu

Abstract

The human reference genome is part of the foundation of modern human biology and a monumental scientific achievement. However, because it excludes a great deal of common human variation, it introduces a pervasive reference bias into the field of human genomics. To reduce this bias, it makes sense to draw on representative collections of human genomes, brought together into reference cohorts. There are a number of techniques to represent and organize data gleaned from these cohorts, many using ideas implicitly or explicitly borrowed from graph-based models. Here, we survey various projects underway to build and apply these graph-based structures—which we collectively refer to as genome graphs—and discuss the improvements in read mapping, variant calling, and haplotype determination that genome graphs are expected to produce.

This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
