BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data

Bioinformatics. 2016 Jun 1;32(11):1749-51. doi: 10.1093/bioinformatics/btw044. Epub 2016 Jan 30.

Abstract

Runs of homozygosity (RoHs) are stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance and instead likely denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoHs used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and to real data from the 1000 Genomes Project, we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity.
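
To make the hidden Markov model idea concrete, the following is a minimal illustrative sketch of a two-state HMM decoded with the Viterbi algorithm. It is not the BCFtools/RoH implementation. The function names (`emission_probs`, `viterbi_roh`), the constant per-site `switch_prob`, the simple `error_rate` mixture, and the use of hard genotype calls (0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate) are all assumptions made for illustration. The sketch also assumes every alternate allele frequency lies strictly between 0 and 1 and that `error_rate` is positive, so that no emission probability is zero.

```python
import math

# Hypothetical state labels: HW = non-autozygous (Hardy-Weinberg), AZ = autozygous.
STATES = ("HW", "AZ")


def emission_probs(genotype, alt_freq, error_rate):
    """Return (P(genotype | HW), P(genotype | AZ)) for one site."""
    f = alt_freq
    # Hardy-Weinberg proportions for hom-ref, het, hom-alt.
    hw = [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]
    # In an autozygous region both copies are identical, so no true hets.
    az_ideal = [1 - f, 0.0, f]
    # Assumed error model: with probability error_rate the call looks like HW.
    az = [(1 - error_rate) * a + error_rate * h for a, h in zip(az_ideal, hw)]
    return hw[genotype], az[genotype]


def viterbi_roh(genotypes, alt_freqs, switch_prob=0.01, error_rate=0.01):
    """Label each site 'HW' or 'AZ' with the most likely state path."""
    log_stay = math.log(1 - switch_prob)
    log_switch = math.log(switch_prob)
    score = [math.log(0.5), math.log(0.5)]  # uniform prior over the two states
    backpointers = []
    for g, f in zip(genotypes, alt_freqs):
        emit = emission_probs(g, f, error_rate)
        new_score, pointer = [0.0, 0.0], [0, 0]
        for s in (0, 1):
            stay = score[s] + log_stay
            move = score[1 - s] + log_switch
            pointer[s] = s if stay >= move else 1 - s
            new_score[s] = max(stay, move) + math.log(emit[s])
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final state to recover the full path.
    state = 0 if score[0] >= score[1] else 1
    path = []
    for pointer in reversed(backpointers):
        path.append(STATES[state])
        state = pointer[state]
    path.reverse()
    return path


# Toy input: a long homozygous stretch flanked by heterozygous sites.
genotypes = [1] * 5 + [0] * 30 + [1] * 5
labels = viterbi_roh(genotypes, [0.5] * len(genotypes))
print("".join("A" if s == "AZ" else "-" for s in labels))
```

In this toy model a single homozygous site carries little evidence, because homozygous genotypes are also common outside autozygous regions. Only a sufficiently long run outweighs the cost of switching states, which mirrors the abstract's point that longer RoHs are the ones likely to denote autozygosity.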

Availability and implementation: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools

Contact: vn2@sanger.ac.uk or pd3@sanger.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Exome
  • Genomics
  • Genotype
  • High-Throughput Nucleotide Sequencing*
  • Homozygote
  • Software