fastNGSadmix: admixture proportions and principal component analysis of a single NGS sample

Bioinformatics. 2017 Oct 1;33(19):3148-3150. doi: 10.1093/bioinformatics/btx474.

Abstract

Motivation: Estimation of admixture proportions and principal component analysis (PCA) are fundamental tools in population genetics. However, applying these methods to low- or mid-depth sequencing data without taking genotype uncertainty into account can introduce biases.
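To illustrate why genotype uncertainty matters at low depth, the following minimal sketch computes genotype likelihoods at one biallelic site from read counts. It assumes a simple symmetric sequencing-error model with a fixed error rate; the function name `genotype_likelihoods` and the default error rate are illustrative assumptions, not part of fastNGSadmix.

```python
def genotype_likelihoods(n_ref, n_alt, error=0.01):
    """Return P(reads | g) for g = 0, 1, 2 copies of the alternate allele.

    Assumed model (illustrative only): reads are independent and each read
    reports the wrong allele with probability `error`. The binomial
    coefficient is omitted because it is the same for all three genotypes.
    """
    likes = []
    for g in (0, 1, 2):
        # Probability that a single read shows the alternate allele.
        p_alt = (g / 2) * (1 - error) + (1 - g / 2) * error
        likes.append(p_alt ** n_alt * (1 - p_alt) ** n_ref)
    return likes


# Two reads, both carrying the alternate allele.
likes = genotype_likelihoods(n_ref=0, n_alt=2)
print(likes)
```

With only two reads, the heterozygous genotype keeps a likelihood of 0.25 relative to about 0.98 for the homozygous alternate genotype, so calling a single hard genotype would discard real uncertainty.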

Results: Here we present fastNGSadmix, a tool for fast and reliable estimation of admixture proportions and for PCA from next-generation sequencing data of a single individual. The analyses are based on genotype likelihoods of the input sample and a set of predefined reference populations. The method has high accuracy, even at low sequencing depth, and corrects for the biases introduced by small reference populations.
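The general idea of estimating admixture proportions from genotype likelihoods and reference allele frequencies can be sketched with a generic expectation-maximization (EM) loop. This is not the fastNGSadmix implementation: it assumes Hardy-Weinberg proportions given the individual's mixed allele frequency, treats the reference frequencies as known without error, and omits the correction for small reference populations described above. The function name `estimate_admixture` and all parameter names are hypothetical.

```python
def estimate_admixture(gl, freqs, n_iter=200):
    """Estimate admixture proportions for one sample with a generic EM.

    gl    : list of [L0, L1, L2] genotype likelihoods, one entry per site
    freqs : list of per-population alternate-allele frequencies, one entry
            per site (assumed known and fixed in this sketch)
    """
    k = len(freqs[0])
    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(n_iter):
        new_q = [0.0] * k
        for like, f in zip(gl, freqs):
            # Individual allele frequency implied by the current proportions.
            h = sum(qk * fk for qk, fk in zip(q, f))
            h = min(max(h, 1e-6), 1 - 1e-6)  # keep away from 0 and 1
            # Genotype prior under Hardy-Weinberg, then posterior.
            prior = [(1 - h) ** 2, 2 * h * (1 - h), h ** 2]
            post = [l * p for l, p in zip(like, prior)]
            total = sum(post)
            exp_g = (post[1] + 2 * post[2]) / total  # expected alt alleles
            # Attribute expected alt and ref alleles to each population.
            for i in range(k):
                new_q[i] += exp_g * q[i] * f[i] / h
                new_q[i] += (2 - exp_g) * q[i] * (1 - f[i]) / (1 - h)
        s = sum(new_q)
        q = [v / s for v in new_q]
    return q


# Toy input: two reference populations with strongly differentiated sites,
# and likelihoods that favor the genotypes expected under population 0.
toy_freqs = [[0.9, 0.1]] * 10 + [[0.1, 0.9]] * 10
toy_gl = [[1e-6, 1e-3, 1.0]] * 10 + [[1.0, 1e-3, 1e-6]] * 10
print(estimate_admixture(toy_gl, toy_freqs))
```

Because the update works on the full likelihood vector at each site, low-depth sites with flat likelihoods contribute little information instead of contributing a possibly wrong hard call.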

Availability and implementation: The admixture estimation method is implemented in C++ and the PCA method is implemented in R. The code is freely available at http://www.popgen.dk/software/index.php/FastNGSadmix.

Contact: emil.jorsboe@bio.ku.dk.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Genetics, Population / methods
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Principal Component Analysis*
  • Probability
  • Software*