Supervised Machine Learning for Population Genetics: A New Paradigm

Trends Genet. 2018 Apr;34(4):301-312. doi: 10.1016/j.tig.2017.12.005. Epub 2018 Jan 10.

Abstract

As population genomic datasets grow in size, researchers are faced with the daunting task of making sense of a flood of information. To keep pace with this explosion of data, computational methodologies for population genetic inference are rapidly being developed to best utilize genomic sequence data. In this review, we discuss a new paradigm that has emerged in computational population genomics: that of supervised machine learning (ML). We review the fundamentals of ML, discuss recent applications of supervised ML to population genetics that outperform competing methods, and describe promising future directions in this area. Ultimately, we argue that supervised ML is an important and underutilized tool that has considerable potential for the world of evolutionary genomics.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Biological Evolution
  • Data Mining / methods*
  • Datasets as Topic
  • Genetics, Population*
  • Genome, Human*
  • Humans
  • Selection, Genetic
  • Supervised Machine Learning*