PT - JOURNAL ARTICLE
AU - Battey, C. J.
AU - Coffing, Gabrielle C.
AU - Kern, Andrew D.
TI - Visualizing Population Structure with Variational Autoencoders
AID - 10.1101/2020.08.12.248278
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.08.12.248278
4099 - http://biorxiv.org/content/early/2020/10/19/2020.08.12.248278.short
4100 - http://biorxiv.org/content/early/2020/10/19/2020.08.12.248278.full
AB - Dimensionality reduction is a common tool for visualization and inference of population structure from genotypes, but popular methods either return too many dimensions for easy plotting (PCA) or fail to preserve global geometry (t-SNE and UMAP). Here we explore the utility of variational autoencoders (VAEs) – generative machine learning models in which a pair of neural networks seek to first compress and then recreate the input data – for visualizing population genetic variation. VAEs incorporate non-linear relationships, allow users to define the dimensionality of the latent space, and in our tests preserve global geometry better than t-SNE and UMAP. Our implementation, which we call popvae, is available as a command-line python program at github.com/kr-colab/popvae. The approach yields latent embeddings that capture subtle aspects of population structure in humans and Anopheles mosquitoes, and can generate artificial genotypes characteristic of a given sample or population. Competing Interest Statement: The authors have declared no competing interest.
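
The compress-then-recreate idea described in the abstract can be illustrated with a minimal, untrained sketch. This is not the popvae implementation: the single-layer linear encoder and decoder, the binary 0/1 genotype encoding, the cross-entropy reconstruction loss, and every name below (`linear`, `init_layer`, `vae_forward`, `params`) are assumptions chosen to keep the example short and dependency-free. It shows only the generic VAE forward pass: an encoder maps a genotype vector to a mean and log-variance per latent dimension, a latent point is sampled, and a decoder maps it back to per-site allele probabilities.

```python
import math
import random


def linear(x, w, b):
    """Affine map; w holds one row of weights per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    w = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def vae_forward(x, params, rng):
    """One untrained forward pass: encode, sample a latent point, decode."""
    # Encoder: compress the genotype vector to a mean and a log-variance
    # for each latent dimension.
    mu = linear(x, *params["enc_mu"])
    log_var = linear(x, *params["enc_log_var"])
    # Reparameterisation: z = mu + sigma * eps, with eps ~ N(0, 1).
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
         for m, lv in zip(mu, log_var)]
    # Decoder: recreate per-site allele probabilities from the latent point.
    x_hat = [sigmoid(v) for v in linear(z, *params["dec"])]
    # Loss = binary cross-entropy reconstruction error + KL divergence
    # between N(mu, sigma^2) and a standard normal prior.
    tiny = 1e-9
    recon = -sum(xi * math.log(p + tiny) + (1 - xi) * math.log(1 - p + tiny)
                 for xi, p in zip(x, x_hat))
    kl = -0.5 * sum(1 + lv - m * m - math.exp(lv)
                    for m, lv in zip(mu, log_var))
    return z, x_hat, recon + kl


rng = random.Random(0)
n_sites, latent_dim = 8, 2  # latent_dim is user-chosen; 2 suits plotting
params = {
    "enc_mu": init_layer(n_sites, latent_dim, rng),
    "enc_log_var": init_layer(n_sites, latent_dim, rng),
    "dec": init_layer(latent_dim, n_sites, rng),
}
genotype = [0, 1, 1, 0, 0, 1, 0, 1]  # made-up example data, not real genotypes
z, x_hat, loss = vae_forward(genotype, params, rng)
```

In a trained model the weights would be fitted by minimising this loss over all samples, and the latent coordinates `z` (or the means `mu`) of each sample would be what gets plotted. Setting `latent_dim` to 2 is what makes the embedding directly plottable, in contrast to PCA's many axes.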