Genomic analysis of local variation and recent evolution in Plasmodium vivax

Pearson, Richard D; Amato, Roberto; Auburn, Sarah; Miotto, Olivo; Almagro-Garcia, Jacob; Amaratunga, Chanaki; Suon, Seila; Mao, Sivanna; Noviyanti, Rintis; Trimarsanto, Hidayat; Marfurt, Jutta; Anstey, Nicholas M; William, Timothy; Boni, Maciej F; Dolecek, Christiane; Tran, Hien Tinh; White, Nicholas J; Michon, Pascal; Siba, Peter; Tavul, Livingstone; Harrison, Gabrielle; Barry, Alyssa; Mueller, Ivo; Ferreira, Marcelo U; Karunaweera, Nadira; Randrianarivelojosia, Milijaona; Gao, Qi; Hubbart, Christina; Hart, Lee; Jeffery, Ben; Drury, Eleanor; Mead, Daniel; Kekre, Mihir; Campino, Susana; Manske, Magnus; Cornelius, Victoria J; MacInnis, Bronwyn; Rockett, Kirk A; Miles, Alistair; Rayner, Julian C; Fairhurst, Rick M; Nosten, Francois; Price, Ric N; Kwiatkowski, Dominic P

doi:10.1038/ng.3599

Letter
Published: 27 June 2016

Richard D Pearson^1,2,
Roberto Amato^1,2^na1,
Sarah Auburn³^na1,
Olivo Miotto^1,2,4,
Jacob Almagro-Garcia²,
Chanaki Amaratunga⁵,
Seila Suon⁶,
Sivanna Mao⁷,
Rintis Noviyanti⁸,
Hidayat Trimarsanto⁸,
Jutta Marfurt³,
Nicholas M Anstey³,
Timothy William⁹,
Maciej F Boni¹⁰,
Christiane Dolecek¹⁰,
Hien Tinh Tran¹⁰,
Nicholas J White⁴,
Pascal Michon^11,12,
Peter Siba¹¹,
Livingstone Tavul¹¹,
Gabrielle Harrison^13,14,
Alyssa Barry^13,14,
Ivo Mueller^13,14,
Marcelo U Ferreira¹⁵,
Nadira Karunaweera¹⁶,
Milijaona Randrianarivelojosia¹⁷,
Qi Gao¹⁸,
Christina Hubbart²,
Lee Hart²,
Ben Jeffery²,
Eleanor Drury¹,
Daniel Mead¹,
Mihir Kekre¹,
Susana Campino¹,
Magnus Manske¹,
Victoria J Cornelius^1,2,
Bronwyn MacInnis¹,
Kirk A Rockett^1,2,
Alistair Miles^1,2,
Julian C Rayner¹,
Rick M Fairhurst⁵,
Francois Nosten^19,20,
Ric N Price^3,20 &
Dominic P Kwiatkowski^1,2 (ORCID: orcid.org/0000-0002-5023-0176)

Nature Genetics volume 48, pages 959–964 (2016)

Abstract

The widespread distribution and relapsing nature of Plasmodium vivax infection present major challenges for the elimination of malaria. To characterize the genetic diversity of this parasite in individual infections and across the population, we performed deep genome sequencing of >200 clinical samples collected across the Asia-Pacific region and analyzed data on >300,000 SNPs and nine regions of the genome with large copy number variations. Individual infections showed complex patterns of genetic structure, with variation not only in the number of dominant clones but also in their level of relatedness and inbreeding. At the population level, we observed strong signals of recent evolutionary selection both in known drug resistance genes and at new loci, and these varied markedly between geographical locations. These findings demonstrate a dynamic landscape of local evolutionary adaptation in the parasite population and provide a foundation for genomic surveillance to guide effective strategies for control and elimination of P. vivax.

**Figure 1: Defining the accessible genome.**

**Figure 3: Genetic structure of mixed infections.**

**Figure 4: Parasite population structure.**

**Figure 5: Population-specific signatures of recent positive selection.**

References

Gething, P.W. et al. A long neglected world malaria map: Plasmodium vivax endemicity in 2010. PLoS Negl. Trop. Dis. 6, e1814 (2012).
Price, R.N. et al. Vivax malaria: neglected and not benign. Am. J. Trop. Med. Hyg. 77 (suppl. 6), 79–87 (2007).
Battle, K.E. et al. The global public health significance of Plasmodium vivax. Adv. Parasitol. 80, 1–111 (2012).
Miller, L.H., Mason, S.J., Clyde, D.F. & McGinniss, M.H. The resistance factor to Plasmodium vivax in blacks. The Duffy-blood-group genotype, FyFy. N. Engl. J. Med. 295, 302–304 (1976).
Ménard, D. et al. Plasmodium vivax clinical malaria is commonly observed in Duffy-negative Malagasy people. Proc. Natl. Acad. Sci. USA 107, 5967–5971 (2010).
White, N.J. Determinants of relapse periodicity in Plasmodium vivax malaria. Malar. J. 10, 297 (2011).
Price, R.N. et al. Global extent of chloroquine-resistant Plasmodium vivax: a systematic review and meta-analysis. Lancet Infect. Dis. 14, 982–991 (2014).
Karunaweera, N.D. et al. Extensive microsatellite diversity in the human malaria parasite Plasmodium vivax. Gene 410, 105–112 (2008).
Barry, A.E., Waltmann, A., Koepfli, C., Barnadas, C. & Mueller, I. Uncovering the transmission dynamics of Plasmodium vivax using population genetics. Pathog. Glob. Health 109, 142–152 (2015).
Koepfli, C. et al. Plasmodium vivax diversity and population structure across four continents. PLoS Negl. Trop. Dis. 9, e0003872 (2015).
Carlton, J.M. et al. Comparative genomics of the neglected human malaria parasite Plasmodium vivax. Nature 455, 757–763 (2008).
Dharia, N.V. et al. Whole-genome sequencing and microarray analysis of ex vivo Plasmodium vivax reveal selective pressure on putative drug resistance genes. Proc. Natl. Acad. Sci. USA 107, 20045–20050 (2010).
Hester, J. et al. De novo assembly of a field isolate genome reveals novel Plasmodium vivax erythrocyte invasion genes. PLoS Negl. Trop. Dis. 7, e2569 (2013).
Chan, E.R. et al. Whole genome sequencing of field isolates provides robust characterization of genetic diversity in Plasmodium vivax. PLoS Negl. Trop. Dis. 6, e1811 (2012).
Neafsey, D.E. et al. The malaria parasite Plasmodium vivax exhibits greater genetic diversity than Plasmodium falciparum. Nat. Genet. 44, 1046–1050 (2012).
Bright, A.T. et al. A high resolution case study of a patient with recurrent Plasmodium vivax infections shows that relapses were caused by meiotic siblings. PLoS Negl. Trop. Dis. 8, e2882 (2014).
Winter, D.J. et al. Whole genome sequencing of field isolates reveals extensive genetic diversity in Plasmodium vivax from Colombia. PLoS Negl. Trop. Dis. 9, e0004252 (2015).
Flannery, E.L. et al. Next-generation sequencing of Plasmodium vivax patient samples shows evidence of direct evolution in drug-resistance genes. ACS Infect. Dis. 1, 367–379 (2015).
Auburn, S. et al. Characterization of within-host Plasmodium falciparum diversity using next-generation sequence data. PLoS One 7, e32891 (2012).
Menard, D. et al. Whole genome sequencing of field isolates reveals a common duplication of the Duffy binding protein gene in Malagasy Plasmodium vivax strains. PLoS Negl. Trop. Dis. 7, e2489 (2013).
Howes, R.E. et al. The global distribution of the Duffy blood group. Nat. Commun. 2, 266 (2011).
Suwanarusk, R. et al. Amplification of pvmdr1 associated with multidrug-resistant Plasmodium vivax. J. Infect. Dis. 198, 1558–1564 (2008).
Douglas, N.M. et al. Plasmodium vivax recurrence following falciparum and mixed species malaria: risk factors and effect of antimalarial kinetics. Clin. Infect. Dis. 52, 612–620 (2011).
Imwong, M. et al. The first Plasmodium vivax relapses of life are usually genetically homologous. J. Infect. Dis. 205, 680–683 (2012).
Lin, J.T. et al. Using amplicon deep sequencing to detect genetic signatures of Plasmodium vivax relapse. J. Infect. Dis. 212, 999–1008 (2015).
Manske, M. et al. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing. Nature 487, 375–379 (2012).
Nair, S. et al. Single-cell genomics for dissection of complex malaria infections. Genome Res. 24, 1028–1038 (2014).
Evanno, G., Regnaut, S. & Goudet, J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol. Ecol. 14, 2611–2620 (2005).
Miotto, O. et al. Genetic architecture of artemisinin-resistant Plasmodium falciparum. Nat. Genet. 47, 226–234 (2015).
Korsinczky, M. et al. Sulfadoxine resistance in Plasmodium vivax is associated with a specific amino acid in dihydropteroate synthase at the putative sulfadoxine-binding site. Antimicrob. Agents Chemother. 48, 2214–2222 (2004).
Imwong, M. et al. Novel point mutations in the dihydrofolate reductase gene of Plasmodium vivax: evidence for sequential selection by drug pressure. Antimicrob. Agents Chemother. 47, 1514–1521 (2003).
Alam, M.T. et al. Tracking origins and spread of sulfadoxine-resistant Plasmodium falciparum dhps alleles in Thailand. Antimicrob. Agents Chemother. 55, 155–164 (2011).
Pava, Z. et al. Expression of Plasmodium vivax crt-o is related to parasite stage but not ex vivo chloroquine susceptibility. Antimicrob. Agents Chemother. 60, 361–367 (2015).
Suwanarusk, R. et al. Chloroquine resistant Plasmodium vivax: in vitro characterisation and association with molecular polymorphisms. PLoS One 2, e1089 (2007).
Mu, J. et al. Multiple transporters associated with malaria parasite responses to chloroquine and quinine. Mol. Microbiol. 49, 977–989 (2003).
Raj, D.K. et al. Disruption of a Plasmodium falciparum multidrug resistance-associated protein (PfMRP) alters its fitness and transport of antimalarial drugs and glutathione. J. Biol. Chem. 284, 7687–7696 (2009).
Pagès, J.-M., James, C.E. & Winterhalter, M. The porin and the permeating antibiotic: a selective diffusion barrier in Gram-negative bacteria. Nat. Rev. Microbiol. 6, 893–903 (2008).
Bozdech, Z. et al. The transcriptome of Plasmodium vivax reveals divergence and diversity of transcriptional regulation in malaria parasites. Proc. Natl. Acad. Sci. USA 105, 16290–16295 (2008).
Westenberger, S.J. et al. A systems-based analysis of Plasmodium vivax lifecycle transcription from human to mosquito. PLoS Negl. Trop. Dis. 4, e653 (2010).
Tao, Z.-Y., Xia, H., Cao, J. & Gao, Q. Development and evaluation of a prototype non-woven fabric filter for purification of malaria-infected blood. Malar. J. 10, 251 (2011).
Auburn, S. et al. Effective preparation of Plasmodium vivax field isolates for high-throughput whole genome sequencing. PLoS One 8, e53160 (2013).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
DePristo, M.A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92 (2012).
Logan-Klumpler, F.J. et al. GeneDB--an annotation database for pathogens. Nucleic Acids Res. 40, D98–D108 (2012).
Tachibana, S. et al. Plasmodium cynomolgi genome sequences provide insight into Plasmodium vivax and the monkey malaria clade. Nat. Genet. 44, 1051–1055 (2012).
Miles, A. et al. Genome variation and meiotic recombination in Plasmodium falciparum: insights from deep sequencing of genetic crosses. Preprint at bioRxiv 024182, http://dx.doi.org/10.1101/024182 (2015).
Alexander, D.H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Sabeti, P.C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918 (2007).
Voight, B.F., Kudaravalli, S., Wen, X. & Pritchard, J.K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).

Acknowledgements

We thank the patients and communities that provided samples for this study, and our many colleagues who supported this work in the field. Sequencing, data analysis and project coordination were funded by the Wellcome Trust (098051, 090770/Z/09/Z), the Medical Research Council (G0600718) and the UK Department for International Development (M006212). A.B. and I.M. acknowledge the Victorian State Government Operational Infrastructure Support and Australian Government National Health and Medical Research Council Independent Medical Research Institutes Infrastructure Support Scheme (NHMRC IRIISS). S.A. and R.N.P. are funded by the Wellcome Trust (Senior Fellowship in Clinical Science awarded to R.N.P., 091625). This study was supported in part by the Intramural Research Program of the National Institute of Allergy and Infectious Diseases, National Institutes of Health.

Author information

Roberto Amato and Sarah Auburn: These authors contributed equally to this work.

Authors and Affiliations

Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Richard D Pearson, Roberto Amato, Olivo Miotto, Eleanor Drury, Daniel Mead, Mihir Kekre, Susana Campino, Magnus Manske, Victoria J Cornelius, Bronwyn MacInnis, Kirk A Rockett, Alistair Miles, Julian C Rayner & Dominic P Kwiatkowski
MRC Centre for Genomics and Global Health, Wellcome Trust Centre for Human Genetics, Oxford, UK
Richard D Pearson, Roberto Amato, Olivo Miotto, Jacob Almagro-Garcia, Christina Hubbart, Lee Hart, Ben Jeffery, Victoria J Cornelius, Kirk A Rockett, Alistair Miles & Dominic P Kwiatkowski
Global and Tropical Health Division, Menzies School of Health Research and Charles Darwin University, Darwin, Northern Territory, Australia
Sarah Auburn, Jutta Marfurt, Nicholas M Anstey & Ric N Price
Mahidol-Oxford Tropical Medicine Research Unit, Mahidol University, Bangkok, Thailand
Olivo Miotto & Nicholas J White
National Institute of Allergy and Infectious Diseases, National Institutes of Health, Rockville, Maryland, USA
Chanaki Amaratunga & Rick M Fairhurst
National Centre for Parasitology, Entomology, and Malaria Control, Phnom Penh, Cambodia
Seila Suon
Sampov Meas Referral Hospital, Pursat, Cambodia
Sivanna Mao
Eijkman Institute for Molecular Biology, Jakarta, Indonesia
Rintis Noviyanti & Hidayat Trimarsanto
Infectious Diseases Society Sabah-Menzies School of Health Research Clinical Research Unit and Queen Elizabeth Hospital Clinical Research Centre, Kota Kinabalu, Sabah, Malaysia
Timothy William
Oxford University Clinical Research Unit, Ho Chi Minh City, Vietnam
Maciej F Boni, Christiane Dolecek & Hien Tinh Tran
Papua New Guinea Institute of Medical Research, Madang, Papua New Guinea
Pascal Michon, Peter Siba & Livingstone Tavul
Faculty of Medicine and Health Sciences, Divine Word University, Madang, Papua New Guinea
Pascal Michon
Division of Population Health and Immunity, The Walter and Eliza Hall Institute for Medical Research, Parkville, Victoria, Australia
Gabrielle Harrison, Alyssa Barry & Ivo Mueller
Department of Medical Biology, University of Melbourne, Parkville, Victoria, Australia
Gabrielle Harrison, Alyssa Barry & Ivo Mueller
Department of Parasitology, Institute of Biomedical Sciences, University of São Paulo, São Paulo, Brazil
Marcelo U Ferreira
Department of Parasitology, Faculty of Medicine, University of Colombo, Colombo, Sri Lanka
Nadira Karunaweera
Institut Pasteur de Madagascar, Antananarivo, Madagascar
Milijaona Randrianarivelojosia
Jiangsu Institute of Parasitic Diseases, Key Laboratory of Parasitic Disease Control and Prevention (Ministry of Health), Jiangsu Provincial Key Laboratory of Parasite Molecular Biology, Wuxi, Jiangsu, China
Qi Gao
Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Mae Sot, Thailand
Francois Nosten
Nuffield Department of Medicine, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, UK
Francois Nosten & Ric N Price

Contributions

C.A., S.S., S.M., R.N., H.T., J.M., N.M.A., T.W., M.F.B., C.D., H.T.T., N.J.W., P.M., P.S., L.T., G.H., A.B., I.M., M.U.F., N.K., M.R. and Q.G. carried out field and laboratory work to obtain P. vivax samples for sequencing. C.H., E.D., D.M., M.K., S.C., B.M. and K.A.R. developed and implemented methods for sample processing and sequencing library preparation. R.D.P., L.H., B.J. and M.M. managed data production pipelines. S.A., O.M., V.J.C., B.M., K.A.R., A.M., J.C.R., R.M.F., F.N., R.N.P. and D.P.K. contributed to study design and management. R.D.P., R.A., S.A., O.M., J.A.-G. and D.P.K. performed data analyses. R.D.P., R.A., S.A. and D.P.K. drafted the manuscript, which was reviewed by all authors.

Corresponding author

Correspondence to Dominic P Kwiatkowski.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Features of the core genome and the internal and subtelomeric hypervariable regions.

(a) Several metrics illustrating the genome accessibility properties of core (unshaded) and hypervariable masked (shaded) regions. Black lines show mean mapping quality per variant position on a scale from 40 to 60. The dashed black line shows the 5th percentile of mean mapping quality per 10-kb window (49.16). Blue lines show the mean proportion of missing genotypes per variant position on a scale from 0 to 0.5. The dashed blue line shows the 95th percentile of mean missingness per 10-kb window (0.224). Green lines show the number of variant positions on a scale from 0 to 2,000. Red lines show the mean number of technical replicate discordances per variant position on a scale from 0 to 0.4. All metrics are shown in non-overlapping 10-kb windows. Regions annotated as SubtelomericHypervariable are shaded in red and as InternalHypervariable are shaded in orange. (b) The read mapping properties of Core, InternalHypervariable and SubtelomericHypervariable regions. These analyses are based on a subset of 187 samples with ≥97% genome callable positions. Coverage classifications were defined as follows: no coverage (left set of bars) refers to positions with zero coverage in at least 1/187 samples; low coverage (middle) refers to positions with coverage <5 (uncallable) in at least 1/187 samples; and poor mapping (right) refers to positions where at least 1/187 samples had ≥10% of reads with mapping quality zero (ambiguously aligned reads). The bars are color-coded according to the genomic region as in a.
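The three coverage classifications in panel b can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function name `position_flags`, its parameter names and the input layout (one read depth and one mapping-quality-zero fraction per sample at a single genomic position) are assumptions introduced here. Only the thresholds (zero coverage, coverage <5, ≥10% of reads with mapping quality zero) come from the legend.

```python
def position_flags(depths, mq0_fractions,
                   min_callable_depth=5, max_mq0_fraction=0.10):
    """Flag one genomic position using per-sample values.

    depths        -- read depth at this position, one value per sample
    mq0_fractions -- fraction of reads with mapping quality zero, per sample
    A flag is raised if at least one sample meets the criterion, mirroring
    the "at least 1/187 samples" wording of the legend.
    """
    return {
        # zero coverage in at least one sample
        "no_coverage": any(d == 0 for d in depths),
        # coverage below the callable threshold in at least one sample
        "low_coverage": any(d < min_callable_depth for d in depths),
        # a high share of ambiguously aligned reads in at least one sample
        "poor_mapping": any(f >= max_mq0_fraction for f in mq0_fractions),
    }
```

Under these definitions a position with zero coverage in some sample is necessarily also flagged as low coverage, so the three categories are not mutually exclusive.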

Supplementary Figure 2 Minor allele frequency (MAF) spectrum.

Estimates based on 237,116 SNPs segregating in 148 samples with low levels of missingness from WTH, WKH and PID.

Supplementary Figure 3 Over-representation of rare alleles.

Observed theta, defined as the number of minor allele genotype calls (number of sites × MAF) divided by the accessible genome length, for different MAF bins. Dark green and light green lines show MAF calculated from fractional genotype calls and majority allele genotype calls, respectively.
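A rough Python sketch of this per-bin quantity is given below. It assumes that "number of sites × MAF" within a bin can be computed as the sum of the per-site MAF values falling in that bin; that reading, the function name `theta_by_maf_bin`, and the half-open bin convention are assumptions made for illustration rather than details stated in the legend.

```python
def theta_by_maf_bin(mafs, accessible_length, bin_edges):
    """Return one theta value per MAF bin.

    mafs              -- minor allele frequency of each segregating site
    accessible_length -- accessible genome length in bases
    bin_edges         -- increasing edges; bin i covers (edges[i], edges[i+1]]
    """
    if accessible_length <= 0:
        raise ValueError("accessible_length must be positive")
    totals = [0.0] * (len(bin_edges) - 1)
    for maf in mafs:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] < maf <= bin_edges[i + 1]:
                # summing MAF over sites stands in for (number of sites x MAF)
                totals[i] += maf
                break
    return [t / accessible_length for t in totals]
```

Sites with a MAF of exactly zero, or outside the supplied edges, are ignored by this sketch.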

Supplementary Figure 4 Patterns of linkage disequilibrium.

Genome-wide values for r² were calculated between pairs of SNPs over a range of distances and corrected for the inflation caused by population structure and other confounders as described in the Online Methods. Median linkage disequilibrium decays over short distances: for example, r² falls to <0.1 within 200 bp in western Thailand (green) and western Cambodia (blue) and within 500 bp in Papua Indonesia (red).
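For reference, the uncorrected r² statistic for a single pair of biallelic SNPs can be sketched as below. This shows only the textbook definition on 0/1 allele vectors; it does not attempt the correction for population structure described in the Online Methods, and the function name `r_squared` and the input encoding are assumptions.

```python
def r_squared(a, b):
    """Textbook r^2 between two biallelic SNPs.

    a, b -- equal-length sequences of 0/1 alleles, one entry per sample
    Returns None when either SNP is monomorphic (r^2 is undefined).
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    p_a = sum(a) / n                                # allele 1 frequency at SNP a
    p_b = sum(b) / n                                # allele 1 frequency at SNP b
    p_ab = sum(x * y for x, y in zip(a, b)) / n     # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                            # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom
```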

Supplementary Figure 5 Genetic structure of individual infections.

Each of the 148 samples from western Thailand (WTH), western Cambodia (WKH) and Papua Indonesia (PID) is displayed. The box on the left displays a histogram of nonreference allele frequency (NRAF) across all heterozygous SNPs. The horizontal axis is NRAF on a scale of 0 to 1. The vertical axis is the number of SNPs on a scale of 0 to 500. The box on the right displays heterozygosity in 20-kb bins across the genome. The horizontal axis represents genomic position, with vertical lines separating the 14 chromosomes. The vertical axis shows the proportion of heterozygous SNPs in a given bin on a scale of 0 to 0.03. The legend for each sample gives its geographical origin, average read depth, F_WS, RoH and the inferred number of dominant clones. The line above each of these plots is colored according to the classification in Figure 3a (right). The inferred number of dominant clones is based on these criteria: one dominant clone, F_WS ≥ 0.99; two dominant clones, F_WS < 0.99 and NRAF histogram is bimodal and symmetric; three or more dominant clones, F_WS < 0.99 and NRAF histogram is not bimodal and symmetric. RoH is the proportion of the genome occupied by runs of homozygosity, defined here as the proportion of 100-kb bins for which mean heterozygosity < 0.005. Samples are ordered by RoH. In samples with two dominant clones, the clones are classified as either unrelated (if RoH < 0.1) or related (if RoH > 0.1).
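The classification rules in this legend can be written out as a short Python sketch. The thresholds (F_WS ≥ 0.99, mean heterozygosity < 0.005 per 100-kb bin, RoH of 0.1) are taken from the legend. The function names are hypothetical, and the test for a "bimodal and symmetric" NRAF histogram is not specified in the legend, so it is passed in as a precomputed boolean.

```python
def runs_of_homozygosity(bin_heterozygosity, threshold=0.005):
    """Proportion of 100-kb bins whose mean heterozygosity is below threshold."""
    if not bin_heterozygosity:
        raise ValueError("at least one bin is required")
    low = sum(1 for h in bin_heterozygosity if h < threshold)
    return low / len(bin_heterozygosity)


def classify_infection(fws, nraf_bimodal_symmetric, roh):
    """Apply the legend's criteria for the number of dominant clones.

    fws                    -- F_WS value of the sample
    nraf_bimodal_symmetric -- True if the NRAF histogram is bimodal and
                              symmetric (decided elsewhere; an assumption)
    roh                    -- proportion of the genome in runs of homozygosity
    """
    if fws >= 0.99:
        return "one dominant clone"
    if not nraf_bimodal_symmetric:
        return "three or more dominant clones"
    # The legend does not say how RoH == 0.1 is handled; this sketch
    # treats that boundary case as unrelated.
    if roh > 0.1:
        return "two dominant clones, related"
    return "two dominant clones, unrelated"
```

A sample would typically be processed by computing `runs_of_homozygosity` from its per-bin heterozygosity values and passing the result to `classify_infection`.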

Supplementary Figure 6 ΔK for values of K used in ADMIXTURE analysis of population structure.

The ΔK metric³⁰ evaluates the second-order rate of change of the likelihood function with respect to K and aims to identify the top-level hierarchical structure of the data. Following this metric, we found K = 3 to be the best choice for the number of putative populations.
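As an illustration of the idea, the sketch below computes ΔK using one commonly used formulation: the absolute second difference of the mean log-likelihood across K, divided by the standard deviation of the log-likelihood over replicate runs at K. Whether the study used exactly this variant is not stated here, and the function name `delta_k` and the input layout are assumptions.

```python
from statistics import mean, stdev


def delta_k(loglik_by_k):
    """Compute delta-K for every K that has both neighbours K-1 and K+1.

    loglik_by_k -- dict mapping K to a list of log-likelihoods from
                   replicate runs (at least two runs per K)
    Returns a dict mapping K to its delta-K value.
    """
    result = {}
    for k in sorted(loglik_by_k):
        if k - 1 not in loglik_by_k or k + 1 not in loglik_by_k:
            continue  # the second difference needs both neighbours
        spread = stdev(loglik_by_k[k])
        if spread == 0:
            continue  # delta-K is undefined without run-to-run variation
        second_diff = abs(
            mean(loglik_by_k[k + 1])
            - 2 * mean(loglik_by_k[k])
            + mean(loglik_by_k[k - 1])
        )
        result[k] = second_diff / spread
    return result
```

The K with the largest ΔK is then taken as the top-level number of clusters, for example via `max(scores, key=scores.get)` on the returned dictionary.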

Supplementary Figure 7 Details of genomic regions demonstrating haplotype-based evidence of selection.

Each plot shows one of the regions from Supplementary Table 8 plus 20 kb of flanking sequence 5′ and 3′. The top four tracks in each plot show −log₁₀ (P) values for genome-wide selection scans. The vertical axis shows −log₁₀ (P) on a scale of 0 to 15. The vertical lines represent the boundaries of the regions of selection (red lines in Fig. 5). The lower two tracks in each plot show F_ST scores between different populations. The vertical axis is scaled from 0 to 1.1. Red points represent nonsynonymous SNPs, blue points represent synonymous SNPs and gray points represent noncoding SNPs. The track at the bottom of each plot illustrates the coordinates of all genes within the given region. Gene names are given where these are available (as per the December 2015 version of GeneDB).

Supplementary Figure 8 Distributions of variant filters illustrating thresholds applied in the study.

The horizontal axis shows the percentile for a given variant annotation when ranked from lowest to highest values. The vertical axis shows the mean discordance of technical replicates per variant within each percentile bin. The horizontal line on each plot shows double the mean technical replicate discordance rate per variant across all discovered variable biallelic positions in non-masked regions (0.049). The vertical line(s) on each plot indicates the thresholds used in the study.

Supplementary Figure 9 Neighbor-joining tree using the P01 reference.

This tree was built using the same set of samples as was used in Figure 4, but here we mapped to the P01 reference (www.genedb.org/Homepage/PvivaxP01), rather than the Sal1 reference, before calling SNPs using GATK best practices. The structure of the tree is essentially the same as that in Figure 4, showing that the choice of reference genome makes little difference to the conclusions drawn from analysis of variation in the core genome.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–9, Supplementary Tables 1–8 and Supplementary Note. (PDF 3499 kb)

Supplementary Data 1

Gene-level summaries of variation data. The first sheet gives aggregate metrics across all 148 samples used for population genetic analyses, and the other three sheets show metrics for WTH, WKH and PID, respectively. Summaries are given for both high-quality SNPs (pass) and all discovered SNPs (all). We do not record SNPs or metrics for genes outside the core genome. N/S, nonsynonymous/synonymous ratio; π, nucleotide diversity per base; D, Tajima's D. (XLSX 3864 kb)

Supplementary Data 2

CNV calls. Start and end coordinates, and copy number for all CNV calls longer than 3 kb. (XLSX 54 kb)

About this article

Cite this article

Pearson, R., Amato, R., Auburn, S. et al. Genomic analysis of local variation and recent evolution in Plasmodium vivax. Nat Genet 48, 959–964 (2016). https://doi.org/10.1038/ng.3599

Received: 03 January 2016
Accepted: 27 May 2016
Published: 27 June 2016
Issue Date: August 2016
DOI: https://doi.org/10.1038/ng.3599
