Abstract
Genetic tools are widely used in conservation, but the expansion to rapid, cost-effective genomic-scale data has been slow. Most genomic tools are developed for high-quality DNA sources from lab or medical settings. So far, genetic data from market or field settings have been limited to easily amplified mitochondrial DNA or a few microsatellites. Here we use multiplex PCR with low-quality DNA from feces, hair, and cooked samples to quickly provide multi-locus data from many SNPs. We demonstrate the wide range of potential applications through tools to monitor individual wild tigers and track commercial trade in Caribbean queen conch. One hundred SNPs from degraded tiger samples identified individuals, discerned close relatives, and detected population differentiation. Sixty-two SNPs from conch fritters and field-collected samples identified individuals, tested for close kin, and detected population structure. Our study provides proof of concept for a rapid, simple, cost-effective, and scalable method, a framework that can be applied to other conservation scenarios previously limited by low-quality DNA samples. These approaches provide a critical advance for wildlife monitoring and forensics, open the door to field-ready testing, and will strengthen the use of science in policy decisions and wildlife trade.
Introduction
Stemming the tide of global species decline requires continuous monitoring and nimble, adaptive management to promote species recovery. Effective monitoring requires, minimally, the ability to identify and track presence of a species and ultimately to track specific individuals and to determine their familial relationships. Integrating local data across species range provides the ability to monitor global threats from population range reduction to illegal wildlife trade.
In principle, all of these goals can be achieved via genotyping at a modest number of loci such as microsatellites or single nucleotide polymorphisms (SNPs). Given that endangered species tend to be rare and elusive, the necessary approach must be able to accommodate non-invasive sources of DNA such as feces, shell, feathers, hair, and saliva that yield impure, mixed, and/or extremely small amounts of degraded DNA. Moreover, market samples generated by wildlife trade may be processed, cooked, dried, or mixed with other species, again providing low-quality and often mixed DNA. In contrast, current approaches tend to require large amounts of high-quality, pure DNA from the target species, most readily obtained by immobilizing or sacrificing an animal (Fitak, Naidu, Thompson, & Culver, 2016; Kraus et al., 2015), or demand expensive and generally inefficient enrichment strategies (Chiou & Bergey, 2018; Snyder-Mackler et al., 2016) to enhance the quantity of the target DNA. As a result, these approaches are unsuitable, unaffordable, or too low-throughput in most conservation settings.
The tiger (Panthera tigris) is one of the most endangered carnivores and, despite drastic habitat reduction, the remaining ~5,000 wild tigers inhabit over 14 countries. Their wide distribution makes it critical that locally collected data are comparable across their range. While photographic mark-recapture is commonly used to monitor tigers, capture rates are poor in low-density habitats, and photographs cannot always be matched with confiscated tiger parts. Instead, genotyping of scat or hair, along with rapid forensic testing of confiscated skins or other traded parts, can verify species, individual identity, and source populations.
Although formerly abundant, the queen conch (Strombus gigas) was listed in CITES Appendix II in 1990. Trade was further restricted in 2004 due to declining stocks. Florida is currently the largest market for queen conch seafood products, consuming 82% of international trade. Matching the populations of origin of queen conch in Florida markets with international regulations is crucial to recover this formerly lucrative fishery. The most direct access to imported conch is from the hundreds of restaurants in Florida, necessitating techniques that would allow genotyping from the most abundant menu item, fried conch fritters.
Here we demonstrate that a multiplex PCR approach (Campbell, Harmon, & Narum, 2015) followed by next-generation sequencing satisfies all the requirements necessary for inexpensive, fast, and easy genotyping of low-quality samples. We illustrate the power of this method for these two endangered species in very divergent conservation contexts and real-life settings: genotypes from fecal, shed hair, and saliva samples found on the killed prey from wild Indian tigers and from CITES-regulated Caribbean queen conch imported to the US and sold in fried fritters.
Methods
We designed primers for 192 SNP loci from a pool of 50,000 SNPs discovered through whole genome sequencing of 75 tigers of captive and wild origin from the P. t. tigris, P. t. jacksoni, P. t. altaica, and P. t. sumatrae subspecies (see supplementary materials). The pool consisted of SNPs that were polymorphic in all subspecies (> 10% MAF) or that differentiated subspecies from each other (see supplementary materials). We used the publicly available program Primer3 to design primer pairs (amplicon size 50–90 bp, primer size 17–25 bp, and Tm of 60–61°C). The resulting 192 primer pairs were pooled for use in a single amplification of all target SNPs for each tiger sample. The initial primer set was tested on fecal and tissue samples from captive tigers (Table S1). The test set included fecal and tissue samples from the same individuals, multiple subspecies, samples left to weather for up to two days, and a serially diluted tissue DNA sample. Tiger DNA was quantified (see supplementary materials) and all samples were genotyped in triplicate. SNPs were then tested on a range of non-invasive samples from across India (representative photographs in Fig. 1b, sample list in Table S2). These samples came from multiple locations and differing tiger landscapes and included feces, saliva, and shed hair. Two sympatric carnivore species, the leopard (Panthera pardus) and the Indian wild dog or dhole (Cuon alpinus), were also genotyped.
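The primer design constraints above can be written down as a Primer3 input record. The following is a minimal sketch, not the authors' actual configuration: only the amplicon size, primer size, and Tm windows come from the text, while the function name `primer3_record`, the choice of 1-based coordinates, and the toy template are illustrative assumptions. All other Primer3 settings are left at the program's defaults.

```python
def primer3_record(seq_id, template, snp_pos):
    """Build one Primer3 Boulder-IO record that targets a single SNP.

    snp_pos is the 1-based position of the SNP within `template`
    (1-based indexing is an assumption made explicit via
    PRIMER_FIRST_BASE_INDEX below).
    """
    tags = [
        ("SEQUENCE_ID", seq_id),
        ("SEQUENCE_TEMPLATE", template),
        # "<start>,<length>": the primer pair must flank this interval.
        ("SEQUENCE_TARGET", f"{snp_pos},1"),
        ("PRIMER_FIRST_BASE_INDEX", 1),
        ("PRIMER_PRODUCT_SIZE_RANGE", "50-90"),  # amplicon size from the text
        ("PRIMER_MIN_SIZE", 17),                 # primer size from the text
        ("PRIMER_MAX_SIZE", 25),
        ("PRIMER_MIN_TM", 60.0),                 # Tm window from the text
        ("PRIMER_MAX_TM", 61.0),
    ]
    lines = [f"{key}={value}" for key, value in tags]
    lines.append("=")  # a lone "=" terminates a Boulder-IO record
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy 120 bp template with a hypothetical SNP at position 60.
    print(primer3_record("toy_snp_1", "ACGT" * 30, 60), end="")
```

One record per SNP can be concatenated into a single input file for `primer3_core`; checking the pooled primers for cross-reactivity is a separate step not covered here.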
The tiger example builds on a large number of SNPs ascertained from multiple fully sequenced genomes. Often such resources are not available for endangered species, although this is changing rapidly. For queen conch, we sought to identify SNPs from the transcriptomes of 83 tissue samples from six populations in the Caribbean including Florida, collected from fishing operations or non-lethally by research permit (see Table S3). Following methods paralleling those used for tigers, we then genotyped at 192 SNPs an additional 237 conch (from 14 populations) as well as 42 putative conch samples isolated from 15 conch fritters purchased at several Miami restaurants (see supplementary materials).
Results
SNP genotyping for wild tigers
Of the 192 SNPs, 126 produced the most consistent results (see supplementary materials for details, Fig. S1, Table S4), and this 126-SNP panel had a high overall genotyping success rate (Fig. 2). An average of 95 SNPs (range: 4–114) were successfully typed across all samples. When tiger DNA was > 0.01 ng/µl (irrespective of sample type), an average of 105 SNPs (range: 48–114) were typed. Twelve SNPs did not amplify for over 50% of samples and were excluded from subsequent analysis. As expected, the dhole sample had very poor amplification success (mean across replicates < 10 SNPs). Leopards had very high amplification success (mean: 110 SNPs), but the SNPs were monomorphic and identical across two individuals (mean: 0.03% mismatches). All tiger samples, in contrast, had distinct genotypes and could be readily distinguished from the leopard samples.
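The two summaries used above, the number of SNPs typed per sample and the exclusion of SNPs that fail in more than 50% of samples, can be sketched as follows. This is an illustrative sketch only: the data layout (a dictionary of per-sample genotype lists with `None` for a failed call), the function names, and the toy genotypes are assumptions, not the pipeline described in the supplementary materials.

```python
def snps_typed_per_sample(calls):
    """Count successfully typed SNPs for each sample.

    calls: dict mapping sample name -> list of genotype calls,
           one entry per SNP, with None meaning "no call".
    """
    return {sample: sum(g is not None for g in gts)
            for sample, gts in calls.items()}


def failing_snps(calls, max_missing=0.5):
    """Return indices of SNPs with no call in more than `max_missing`
    (a fraction) of the samples."""
    genotype_lists = list(calls.values())
    n_samples = len(genotype_lists)
    n_snps = len(genotype_lists[0])
    failing = []
    for j in range(n_snps):
        missing = sum(gts[j] is None for gts in genotype_lists)
        if missing / n_samples > max_missing:
            failing.append(j)
    return failing


if __name__ == "__main__":
    # Hypothetical toy data: three samples, three SNPs.
    toy = {
        "scat_1": ["AA", None, "AG"],
        "hair_1": ["AA", None, None],
        "saliva_1": [None, None, "GG"],
    }
    print(snps_typed_per_sample(toy))
    print(failing_snps(toy))
```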
Replications of ‘within-sample’ and ‘within individual-between tissue types’ provided important insights on the precision and accuracy of SNP identification and its efficacy for non-invasive sampling. The proportion of matches between tissue types for the same individual (n=5) was very high (precision, mean 0.97; range 0.91–0.99), showing that our approach resulted in nearly identical genotypes for different sample types (feces, saliva, hair). Sample replicates (within-sample comparisons) were also highly concordant and had a high proportion of matches (precision, mean: 0.957, range: 0.757–1.0). Among these, the 77 samples genotyped at 60% or more of SNPs had even higher concordance across replicates (mean proportion of matches: 0.969, range: 0.86–1.0), with only 5 samples showing values less than 0.95. These results are in sharp contrast to the high error rates observed for microsatellite genotypes from other non-invasive sampling techniques (Coates et al., 2011). The probability of misidentifying two different individuals as the same is vanishingly low (pID = 1.6E–22 assuming a population of siblings).
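Both quantities reported above can be illustrated with a short sketch. The concordance function computes the proportion of matching calls between two replicates over loci typed in both. The second function uses the widely used sibling form of the probability of identity, P = 0.25 + 0.5 Σp² + 0.5 (Σp²)² − 0.25 Σp⁴ per locus, multiplied across loci under an assumption of independent loci. Whether the reported pID was computed in exactly this way is an assumption; the function names and toy inputs are hypothetical.

```python
def proportion_matching(rep_a, rep_b):
    """Proportion of identical genotype calls between two replicates,
    restricted to loci typed (not None) in both."""
    pairs = [(a, b) for a, b in zip(rep_a, rep_b)
             if a is not None and b is not None]
    if not pairs:
        return float("nan")
    return sum(a == b for a, b in pairs) / len(pairs)


def pid_sibs_locus(allele_freqs):
    """Sibling probability of identity at one locus, given the
    frequencies of all alleles at that locus (must sum to 1)."""
    s2 = sum(p ** 2 for p in allele_freqs)
    s4 = sum(p ** 4 for p in allele_freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def pid_sibs(loci_allele_freqs):
    """Multi-locus sibling probability of identity, assuming the loci
    are independent (unlinked)."""
    product = 1.0
    for freqs in loci_allele_freqs:
        product *= pid_sibs_locus(freqs)
    return product


if __name__ == "__main__":
    # Hypothetical replicate calls for one sample.
    print(proportion_matching(["AA", "AG", None], ["AA", "GG", "AA"]))
    # Hypothetical panel: 100 biallelic SNPs, each with MAF 0.3.
    print(pid_sibs([[0.7, 0.3]] * 100))
```

Because the per-locus value is always below 1 for a polymorphic SNP, the product shrinks geometrically with panel size, which is why roughly a hundred SNPs are enough to separate even siblings.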
Observed relatedness values (Fig. 3) and clustering of samples (Fig. S2) further highlight our ability to detect biological patterns. Our data had two pairs of samples with known relationships (zoo samples), while the remaining samples came from different populations in the wild. The known parent-offspring and sibling pairs had relatedness values close to the expected value of 0.5 (Fig. 3). Observed pair-wise relatedness was higher within than between known genetic clusters or populations (see Natesh et al., 2017). As expected, relatedness between individuals from a small, isolated population (NW, Fig. 3) was high. Even with the few individuals in the pilot dataset presented here, wild individuals fell into three genetic clusters/populations (Fig. S2), similar to previous data (Natesh et al., 2017). For small isolated populations (such as NW), future genotyping could help identify whether individuals are also inbred, and whether such inbreeding has fitness consequences, through estimates of individual reproductive success. Should genetic supplementation of this population be required, SNP data from different tiger landscapes could help identify possible sources.
SNP genotyping for queen conch
Our second example illustrates that this multiplex PCR approach can be applied even if background genetic information is not as complete as for tiger, and again highlights its use with difficult samples: in this case, fried conch fritters. We identified 480,962 SNPs from the sequenced transcriptomes. Possibly because we based our primer design on a transcriptome rather than a genome, our SNP success was lower for conch than for tigers. Approximately half of the 192 conch primer pairs failed to provide data. For the best 62 SNPs, success was about the same as for tigers (average 86%, Fig. S3). Replicate samples shared 99%–100% of their alleles, but we also found several genetically identical samples among small processed conch pieces collected from fishing boats. As with genetic re-sampling of tigers in the wild, discovery of repeated individuals in a sample can inform overall fishery numbers. There were no obvious close relatives among the samples, a result expected for low sample sizes and very large populations (Fig. S3). The data also provide basic information important for the design of marine protected areas. Outlying islands (Aruba, St. Eustatius) were genetically differentiated from the central Caribbean and Florida (FST=0.037, 0.048 respectively, Table S5). Samples from the same island group or the same coast (e.g. Florida, Bahamas) were not differentiated. These patterns parallel a recent survey of queen conch with microsatellites (Truelove et al., 2017).
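To make the FST values above concrete, the sketch below computes a basic two-population FST for biallelic SNPs as (HT − HS)/HT, combined across loci as a ratio of sums. The text does not state which estimator was used, so this particular form, the absence of any sample-size correction, the function name, and the toy frequencies are all assumptions for illustration.

```python
def fst_two_pops(freqs_pop1, freqs_pop2):
    """Basic FST between two populations from per-SNP allele
    frequencies (frequency of one chosen allele at each biallelic SNP).

    Uses FST = (HT - HS) / HT with sums taken across loci before
    dividing; equal population weights, no sample-size correction.
    """
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        p_bar = (p1 + p2) / 2
        h_total = 2 * p_bar * (1 - p_bar)                       # HT
        h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # HS
        numerator += h_total - h_within
        denominator += h_total
    if denominator == 0:
        return float("nan")  # every SNP is fixed for the same allele
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical allele frequencies at four SNPs in two populations.
    island = [0.10, 0.45, 0.80, 0.30]
    mainland = [0.20, 0.40, 0.70, 0.35]
    print(fst_two_pops(island, mainland))
```

Small frequency differences such as those in the toy data give FST values of a few percent, the same order as the values reported for the outlying islands.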
While the conch isolated from deep-fried fritters showed lower success rates than conch sampled by fishermen or from biopsies, the success rates were high enough to allow individual identification and initial population comparison. Twenty-three of the fritter conch did not amplify with any primer, or produced data for fewer than 18 SNPs, and we suspect they were not queen conch. A further 8 fritter conch showed poor SNP amplification (average 41%) but could be confirmed as queen conch samples. The most successful 17 conch from fritters showed success rates comparable to those of fresh samples (77% vs 86%) (Fig. S3). Even with this preliminary data set, we can determine that the fritter samples showed lower average genetic identity to Florida populations (average 78.6–80.5 alleles shared, Fig. S6) than to the Puerto Rico or St. Eustatius populations (81.1–82.3, Fig. S6). Positive identification of populations for imported samples will require greater geographic sampling, a larger SNP panel, and a better genotype map of potential source locations. For example, assignment testing successfully places 65% of Bahamas conch samples back into that population, but this is not yet high enough for precise population discovery. Nevertheless, we can already suggest that the Florida Keys and Nassau are less likely to be the populations of origin for the fritters we genotyped, and show that high-resolution nuclear data can be readily obtained even from deeply processed commercial samples and used to address key conservation needs.
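The allele-sharing comparison above can be sketched as follows. The sketch assumes genotypes coded as the count of one allele (0, 1, or 2) at each biallelic SNP, counts shared alleles per locus as 2 − |g1 − g2|, and skips loci missing in either sample. The exact definition behind the values in Fig. S6 is not given in the text, so this coding, the function names, and the toy genotypes are assumptions.

```python
def alleles_shared(genotype_a, genotype_b):
    """Number of alleles shared between two diploid genotypes.

    Each genotype is a list with one entry per SNP: 0, 1, or 2 copies
    of a chosen allele, or None when the SNP was not typed.
    """
    total = 0
    for a, b in zip(genotype_a, genotype_b):
        if a is None or b is None:
            continue  # skip loci missing in either sample
        total += 2 - abs(a - b)
    return total


def mean_sharing_with_population(query, population):
    """Average alleles shared between one query sample and every
    reference sample from a candidate source population."""
    scores = [alleles_shared(query, reference) for reference in population]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical fritter genotype and two toy reference populations.
    fritter = [0, 1, 2, 1, None]
    pop_x = [[0, 1, 2, 2, 1], [0, 2, 2, 1, 0]]
    pop_y = [[2, 1, 0, 1, 1], [1, 0, 0, 0, 2]]
    print(mean_sharing_with_population(fritter, pop_x))
    print(mean_sharing_with_population(fritter, pop_y))
```

Because missing loci are skipped, raw counts are only comparable between samples typed at similar numbers of SNPs; dividing by twice the number of jointly typed loci would give a proportion instead.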
Discussion
Our results on these small pilot datasets provide proof of concept for multiplex SNP genotyping of non-invasive and processed market samples from two species with vastly different physiologies, biologies, ecologies, and conservation challenges. This approach satisfies the key requirements for a practical conservation application. It is inexpensive to scale up: we estimate that implementation costs (for up to 1000 SNPs) can be as low as $5 per sample. The initial costs for primer design might include full-genome sequencing, which is becoming more and more affordable. The cost of a genome of reasonable quality, even for fairly large mammalian genomes, is in the range of a few thousand dollars (Armstrong et al., 2017). In addition, primers based on RADSeq or RNAseq data provide cheaper solutions for very large genomes. The largest development cost is primer synthesis, which is expected to fall as high-throughput oligo synthesis becomes a large-scale commercial commodity service. The lab equipment required for the approach is minimal (centrifuge, thermocycler, pipettes), and field-based sequencers are making it possible to use this method anywhere in the field.
Increasing the number of SNPs beyond a few hundred can provide additional information. For instance, even with a few thousand SNPs it should be possible to start fully imputing large regions of the genome in cases where full genomes of individuals in the pedigree can be obtained. Moreover, strategic design of primers that amplify clusters of closely located SNPs should allow detection of very recent inbreeding through long contiguous runs of homozygosity (Kirin et al., 2010). Some of the SNPs could be located in such close proximity that they generate minihaplotypes, which have been shown to be particularly useful in pedigree reconstruction (Baetscher, Clemento, Ng, Anderson, & Garza, 2018). The SNP panels can be further designed to allow simultaneous species, individual, and diet identification for clusters of species, for instance all large carnivores and all common species in their diet in India or sub-Saharan Africa.
Most importantly, our data show that multiplex PCR is successful for degraded, cooked, mixed, or small and low-concentration samples, making it an ideal tool for monitoring individuals in populations under the rigors of field conditions or commercial markets. In the future, when populations may be very small and isolated, we may need the ability to impute whole genomes, construct full pedigrees, and conduct genome-wide association studies for detrimental or advantageous traits. In principle, this method will allow such advances.
Effective monitoring of individuals, populations and species is critical to designing rapid conservation action and management. In the future, conservation will depend critically on such tools. Small, isolated populations will require inbreeding management, genotype tracking for population assessment, and data to assess when individuals should be introduced. Populations across fragmented landscapes may require assisted migration, especially in the context of climate change and the range shifts it demands. Minimizing illegal wildlife trade will require enforcement based on genetically identified sources of trafficked animals. Rapid genetic monitoring of endangered species from commonly occurring non-invasive samples will provide a data-pathway to species recovery. We believe that multiplex PCR presents an example of such rapid, accessible, cheap and efficient technology that will make this possible.
Funding
R43HG009482, Wildlife Conservation Trust grant to UR, Fulbright Nehru Academic exchange Program (UR) and Wellcome Trust/DBT India Alliance senior fellowship (UR), Summit Foundation and the Global Genome Initiative at the Smithsonian Institution (to NT and SRP and Steve Box) supported conch work.
Author Contributions
MN, RWT, DP, UR, EAH, SP and NT designed the study. MN, RWT and NT conducted lab work and analyses. MN, RWT, DP, UR, EAH, SP wrote the paper.
Competing interests
Ryan W. Taylor and End2End Genomics LLC have received NIH small business funding (R43HG009482) to develop molecular tools for the study of non-model species. The other authors declare that they have no competing interests.
Data Availability
On publication, the data will be made available on the Dryad repository.
Acknowledgements
We thank San Diego Zoo Global, Conservation Society of California at Oakland Zoo, Oakland, CA, San Francisco Zoo, the Bronx Zoo, WCS Russia program, El Paso Zoo, Performing Animal Welfare Society, Wildlife Conservation Society India Program, Centre for Wildlife Studies, India, Wildlife Conservation Trust, Himanshu Chhattani, Anubhab Khan, Prachi Thatte, Aditya Joshi, Arun Zachariah, Erica Calcagno and Jackie Gai for tiger samples. We thank the Forest Departments of Rajasthan, Karnataka and Madhya Pradesh and the American Zoo Association, Tiger SSP for permissions. Awadhesh Pandit and NCBS next generation genomics facility helped with Illumina sequencing. We thank Steve Box, and Steve Canty for logistical help and support in framing conservation questions for conch.
Footnotes
email addresses: meghanan{at}ncbs.res.in, ryan{at}ryantaylor.net, truelovn{at}stanford.edu, hadly{at}stanford.edu, spalumbi{at}stanford.edu, uramakri{at}ncbs.res.in, dpetrov{at}stanford.edu