RT Journal Article
SR Electronic
T1 An efficient and improved laboratory workflow and tetrapod database for larger scale eDNA studies
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 345082
DO 10.1101/345082
A1 Jan Axtner
A1 Alex Crampton-Platt
A1 Lisa A. Hörig
A1 Charles C.Y. Xu
A1 Douglas W. Yu
A1 Andreas Wilting
YR 2018
UL http://biorxiv.org/content/early/2018/06/12/345082.abstract
AB Background: The use of environmental DNA, ‘eDNA,’ for species detection via metabarcoding is growing rapidly, and even terrestrial mammals can now be monitored via ‘invertebrate-derived DNA’ or ‘iDNA’ from hematophagous invertebrates. We present a co-designed lab workflow and bioinformatic pipeline to mitigate the two most important risks of e/iDNA: sample contamination and taxonomic mis-assignment. These risks arise from the need for amplification to detect trace amounts of DNA and from the necessity of using short target regions because the DNA is degraded. Findings: We present a high-throughput laboratory workflow that minimises these risks via a three-step strategy: (1) each sample is sequenced for two PCR replicates from each of two extraction replicates; (2) we use a ‘twin-tagging,’ two-step PCR protocol; and (3) we use a multi-marker approach targeting three mitochondrial loci: 12S, 16S and CytB. As a test, we analysed 1532 leeches from Sabah, Malaysian Borneo. Twin-tagging allowed us to detect and exclude chimeric sequences. The smallest DNA fragment (16S) amplified best for all samples but often provided lower taxonomic resolution. We only accepted assignments that were found in both extraction replicates, totalling 174 assignments for 96 samples. To avoid false taxonomic assignments, we also present an approach to create curated reference databases that can be used with the powerful taxonomic-assignment method PROTAX. For some taxonomic groups and some markers, curation resulted in over 50% of sequences being deleted from public reference databases, due mainly to: (1) limited overlap between our target amplicon and available reference sequences; (2) apparent mislabelling of reference sequences; and (3) redundancy. The accompanying bioinformatics pipeline processes amplicons and conducts the PROTAX taxonomic assignment. Conclusions: Our metabarcoding workflow should help research groups to increase the robustness of their results and therefore facilitate wider usage of e/iDNA, which is turning into a valuable source of ecological and conservation information on tetrapods.
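The acceptance rule stated in the abstract (an assignment is kept only if it is found in both extraction replicates) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `accept_assignments`, the replicate labels "A" and "B", the tuple layout, and the example records are all hypothetical, chosen only to show the logic of the rule.

```python
from collections import defaultdict


def accept_assignments(detections, required_extractions=("A", "B")):
    """Return (sample, taxon) pairs seen in every required extraction replicate.

    `detections` is an iterable of (sample, extraction, pcr, taxon) tuples.
    The layout and the labels "A"/"B" are assumptions made for illustration.
    """
    seen = defaultdict(set)
    for sample, extraction, _pcr, taxon in detections:
        # Record which extraction replicates produced this assignment.
        seen[(sample, taxon)].add(extraction)
    required = set(required_extractions)
    # Keep a pair only if all required extraction replicates detected it.
    return sorted(pair for pair, found in seen.items() if required <= found)


# Made-up example records, not data from the study.
detections = [
    ("S1", "A", 1, "Sus barbatus"),
    ("S1", "B", 2, "Sus barbatus"),
    ("S1", "A", 1, "Homo sapiens"),   # only in extraction A: rejected
    ("S2", "B", 1, "Rusa unicolor"),  # only in extraction B: rejected
]
accepted = accept_assignments(detections)
print(accepted)
```

Requiring agreement between independent extractions trades some sensitivity for protection against contamination that affects only one extraction, which is the trade-off the abstract describes.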