PT - JOURNAL ARTICLE
AU - Jan Axtner
AU - Alex Crampton-Platt
AU - Lisa A. Hörig
AU - Azlan Mohamed
AU - Charles C.Y. Xu
AU - Douglas W. Yu
AU - Andreas Wilting
TI - An efficient and robust laboratory workflow and tetrapod database for larger scale eDNA studies
AID - 10.1101/345082
DP - 2019 Jan 01
TA - bioRxiv
PG - 345082
4099 - http://biorxiv.org/content/early/2019/01/31/345082.short
4100 - http://biorxiv.org/content/early/2019/01/31/345082.full
AB - Background: The use of environmental DNA, ‘eDNA,’ for species detection via metabarcoding is growing rapidly. We present a co-designed lab workflow and bioinformatic pipeline to mitigate the two most important risks of eDNA: sample contamination and taxonomic mis-assignment. These risks arise from the need for PCR amplification to detect the trace amounts of DNA combined with the necessity of using short target regions due to DNA degradation. Findings: Our high-throughput workflow minimises these risks via a four-step strategy: (1) technical replication with two PCR replicates and two extraction replicates; (2) using multi-markers (12S, 16S, CytB); (3) a ‘twin-tagging,’ two-step PCR protocol; (4) use of the probabilistic taxonomic assignment method PROTAX, which can account for incomplete reference databases. As annotation errors in the reference sequences can result in taxonomic mis-assignment, we supply a protocol for curating sequence datasets. For some taxonomic groups and some markers, curation resulted in over 50% of sequences being deleted from public reference databases, due to (1) limited overlap between our target amplicon and reference sequences; (2) mislabelling of reference sequences; (3) redundancy. Finally, we provide a bioinformatic pipeline to process amplicons and conduct PROTAX assignment and tested it on an ‘invertebrate derived DNA’ (iDNA) dataset from 1532 leeches from Sabah, Malaysia. Twin-tagging allowed us to detect and exclude sequences with non-matching tags.
The smallest DNA fragment (16S) amplified most frequently for all samples, but was less powerful for discriminating at species rank. Using stringent and lax acceptance criteria, we found 162 (stringent) and 190 (lax) vertebrate detections in 95 (stringent) and 109 (lax) leech samples. Conclusions: Our metabarcoding workflow should help research groups increase the robustness of their results and therefore facilitate wider usage of e/iDNA, which is turning into a valuable source of ecological and conservation information on tetrapods.