RT Journal Article SR Electronic T1 A novel workflow to improve multi-locus genotyping of wildlife species: an experimental set-up with a known model system JF bioRxiv FD Cold Spring Harbor Laboratory SP 638288 DO 10.1101/638288 A1 Gillingham, Mark A.F. A1 Montero, B. Karina A1 Wihelm, Kerstin A1 Grudzus, Kara A1 Sommer, Simone A1 Santos, Pablo S.C. YR 2020 UL http://biorxiv.org/content/early/2020/01/10/638288.abstract AB Genotyping novel complex multigene systems is particularly challenging in non-model organisms. Target primers frequently amplify multiple loci simultaneously, leading to high rates of PCR and sequencing artefacts such as chimeras and allele amplification bias. Most next-generation sequencing genotyping pipelines have been validated in non-model systems, where the real genotype is unknown and the generation of artefacts may be highly repeatable. Further hindering accurate genotyping, the relationship between artefacts and copy number variation (CNV) within a PCR remains poorly described. Here we investigate the latter by experimentally combining multiple known major histocompatibility complex (MHC) haplotypes of a model organism (chicken, Gallus gallus, 43 artificial genotypes with 2-13 alleles per amplicon). In addition to well-defined “optimal” primers, we simulated a non-model species situation by designing “naive” primers using sequence data from closely related Galliform species. We applied a novel open-source genotyping pipeline (ACACIA) to the data and compared its performance with that of another, previously published pipeline. ACACIA yielded very high allele calling accuracy (>98%). Non-chimeric artefacts increased linearly with increasing CNV, but chimeric artefacts leveled off when more than 4-6 alleles were amplified. As expected, we found heterogeneous amplification efficiency of allelic variants when co-amplifying multiple loci.
Using our validated ACACIA pipeline and the example data of this study, we discuss in detail the pitfalls researchers should avoid to reliably genotype complex multigene systems. ACACIA and the datasets used in this study are publicly available on GitLab and FigShare (https://gitlab.com/psc_santos/ACACIA and https://figshare.com/projects/ACACIA/66485).