RT Journal Article
SR Electronic
T1 Multilocus DNA barcoding – Species Identification with Multilocus Data
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 155861
DO 10.1101/155861
A1 Junning Liu
A1 Jiamei Jiang
A1 Shuli Song
A1 Luke Tornabene
A1 Ryan Chabarria
A1 Gavin J P Naylor
A1 Chenhong Li
YR 2017
UL http://biorxiv.org/content/early/2017/06/27/155861.abstract
AB Species identification using DNA sequences, known as DNA barcoding, has been widely used in many applied fields. Current barcoding methods are usually based on a single mitochondrial locus, such as cytochrome c oxidase subunit I (COI). This type of barcoding is not always effective when applied to species that are separated by short divergence times or that contain introgressed genes from closely related species. Herein we introduce a more effective multilocus barcoding framework that is based on gene capture and “next-generation” sequencing, and we provide both empirical and simulation tests of its efficacy. We examine genetic distinctness in two pairs of sister species of fishes: Siniperca chuatsi vs. S. kneri and Sicydium altum vs. S. adelum, for which the COI barcoding approach failed to identify species in both cases. Results revealed that distinctness between S. chuatsi and S. kneri increased as more independent loci were added. By contrast, S. altum and S. adelum could not be distinguished even with all loci. Analyses of population structure and gene flow suggested that the two species of Siniperca diverged from each other a long time ago but have unidirectional gene flow, whereas the two species of Sicydium are not separated from each other and have high bidirectional gene flow. Simulations demonstrate that under limited gene flow (< 0.00001 per gene per generation) and sufficient separation time (> 100000 generations), we can correctly identify species using more than 90 loci.
Finally, we selected 500 independent nuclear markers for ray-finned fishes and designed a three-step pipeline for multilocus DNA barcoding.