PT - JOURNAL ARTICLE
AU - Eric Lombaert
AU - Thomas Guillemaud
AU - Emeline Deleury
TI - A simulation-based evaluation of STRUCTURE software for exploring the introduction routes of invasive species
AID - 10.1101/094029
DP - 2017 Jan 01
TA - bioRxiv
PG - 094029
4099 - http://biorxiv.org/content/early/2017/06/01/094029.short
4100 - http://biorxiv.org/content/early/2017/06/01/094029.full
AB - Population genetic methods are widely used to retrace the introduction routes of invasive species. The unsupervised Bayesian clustering algorithm implemented in STRUCTURE is amongst the most frequently used of these methods, but its ability to provide reliable information about introduction routes has never been assessed. We used computer simulations of microsatellite datasets to evaluate the extent to which the clustering results provided by STRUCTURE were misleading for the inference of introduction routes. We focused on the simple case of an invasion scenario involving one native population and two independently introduced populations, because it is the sole scenario with two introduced populations that can be rejected when obtaining a particular clustering with a STRUCTURE analysis at K = 2 (two clusters). Results were classified as “misleading” or “non-misleading”. We then investigated the influence of two demographic parameters (effective size and bottleneck severity) and different numbers of loci on the type and frequency of misleading results. We showed that misleading STRUCTURE results were obtained for 10% of our simulated datasets and at a frequency of up to 37% for some combinations of parameters. Our results highlighted two different categories of misleading output. The first occurs in situations in which the native population has a low level of diversity. In this case, the two introduced populations may be very similar, despite their independent introduction histories. The second category results from convergence issues in STRUCTURE for K = 2, with strong bottleneck severity and/or large numbers of loci resulting in high levels of differentiation between the three populations.