SEX-DETector: A Probabilistic Approach to Study Sex Chromosomes in Non-Model Organisms

Genome Biol Evol. 2016 Aug 29;8(8):2530-43. doi: 10.1093/gbe/evw172.

Abstract

We propose a probabilistic framework to infer autosomal and sex-linked genes from RNA-seq data of a cross, for any sex chromosome type (XY, ZW, and UV). Sex chromosomes, especially the non-recombining and repeat-dense Y, W, U, and V, are notoriously difficult to sequence. Strategies have been developed to obtain partially assembled sex chromosome sequences, but most remain difficult to apply to the many non-model organisms, either because they require a reference genome or because they are designed for evolutionarily old systems. Sequencing a cross (parents and progeny) by RNA-seq to study the segregation of alleles and infer sex-linked genes is a cost-efficient strategy that also provides expression level estimates. However, the lack of a proper statistical framework has limited broader application of this approach. Tests on empirical Silene data show that our method identifies 20-35% more sex-linked genes than existing pipelines while making reliable inferences for downstream analyses. Simulations indicate that approximately 12 individuals are needed for optimal results. For species with an unknown sex-determination system, the method can assess the presence and type (XY vs. ZW) of sex chromosomes through a model comparison strategy. The method is particularly well optimized for sex chromosomes of young or intermediate age, which are expected in thousands of yet unstudied lineages. Any organism that can be bred in the lab, including non-model ones for which nothing is known a priori, is suitable for our method. SEX-DETector and its implementation in a Galaxy workflow are made freely available.
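The segregation idea behind sequencing a cross can be illustrated with a toy example. The sketch below is not SEX-DETector's probabilistic model; it is a deterministic check for a single biallelic SNP in an XY cross, and it assumes error-free genotypes. The function name `classify_snp`, the two-character genotype strings, and the returned labels are all hypothetical choices made for this illustration.

```python
def classify_snp(father, mother, sons, daughters):
    """Toy segregation check for one biallelic SNP in an XY cross.

    Genotypes are two-character strings such as "AT" (an assumed encoding).
    Returns a label describing which inheritance pattern the SNP is
    consistent with. Assumes genotypes contain no errors.
    """
    father_alleles = set(father)
    mother_alleles = set(mother)

    # Only SNPs where the father is heterozygous and the mother is
    # homozygous let us follow a single paternal allele in the progeny.
    if len(father_alleles) != 2 or len(mother_alleles) != 1:
        return "uninformative"
    paternal_only = father_alleles - mother_alleles
    if len(paternal_only) != 1 or not sons or not daughters:
        return "uninformative"

    allele = next(iter(paternal_only))
    in_sons = [allele in genotype for genotype in sons]
    in_daughters = [allele in genotype for genotype in daughters]

    # A Y-linked allele is passed to every son and to no daughter.
    if all(in_sons) and not any(in_daughters):
        return "Y-allele-consistent"
    # The father's X-linked allele is passed to every daughter and no son.
    if all(in_daughters) and not any(in_sons):
        return "X-allele-consistent"
    # Any mixed pattern is what autosomal segregation would produce.
    return "autosomal-consistent"


if __name__ == "__main__":
    print(classify_snp("AT", "AA", sons=["AT", "AT"], daughters=["AA", "AA"]))
    print(classify_snp("AT", "AA", sons=["AT", "AA"], daughters=["AT", "AA"]))
```

With few progeny, an autosomal SNP can match a sex-linked pattern by chance, and real RNA-seq genotypes contain errors. Both points are reasons a hard rule like this one is insufficient and a probabilistic treatment is useful.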

Keywords: Galaxy workflow; RNA-seq; UV; XY; ZW; sex-linked genes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Female
  • Male
  • Models, Genetic*
  • Probability
  • Sex Chromosomes / genetics*
  • Sex Determination Processes
  • Silene / genetics
  • Software*