RT Journal Article
SR Electronic
T1 De novo identification, differential analysis and functional annotation of SNPs from RNA-seq data in non-model species
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 035238
DO 10.1101/035238
A1 Hélène Lopez Maestre
A1 Lilia Brinza
A1 Camille Marchet
A1 Janice Kielbassa
A1 Sylvère Bastien
A1 Mathilde Boutigny
A1 David Monnin
A1 Adil El Filali
A1 Claudia Marcia Carareto
A1 Cristina Vieira
A1 Franck Picard
A1 Natacha Kremer
A1 Fabrice Vavre
A1 Marie-France Sagot
A1 Vincent Lacroix
YR 2015
UL http://biorxiv.org/content/early/2015/12/24/035238.abstract
AB SNPs (Single Nucleotide Polymorphisms) are genetic markers whose precise identification is a prerequisite for association studies. Methods to identify them are currently well developed for model species, but they rely on the availability of a (good) reference genome and therefore cannot be applied to non-model species. They are also mostly tailored for whole-genome (re-)sequencing experiments, whereas in many cases transcriptome sequencing can be used as a cheaper alternative that already makes it possible to identify SNPs located in transcribed regions. In this paper, we propose a method that identifies, quantifies and annotates SNPs without any reference genome, using RNA-seq data only. Individuals can be pooled prior to sequencing if not enough material is available from a single individual. Using human RNA-seq data, we first compared the performance of our method with GATK, a well-established method that requires a reference genome. We showed that both methods predict SNPs with similar accuracy. We then experimentally validated the predictions of our method using RNA-seq data from two non-model species. The method can be used for any species to annotate SNPs and predict their impact on proteins. We also make it possible to test for the association of the identified SNPs with a phenotype of interest.