Transcriptomic analyses reveal tissue-specific selection on genes related to apoptotic processes in the subterranean rodent, Ctenomys sociabilis

Specialization for a subterranean existence is expected to impact multiple aspects of an organism’s biology, including behavior, physiology, and genomic structure. While the phenotypic correlates of life underground have been extensively characterized, the genetic bases for these traits are not well understood, due in part to the challenges of generating large, multi-locus data sets using traditional DNA sequencing strategies. To begin exploring the genomic architecture of adaptation to a subterranean existence, we generated high-quality de novo transcriptome assemblies for 8 different tissue types (hippocampus, hypothalamus, kidney, liver, spleen, ovary, testis, skin) obtained from the colonial tuco-tuco (Ctenomys sociabilis), a group-living species of subterranean rodent that is endemic to southwestern Argentina. From these transcriptomes, we identified genes that are evolving more rapidly in the C. sociabilis lineage compared to other subterranean species of rodents. These comparisons suggest that genes associated with immune response, cell-cycle regulation, and heavy metal detoxification have been subject to positive selection in C. sociabilis. Comparisons of transcripts from different tissues suggest that the spleen and liver - organs involved in immune function and detoxification - may be particularly important sites for these adaptations, thereby underscoring the importance of including multiple tissue types in analyses of transcriptomic variation. In addition to providing an important resource for future genomic studies of C. sociabilis, our analyses generate new insights into the genomic architecture of functionally significant phenotypic traits in free-living mammals.


54
Convergent traits provide critical opportunities to examine interactions between shared environmental 55 challenges, selection, and the evolution of phenotypic and genotypic variation (Mares, 1975;Muschick,

56
Indermaur & Salzburger, 2012; Parker et al., 2013). One well-characterized suite of convergent 57 phenotypic traits occurs among subterranean rodents, which are defined by their tendency to spend 58 virtually their entire lives in underground burrows (Nevo, 1979;Lacey & Patton, 2000). This designation 59 encompasses more than 120 species representing 6 families and 3 suborders of rodents (Lacey & Patton, (Table S1) and thus all downstream analyses were 136 conducted using assemblies generated from the non-normalized datasets.

138
Compiled Transcriptome Assembly, Annotation and Analysis

140
In addition to tissue-specific transcriptomes, read data from all tissue types were pooled to generate a 141 single, merged transcriptome assembly. We produced 12 alternative merged assemblies through 142 combinations of read subsampling, transrate optimization, and merging algorithms. Each assembly was 143 evaluated for quality using TransRate v1.0.1 (Smith-Unna et al., 2016), which generates both quality 144 metrics and an optimized assembly. In addition, we evaluated each assembly for completeness using the 145 Vertebrata database within BUSCO v1.1b1 (Simão et al., 2015). Based on these analyses, we selected the 146 assembly with the highest quality and completeness. The pipeline for producing this selected assembly is 147 described below. Because previous research has revealed that little information is gained from using 148 datasets above 40M reads (MacManes, 2015), a random subset of 50 million paired-end reads were 149 selected for analyses (seqtk v1.0-r82 (https://github.com/lh3/seqtk)) from the entire dataset (N= 339 150 million reads). The subset of reads was assembled with both Trinity v2.

180
For each of the species in this comparative data set, coding sequences were identified using TransDecoder 181 v3.0.0 (Haas et al., 2013). Orthologous relationships among these species (including the M. musculus 182 outgroup) were identified using the output from BUSCO v2.0 and the associated database of mammalian 183 sequences. The resulting groups of orthologous transcripts were then edited to include only single copy 184 transcripts, which were then aligned using Prank v150803. Sequence alignments were refined using 185 pal2nal v14 (Suyama, Torrents & Bork, 2006) and a gene tree was constructed using RAxML v8.2.8 186 (Stamatakis, 2014). To explore potential evidence of selection on the genes included in our dataset, we 187 used PAML v4.9a (Yang, 2007), with our gene tree as the phylogenetic framework. Specifically, we 188 tested for positive selection using the M7 versus M8 models in PAML. We then tested for evidence of 189 lineage-specific selection using the PAML branch-site model with C. sociabilis as the foreground lineage.

190
We controlled the false discovery rate for multiple comparisons following the procedure of Benjamini and 191 Hockberg (1995). Genes determined to be under positive selection were then examined using the Gene

192
Ontology Consortium Enrichment Analysis (http://geneontology.org/page/go-enrichment-analysis) tool to 193 determine if these loci were grouped according to ontology terms.

195
To explore potential tissue-specific patterns of gene expression among loci identified as being under  Sequence read files for this study are available on the NCBI Short Read Archive (PRJNA358281). All 207 code used in transcriptome assembly, annotation, analyses, and data visualization is freely available 208 online at (https://github.com/macmanes-lab/tuco_manuscript and https://github.com/macmanes-lab/paml).

212
These files will be uploaded to Dryad upon acceptance of this manuscript for publication.  (Table S1). The TransRate optimized assemblies, which 220 included only highly-supported transcripts, contained on average 7% fewer BUSCOs than the original 221 assemblies. Due to this pronounced reduction in completeness, the TransRate optimized assemblies were 222 not used for subsequent analyses. While individual, non-optimized tissue-specific assemblies were of 223 acceptable quality and completeness, they were notably inferior in quality and completeness to the 224 compiled, transfused assembly described below.

226
Compiled Transcriptome Assembly, Annotation and Analysis

228
The most complete and highest quality assembly was generated from a 50 million read-pair subsample of 229 the full dataset (Table 1). This assembly was annotated and all non-annotated transcripts were removed to 230 produce the final assembly (annotation_only; Table 1   Each of the tissue types included in this study has been well characterized with respect to its function in 296 mammalian biology. Accordingly, we examined whether functional differences between tissues were 297 reflected in the identities of the most abundant transcripts unique to each tissue. We also assessed loci 298 under positive selection, highlighting aspects we believe may be key factors associated with live in 299 underground burrows. The functions of many of the most abundant transcripts that were unique to a given 300 tissue type have been characterized as part of empirical studies, as described below:

302
The hippocampus. The hippocampus is integrally involved in neurotransmission (Vianna et al., 2000; 303 Shatz, 2009). In particular, the hippocampus has been studied with regard to spatial memory and

378
Interestingly, the spleen was found to express more genes under positive selection than expected (Fig. 1), 379 suggesting this tissue may be an active target for adaptation. Three of these genes (Sperm Associated    In this study, we present a high quality and complete transcriptome for the colonial tuco-tuco (C. 439 sociabilis). By characterizing transcriptomes generated from eight tissue types, we provide preliminary 440 14 insights into how transcript abundance differs across tissues. Notably, the most abundant transcripts and 441 the genes subject to positive selection were generally consistent with the primary physiological 442 function(s) of the tissues from which they were derived, with a prevalence of transcripts associated with 443 cell proliferation. We also identify a set of genes that appear to be under positive selection; the number of 444 genes subject to selection that were expressed in the liver and spleen were greater than expected, 445 suggesting that these tissues are of particular functional importance to the colonial tuco-tucos. The