Signatures of Relaxed Selection in the CYP8B1 Gene of Birds and Mammals

The enzyme encoded by the CYP8B1 gene catalyses reactions that determine the ratio of primary bile salts, and loss of this gene has recently been linked to the lack of cholic acid in the bile of naked mole-rats, elephants and manatees using forward genomics approaches. We screened the CYP8B1 gene sequence of more than 200 species and tested for relaxation of selection along each terminal branch. The need for retaining a functional copy of the CYP8B1 gene is established by the presence of a conserved open reading frame across most species screened in this study. Interestingly, the dietary switch from bovid to cetacean species is accompanied by an exceptional ten amino acid extension at the C-terminal end, caused by a single-base frame-shift deletion. We also verify that the coding-frame-disrupting mutations previously reported in the elephant are correct, are shared by extinct Elephantimorpha species and coincide with the dietary switch to herbivory. Relaxation of selection in the CYP8B1 gene of the wombat (Vombatus ursinus) also corresponds to a drastic change in diet. In summary, our forward genomics-based screen of bird and mammal species identifies recurrent changes in the selection landscape of the CYP8B1 gene concomitant with a change in dietary lipid content.

[...] evolutionary proximity and further analyses were performed within these groups. Robustness of the results obtained was assessed using multiple tree topologies. We also evaluated the sequences for substitution saturation using the index to measure substitution saturation (Iss) [...] species that has been used in our study (Supplementary Table S3). In all cases we find that Iss < Iss.c and largely shows a significant difference. Hence, the sequences used in our study [...] The nucleotide composition bias across the CYP8B1 gene sequence was calculated for each species.
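The per-species nucleotide composition calculation mentioned above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names `gc_content` and `mean_gc`, the decision to ignore gaps and ambiguous bases, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Gap characters and ambiguity codes such as '-' or 'N' are ignored,
    which is an assumption of this sketch rather than a documented choice.
    """
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (counts["G"] + counts["C"]) / acgt


def mean_gc(sequences: dict[str, str]) -> float:
    """Return the unweighted mean GC content across species."""
    if not sequences:
        raise ValueError("no sequences supplied")
    return sum(gc_content(s) for s in sequences.values()) / len(sequences)


# Hypothetical toy sequences, not real CYP8B1 data.
toy = {"species_a": "ATGC", "species_b": "GGCC--NN"}
per_species = {name: gc_content(seq) for name, seq in toy.items()}
```

Here `per_species` holds one value per species and `mean_gc(toy)` gives their unweighted mean; weighting by sequence length would be an equally defensible choice.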
The mean GC content is reported on the GitHub page: [...] has been reported for the CYP3A gene cluster in humans (Hu and Ng 2012; Wagh et al. [...] The bottlenose dolphin (Tursiops truncatus) stop codon has been acquired subsequently through an independent (G to T) transversion event that converts the GAG codon to a TAG stop codon (see Figure 2B). The recent (after the split from the Pacific white-sided dolphin [...] However, it is not clear whether this distinctive bile composition has a genetic basis. [...] gene needs to be investigated further to understand if it has had any functional consequences. The multi-species alignment spanning cetaceans was analysed using the general descriptive model implemented in RELAX. We did not use the information obtained from the general descriptive model for performing any other tests as that would amount to using the same data twice. Although none of the cetacean species showed (see Supplementary Table S4) significant intensification or relaxation, it can be seen (Supplementary Figure S3A) [...] changes (see Supplementary Table S6).
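The G to T transversion described above for the bottlenose dolphin, which turns a GAG codon into a TAG stop codon, can be illustrated with a short Python sketch. The helper names and the toy coding sequence are hypothetical; the real dolphin sequence and the position of the affected codon are not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
PURINES = {"A", "G"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("reference and alternate base are identical")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def first_inframe_stop(cds: str):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable, 3):
        if cds[start:start + 3] in STOP_CODONS:
            return start // 3
    return None


# Hypothetical toy coding sequence: ATG GAG CTG TAA
reference = "ATGGAGCTGTAA"
# Replace the first base of the GAG codon (G -> T) to give TAG.
mutated = reference[:3] + "T" + reference[4:]

kind = substitution_class("G", "T")
stop_before = first_inframe_stop(reference)
stop_after = first_inframe_stop(mutated)
```

In the toy example the stop moves from the terminal codon (index 3) to the second codon (index 1), which is the pattern a premature stop scan would flag for closer inspection against raw reads.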

The West Indian manatee (Trichechus manatus) shares only one of the five coding frame [...] selection landscape within Afrotheria coincides with changes in diet.

We used seven Afrotherian species and performed independent tests for relaxed selection in tenrec. While significant relaxed selection has also been reported for Trichechus manatus latirostris and Chrysochloris asiatica in a previous study (Sharma and Hiller 2018), we found that neither of these two species showed significant relaxed selection when the six species used in our alignment were used as the background species (see Supplementary Table S4).
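The outcome of a foreground-versus-background test of this kind is usually read from a likelihood ratio test together with the selection intensity parameter k, where k < 1 points to relaxation and k > 1 to intensification. The Python sketch below shows only that interpretation step, assuming a chi-squared reference distribution with one degree of freedom. The function name `interpret_relax` and the log-likelihood values are hypothetical; in practice these numbers come from the RELAX output itself.

```python
import math


def interpret_relax(lnl_null: float, lnl_alt: float, k: float,
                    alpha: float = 0.05):
    """Return (LR statistic, p-value, verdict) for one foreground branch.

    lnl_null: log-likelihood with k fixed at 1 (no change in intensity)
    lnl_alt:  log-likelihood with k estimated freely
    """
    lr = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of a chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    if p_value >= alpha:
        verdict = "not significant"
    elif k < 1.0:
        verdict = "relaxation"
    else:
        verdict = "intensification"
    return lr, p_value, verdict


# Hypothetical values for illustration only.
lr, p_value, verdict = interpret_relax(lnl_null=-1000.0, lnl_alt=-995.0, k=0.4)
```

Because the null and alternative likelihoods both depend on which branches are labelled as background, the same foreground species can move between "relaxation" and "not significant" when the background set changes, which is the sensitivity examined in the text.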

Since the set of species used as the background was different in our test compared to the previous report we systematically investigated the effect of using different species as the background as well as different tree topologies. In order to evaluate the effect of using different background species in our tests of relaxed [...] Table S7).

The Natal long-fingered bat (Miniopterus natalensis) continued to show intensification of [...] Loss of the CYP8B1 gene in naked mole-rat (Heterocephalus glaber) has been shown by the [...] The diet of the North American beaver (Castor canadensis) mainly consists of tree bark and [...] gene is still required in the beaver genome as it is possible to produce both CDCA and UDCA [...] supported by raw reads (see https://github.com/ceglab/CYP8B1/SAMs). However, we could find relaxation of selection in the beaver lineage (see Supplementary Table S4). The relative [...] It has previously been reported that the bile composition of the family Lemuridae consists [...] identify gene disrupting mutations that might be present in one of the bile pathway genes.

Based on an alignment of six species spanning Monotremata, Marsupialia and Xenarthra we found evidence for significant (even after correcting for multiple testing) relaxed selection (see Supplementary Table S4) in the wombat (Vombatus ursinus). In contrast to this, significant intensified selection was seen in the Queensland koala (Phascolarctos cinereus).
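Because each species in an alignment is tested in turn as the foreground, the resulting p-values need a multiple-testing correction before significance is claimed. The text does not say which correction was applied, so the Python sketch below uses the Benjamini-Hochberg procedure purely as an assumed example; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # are monotone non-decreasing in the original p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-species p-values, one test per foreground species.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adjusted]
```

A more conservative family-wise correction such as Bonferroni (multiplying each p-value by the number of tests) would be a simple substitute if strict control of any false positive is preferred.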

Multiple copies of varying lengths have been assembled and annotated for the CYP8B1 gene [...] we did not include these two species in our alignment. It has been reported that the wombat [...] the wombat sequence from the alignment. [...] Material for details). The chicken genome assembly is a good starting point within birds for performing gene [...] To maintain consistency with the previous assemblies, the same highly inbred red jungle fowl [...] CYP8B1 gene (see Supplementary Table S10).

Chicken bile salts are known to contain cholic acid and the inferred premature stop [...] segregating as a polymorphism in chicken breeds, we screened the re-sequenced whole [...] screened. This suggests that this premature stop codon inducing polymorphism was potentially acquired in domesticated chicken. To further verify whether the CYP8B1 gene is lost in chicken due to relaxed selective constraints, we decided to use the strategy used by [...]

Candidate species whose bile composition needs further investigation

Based on our analysis of the CYP8B1 gene across more than 200 species we have identified [...] bile composition for this species might need to be re-evaluated. However, we cannot rule out [...] While comprehensive genome-wide studies that rely upon the abundance of publicly [...] picture. Hence, it is not surprising that gene loss inferences are also prone to be affected by [...] In contrast to this, sequencing of thousands of individuals from diverse human populations has allowed the reliable and robust estimation of allele frequencies across the human genome. Efforts are under way to assemble pan-genomes that incorporate population level variation into [...] sequences that reflect changes in regulatory regions would be extremely useful.

The accumulation of multiple stop codon/frame-shift changes in a gene is generally considered strong evidence for the loss of that gene as the truncated protein is unlikely to [...] proxy for identifying changes in function is to look for signatures of intense relaxed selection.

Our study uses the CYP8B1 gene sequence of more than 200 species to look for signatures of [...] other genes in the bile pathway (see Supplementary Table S4). Similarly, the prevalence of [...] rates in the presence of error in genome assembly and annotation using CAFE 3. Mol [...] genotype to phenotype using independent phenotypic losses among related species. Cell [...]