Recurrent erosion of COA1/MITRAC15 demonstrates gene dispensability in oxidative phosphorylation

Skeletal muscle fibers rely on either oxidative phosphorylation (OXPHOS) or the glycolytic pathway to achieve the muscular contractions that power mechanical movements. Species with energy-intensive adaptive traits that require sudden bursts of energy depend more heavily on fibers that use the glycolytic pathway. Glycolytic fibers rely less on OXPHOS and have lower mitochondrial content than oxidative fibers. Hence, we hypothesized that adaptive gene loss might have occurred within the OXPHOS pathway in lineages that largely depend on glycolytic fibers. The protein encoded by the COA1/MITRAC15 gene, which has conserved orthologs from budding yeast to humans, promotes mitochondrial translation. We show that gene-disrupting mutations have accumulated within the COA1/MITRAC15 gene in the cheetah, several species of galliforms, and rodents. The genomic region containing COA1/MITRAC15 is a well-established evolutionary breakpoint region in mammals. Careful inspection of genome assemblies of closely related species of rodents and marsupials suggests two independent COA1/MITRAC15 gene loss events co-occurring with chromosomal rearrangements. Besides recurrent gene loss events, we document changes in COA1/MITRAC15 exon structure in primates and felids. The detailed evolutionary history presented in this study reveals the intricate link between skeletal muscle fiber composition and dispensability of the chaperone-like role of the COA1/MITRAC15 gene.


Functional studies have implicated a role for COA1/MITRAC15 in promoting mitochondrial translation and complex I and IV biogenesis (Wang et al., 2020). However, overexpression of other genes easily compensates for the mild effect of COA1/MITRAC15 gene knockout (Hess et al., 2009; Pierrel et al., 2007). Notably, the COA1/MITRAC15 gene was also identified as a [...] suggests that despite its mild phenotype, COA1/MITRAC15 can contribute to fitness increases through its role as a chaperone. COA1/MITRAC15 resembles TIMM21, a subunit of the TIM23 complex (Mick et al., 2012). Such TIMM21 gene duplicates interacting with the mitochondrial import apparatus and respiratory chain complexes occur in Arabidopsis [...]

Despite drastic variation in body size within mammals, the relative speed of locomotion is thought to be largely independent of body mass, at least in small mammals [...]

[...] no evidence of transcription of COA1/MITRAC15 (see Supplementary Figure S486-S489).

In contrast to the rabbit, an intact COA1/MITRAC15 gene is present in Royle's pika and shows robust expression (see Supplementary Figure S490). Screening of RNA-seq datasets from the root ganglion, spinal cord, ovary, liver, spleen, and testis in the naked mole-rat [...]

Chromosomal rearrangement in rodent species has resulted in the movement of genes [...]

The genome assemblies of rodents such as the mouse and rat are well-curated and represent some of the highest-quality reference genomes (Rhie et al., 2021). To ensure that the [...]

Although repeat regions are a major contributing factor for the misassembly of genomes, the [...]

The COA1/MITRAC15 gene is intact and robustly expressed in the platypus [...] fatigue. Hence, the prediction from our hypothesis is that gene loss would not occur in Cervid [...]

The COA1/MITRAC15 gene has undergone duplication within the primate lineage. We [...] transcriptome also supports the COA1/MITRAC15 gene annotation.

Skipping of the dog-like-exon-3 occurs in the transcriptomes of tiger (Panthera tigris altaica), lion (Panthera leo persica), cat (Felis catus), and puma (Puma concolor) (see [...] exists in the cheetah (Acinonyx jubatus), we found no transcripts in the skin RNA-seq data [...] cheetah suggests gene loss. We further compared the splice isoforms found in canine and [...] splice silencer elements were also compared between cat and dog (see Supplementary Figure S746).

Co-expressed genes tend to perform related functions and are lost together. Hence, to identify [...] of chicken and mouse using ENSEMBL BioMart (Supplementary Table S7). None of these co-expressed genes appear lost in Galliformes or rodents. [...] (Chrysolophus pictus and Phasianus colchicus) do not express the COA1/MITRAC15 gene.

Hence, we looked for signatures of relaxed selection in each of the terminal branches leading to each Galliform species. We quantified branch-specific selection patterns using the program [...] labeled the focal species as the foreground and used the Anseriformes species as the background species. We downloaded the phylogenetic tree with branch lengths from the TimeTree website. Although we found some evidence of relaxed selection in some of the Galliform species, the RELAX program also reported intensification of selection (see [...] Table S10). None of the internal branches were under relaxed selection.
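
The foreground/background labeling described above can be sketched as a small tree-annotation step. This is a minimal illustration, not the authors' pipeline: it assumes the tree is a Newick string without spaces and that the selection program accepts branch tags written in curly braces after the leaf name (the convention HyPhy uses for labeled branches). The function name `label_foreground`, the tag name, and the toy three-species tree with its branch lengths are all hypothetical.

```python
import re


def label_foreground(newick, foreground, tag="Foreground"):
    """Append {tag} to every leaf of a Newick string whose name is in `foreground`.

    Assumes leaf names contain no whitespace, parentheses, commas, colons,
    or semicolons. Internal node labels (which follow ')') are left untouched.
    """
    foreground = set(foreground)

    def tag_leaf(match):
        name = match.group(1)
        return f"{name}{{{tag}}}" if name in foreground else name

    # A leaf name starts right after '(' or ',' and ends before ':', ',' or ')'.
    return re.sub(r"(?<=[(,])([^(),:;]+)", tag_leaf, newick)


if __name__ == "__main__":
    # Toy tree: two galliforms and one anseriform; branch lengths are made up.
    tree = "((Gallus_gallus:0.1,Meleagris_gallopavo:0.1):0.2,Anas_platyrhynchos:0.3);"
    print(label_foreground(tree, {"Gallus_gallus"}))
```

Running one labeled tree per focal species mirrors the per-terminal-branch test described in the text; unlabeled leaves act as the background set.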

We used the same phylogenetic tree and multiple sequence alignment to obtain branch- [...] Penelope pileata) with intact COA1/MITRAC15 gene, the values of ω are all less than 1.
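
For reference, ω here is the ratio of nonsynonymous to synonymous substitution rates. The conventional reading shown below is standard background rather than something stated in this section, and the `cases` environment assumes the amsmath package:

```latex
\[
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying selection} \\
\omega = 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
\]
```

Under this reading, ω < 1 in species with an intact gene is consistent with purifying selection acting on the coding sequence.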

Except for chicken (Gallus gallus), species with gene-disrupting changes are not under purifying selection (see Supplementary Table S10 and S11). We evaluated the internal nodes leading to the terminal branches for signatures of relaxed selection to ascertain whether [...]

We relied upon multiple sequence alignments of carnivores (see Supplementary Table S12 [...] identify gene-disrupting mutations and changes in intron-exon structure. We evaluated each taxonomic group for lineage-specific relaxed selection (see Supplementary Table S15). [...] Different ω values were estimated for both of these labels (see Supplementary Table S17).

The ω values for mixed (ω_m) and functional (ω_f) branches were estimated using two different [...] Based on the assumptions of 1ds and 2ds, we could get a confidence interval for the estimated time of gene loss (see Supplementary Table S17). Gene loss timing was estimated separately in rodents and carnivores (see Supplementary Table S17).

The GC content range (minimum and maximum possible values of GC% for a given amino acid sequence) was calculated (see Supplementary Table S18) for COA1/MITRAC15 and PDX1 amino acid sequences in rodent and primate species using the window-based tool [...] extrapolates the evolution of GC content along the phylogeny for both genes (see [...]

We calculated the GC-biased gene conversion (gBGC) for COA1/MITRAC15 gene sequences of more than 200 species using the program mapNH (v1.3.0) implemented in the testNH package (Dutheil, 2008). In mapNH, we used multiple sequence alignments of the COA1/MITRAC15 gene and species tree as input with the flag model=K80. A single gene-wide estimate of gBGC termed GC* is obtained for each species (see Supplementary Table S20). These estimates of GC* (GC* > 0.9 is significant) help understand the evolution of gBGC along the phylogeny using the contMap function of the phytools package. Additionally, we also calculated the gBGC for taxonomic group-wise alignments using the programs phastBias and phyloFit implemented in the PHAST (v1.3) package (Capra et al., 2013; Hubisz et al., 2011). In the first step, we use the phyloFit program to fit phylogenetic models to multiple sequence alignments using the [...] subst-mod flag with HKY85 model as argument). Next, the phastBias program with the -bgc [...]

The regulation of gene expression and splicing tends to be determined by the RNA binding [...] introns of the gene. In contrast to felids, the splicing pattern in canid species matches the ancestral state. Hence, we compared the COA1/MITRAC15 gene sequences of canid and felid species to identify differences in the RNA binding motifs. We used the RBPmap (Paz et al.,
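
The GC content range mentioned above (the minimum and maximum GC% that any synonymous coding sequence could have for a given amino acid sequence) can be sketched as follows. This is a hedged illustration, not the tool used in the study: it assumes the standard genetic code, computes a single whole-sequence range rather than sliding windows, and the names `GC_BOUNDS` and `gc_range` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# For each amino acid: (fewest, most) G/C bases among its synonymous codons.
GC_BOUNDS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    gc = sum(base in "GC" for base in codon)
    low, high = GC_BOUNDS.get(aa, (3, 0))
    GC_BOUNDS[aa] = (min(low, gc), max(high, gc))


def gc_range(protein):
    """Return (min GC%, max GC%) over all coding sequences encoding `protein`.

    Stop symbols ('*') are ignored; unknown residue letters raise KeyError.
    """
    residues = [aa for aa in protein.upper() if aa != "*"]
    if not residues:
        raise ValueError("protein sequence is empty")
    total_bases = 3 * len(residues)
    low = sum(GC_BOUNDS[aa][0] for aa in residues)
    high = sum(GC_BOUNDS[aa][1] for aa in residues)
    return 100.0 * low / total_bases, 100.0 * high / total_bases


if __name__ == "__main__":
    # Methionine has one codon (ATG); glycine codons are GGN.
    print(gc_range("MG"))
```

Comparing an observed coding-sequence GC% against this range shows how much of the attainable GC content a gene actually uses, which is the kind of comparison the text draws between COA1/MITRAC15 and PDX1.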

Results
We identified that the TIMM21 gene is a distant homolog of COA1/MITRAC15 based on PSI- [...] search results from HHblits, 59 are annotated as "Cytochrome C oxidase assembly factor" or "Cytochrome C oxidase assembly protein" or "COA1", and 120 as "TIMM21" homologs. Thirteen proteins are annotated as "hypothetical", nine are "membrane" proteins, eight are "DUF1783 domain-containing" proteins, and 27 proteins are from diverse proteins. The [...] Arabidopsis thaliana homolog At2g37940. The distinct COA1/MITRAC15 and TIMM21 groups found by the cluster analysis suggest that TIMM21 is a very distant homolog of [...]

The list of proteins identified as homologs of human COA1/MITRAC15 (Supplementary [...] homology include the membrane-spanning domain and covers >100 residues (see Fig. 1B).

In addition to the primary sequence-homology detected, both TIMM21 and COA1/MITRAC15 are known to play prominent roles in the mitochondria and have comparable secondary structures (see Fig. 1C, 1D). The strong homology between these proteins also allows for [...]

The sequence divergence between COA1/MITRAC15 and TIMM21 appears to result from [...] haplotypes. For example, the BLAST search of sequencing raw read data from the human genome with the COA1/MITRAC15 gene sequence as a query results in two distinct haplotypes. One set of reads corresponds to the intact COA1/MITRAC15 gene in humans, and the other set of reads is from the pseudogenic copy (see Fig. 2A). Comparative analysis of primate [...] and Catarrhini is the lack of the internal start codon in Cercopithecidae, where Catarrhini has a start codon. Since proteome-level data is not available for these species, we rely solely on [...] expression. To evaluate the relevance of the gene-disrupting mutations and signatures of [...] were screened to assess the transcriptional status of COA1/MITRAC15. We evaluated RNA- [...]

Galloanserae species have RNA-seq data available for very few tissues. We evaluated the RNA-seq datasets from six tissues (Brain, Spleen, Skin, Liver, Gonad, and Blood) available [...] but had a few reads from exons three and four. We reasoned that this heterogeneity in the [...]

The heterogeneity in sequencing coverage of COA1/MITRAC15 exons demonstrates the [...] levels than PDX1, we could not rule out the possibility of gBGC affecting some of the exons.