Polymorphism within the mitochondrial genome of the ctenophore, Pleurobrachia bachei and its ongoing rapid evolution

The mitochondrial genomes of ctenophores are among the most compact in the animal kingdom, with multiple rearrangements and examples of gene loss. Here, by resequencing the Pleurobrachia bachei mitochondrial genome, we show that the high level of polymorphism (>10%) in Pleurobrachia might contribute to the ongoing fast evolution of ctenophores, including the presence of truncated versions of apparently canonical genes such as cox1. In addition, the interpretation of codons in ctenophores, in the absence of robust proteomic data for mitochondrial genes, remains a challenging issue that is open to future experimental analysis.

Mitochondrial genomes from non-model or under-investigated organisms can be challenging to analyze, and ctenophores are among the most difficult cases. Ctenophora, or comb jellies, is a lineage of basal metazoans with relatively compact mitochondrial and nuclear genomes 1,2 . Both the initial and the most recent phylogenomic analyses of nuclear genomes, together with comparative analyses of 37 transcriptomes from representatives of this group, indicated that Ctenophora is the sister lineage to the rest of extant animals 1,3,4 . However, it was also suggested that ctenophores might have undergone an evolutionary bottleneck around the Permian, with a potential loss of their ancestral diversity followed by radiation and rapid evolution 4 . Mitochondrial genomes in ctenophores were affected by this complex evolutionary history and lost a significant fraction of the ancestral metazoan mitochondrial gene complement.
In 2011/2012 we showed that the mitochondrial genome of the ctenophore Pleurobrachia bachei 5 was highly divergent and fast evolving compared to other animals. A simultaneously published analysis of the mitochondrial genome of Mnemiopsis leidyi 6 came to similar conclusions. It was also discovered that ctenophore mitochondrial genomes were the smallest of all metazoan mitochondrial genomes sequenced at that time.
Recently, three mitochondrial genomes from the benthic ctenophores Coeloplana loyai, Coeloplana yulianicorum and Vallicula multiformis 7 were sequenced, revealing the smallest animal mitochondrial genome known so far (9,961 bases in Vallicula multiformis 7 ). Those data further confirm that ctenophore mitochondrial genomes are highly divergent. Here, we indicate that the same is true even within a single species from this clade.
Arafat et al. also suggested that the comparative data from the benthic Platyctenida imply a possible need for reannotation of the original P. bachei mitochondrial genome 7 . Although more detailed annotation is certainly desirable, here we show that the level of polymorphism in Pleurobrachia might contribute to the ongoing rapid evolution of ctenophore genomes, including the presence of truncated versions of apparently canonical genes. In addition, the interpretation of codons in ctenophores, in the absence of robust proteomic data for mitochondrial genes, remains a challenging issue that is open to future experimental analysis.

The resequencing and reannotation of the Pleurobrachia mitochondrial genome
The recent comparative analyses of Platyctenida mt genomes 7 suggested "erroneous sequencing in the P. bachei cox1 gene". At the time of the original publication 5 , no other ctenophore mitochondrial sequences were available for comparison with Pleurobrachia. We therefore investigated this point by examining the more confounding issue of possible polymorphism. The Pleurobrachia mtDNA genome was PCR amplified, cloned, and then sequenced by Sanger technology, both from a single animal and from several animals. Sanger sequencing remains the gold standard for sequence validation.
Arafat et al. 7 proposed that the "region upstream to the cox1 gene includes several poly-T, and thus, they suspect the introduction of a frame shift due to erroneous sequencing of the number of T in these homopolymers" in the Pleurobrachia mtDNA genome. To sequence and annotate the Pleurobrachia mtDNA genome, we analyzed the original 454 genome assembly together with the Illumina and Sanger data, so that the three technologies validated one another; none of them was intended to be used exclusively.
In contrast to next-generation sequencing, Sanger sequencing captures homopolymers well. The recent reannotation 7 used only one P. bachei nuclear genome run, SRR1174875 1 , which was submitted to NCBI two years after the original P. bachei mtDNA genome paper was published, and concluded that the presence of an extra T reflected erroneous sequencing.
The overall P. bachei whole-genome shotgun sequencing project (BioProject: PRJNA213480) currently comprises fourteen nuclear genomic DNA sequencing projects that were assembled and deposited under GenBank: AVPN00000000.1 1 . Our P. bachei assembly contains one contig representing the mitochondrial genome, and this contig does not contain an extra T in the cox1 gene.
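The frame-shift argument discussed above can be illustrated with a minimal sketch. The two sequences below are invented for illustration only and are not P. bachei cox1 data; the function names are likewise hypothetical. The sketch shows how one additional T in a homopolymer run changes every codon downstream of the run when both sequences are read from the same start position.

```python
def longest_run(seq, base="T"):
    """Length of the longest uninterrupted run of `base` in `seq`."""
    best = current = 0
    for ch in seq.upper():
        current = current + 1 if ch == base else 0
        best = max(best, current)
    return best


def codons(seq):
    """Split a DNA string into complete codons, reading from position 0."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def first_divergent_codon(a, b):
    """Index of the first codon at which the two readings differ, or None."""
    for i, (x, y) in enumerate(zip(codons(a), codons(b))):
        if x != y:
            return i
    return None


# Hypothetical sequences: identical except for the length of the poly-T run.
SHORT_RUN = "GCA" + "T" * 6 + "GGAGCTAAC"
LONG_RUN = "GCA" + "T" * 7 + "GGAGCTAAC"

print(longest_run(SHORT_RUN), codons(SHORT_RUN))
print(longest_run(LONG_RUN), codons(LONG_RUN))
print(first_divergent_codon(SHORT_RUN, LONG_RUN))
```

In this toy case the two readings agree up to the end of the shared part of the run and differ from the next codon onward, which is why a single-base difference in a homopolymer can produce an apparently truncated or extended open reading frame. The sketch says nothing about which variant is present in any real animal.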
Second, we further examined polymorphism in the Pleurobrachia mtDNA genome by cloning the region of interest from ten individual P. bachei and sequencing it with Sanger technology. Six of the ten animals (60%) had the extra T (Fig. 1a and b). Thus, we conclude that this deletion/insertion of a T is not erroneous sequencing, as suggested 7 , but a reflection of high polymorphism within the species. Next, we checked available ctenophore transcriptomes for a deletion/insertion of a T at a similar position in the cox1 gene. Both Mnemiopsis leidyi and Vallicula multiformis had sequences showing a T deletion (Fig. 2) at a position similar to that in the P. bachei cox1 gene.
These data on other ctenophore species also suggest high polymorphism in the mitochondrial genomes within the clade (as was correctly captured in the original submission, JN392469.1). We estimated the polymorphism of the Pleurobrachia mitochondrial genome to be approximately 10% in the Friday Harbor populations (Pacific Northwest). These data confirm the need to sequence multiple individual animals collected in their natural habitats.
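The text does not specify how the polymorphism estimate was calculated. As a hedged illustration only, the sketch below shows two simple summaries that are commonly used for this kind of data: the fraction of individuals carrying a longer homopolymer run, and the fraction of variable columns in an alignment. The function names and the toy alignment are assumptions made for this example; only the six-of-ten count comes from the text.

```python
def carrier_fraction(run_lengths, reference_length):
    """Fraction of individuals whose run is longer than the reference run."""
    if not run_lengths:
        raise ValueError("no individuals supplied")
    carriers = sum(1 for n in run_lengths if n > reference_length)
    return carriers / len(run_lengths)


def variable_site_fraction(alignment):
    """Fraction of alignment columns showing more than one state.

    Sequences must be pre-aligned to equal length; a gap character
    counts as a state, so indels contribute to the estimate.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal aligned length")
    variable = sum(1 for column in zip(*alignment) if len(set(column)) > 1)
    return variable / length


# Six of ten individuals with the extra T, as reported in the text.
# The run lengths 7 and 6 are placeholders, not measured values.
runs = [7] * 6 + [6] * 4
print(carrier_fraction(runs, reference_length=6))

# Toy alignment (hypothetical): one variable column out of eight.
toy = ["ACGTACGT", "ACGTACGT", "ACGAACGT"]
print(variable_site_fraction(toy))
```

Other measures, such as nucleotide diversity based on pairwise differences, would give different numbers for the same data, so the choice of summary should be stated alongside any estimate.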

Comments on the codon usage in ctenophore genomes
Considering the suggested rapid evolution of the ctenophore mitochondrial genomes, the correct interpretation of codon usage is also challenging, and no mitochondrial proteomic data are available to date from this animal lineage. The reannotation by Arafat et al. 7 states "P. bachei the codon TGA, which codes for tryptophan in other ctenophores, is reassigned to serine as indicated by Pett and Lavrov 8 ".
The TGA codon in DNA corresponds to UGA in the mRNA that is translated into protein. Earlier, we noted "In most metazoan mitochondrial genomes the UGA codon is a Tryptophan residue (Trp), however the UGA codon is undetermined in P. bachei's mitochondrial genome. Based on alignments with other conserved proteins, UGA has the potential to encode different amino acids (Table 1S and Table 3S in Supplementary Material)" 5 . Thus, the UGA codon was not robustly designated as a tryptophan residue 5 .
In the computational re-analysis of previously published comparative mitochondrial sequences and novel genome data from ctenophores 1,2 , Pett and Lavrov in 2015 8 stated: "the mt-genome from Pl. bachei (NC_016697) using GenDecoder 9 showed that 38% of TGA codons in highly conserved positions correspond to serine". However, in our opinion, a 38% correspondence does not warrant a definitive assignment at this moment. We feel it is best to state that the UGA codon is undetermined in the P. bachei mitochondrial genome 5 and to test this hypothesis experimentally by proteomic approaches using more than one ctenophore species.
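To make the consequence of the alternative assignments concrete, the following minimal sketch translates the same DNA under three readings of TGA (stop, tryptophan, serine) and tallies which residue most often aligns with TGA positions. It is not a re-implementation of GenDecoder. The base table is the standard genetic code with only TGA overridden, which is a simplifying assumption, and the input sequence and the list of aligned residues are invented for illustration.

```python
from collections import Counter

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    b1 + b2 + b3: STANDARD_AA[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, tga="*"):
    """Translate DNA from position 0, reading TGA as the given symbol."""
    table = dict(STANDARD_CODE, TGA=tga)
    dna = dna.upper()
    # Codons containing characters other than T, C, A, G become 'X'.
    return "".join(table.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def top_residue_share(residues):
    """Most frequent residue aligned to TGA positions and its share."""
    counts = Counter(residues)
    residue, n = counts.most_common(1)[0]
    return residue, n / len(residues)


toy_dna = "ATGTGATTT"  # hypothetical sequence containing one TGA codon
for symbol in ("*", "W", "S"):
    print(symbol, translate(toy_dna, tga=symbol))

# Hypothetical residues seen in other species at aligned TGA positions.
print(top_residue_share(list("SSSWWGAL")))
```

In the toy tally the most frequent residue accounts for well under half of the positions, which illustrates why a plurality of this size may be read as suggestive rather than decisive.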
Finally, in the analyses of Platyctenida mitochondrial genomes 7 , it was indicated that "the true nad3 was not identified by us", and the authors go on to state that "the nad3 gene shares the same translation frame as the downstream nad4 gene and possesses an incomplete stop codon". However, no phylogenetic analysis of Nad3 was presented. In essence, predicting putative proteins with no unique start codon and no stop codon is also challenging without proteomic data in such enigmatic animals as ctenophores.
Conclusion. We conclude that the original annotation of the Pleurobrachia mitochondrial genome (GenBank: JN392469.1) is consistent with the data available today, and that the existing population data suggest both a high level of polymorphism and ongoing rapid evolution within this lineage. Nevertheless, as soon as more comparative and population data become available, the reassignment of codon usage in Ctenophora should be performed and further validated by experimental proteomic approaches.