Studying the dawn of de novo gene emergence in mice reveals fast integration of new genes into functional networks

The de novo emergence of new transcripts has been well documented through genomic analyses. However, a functional analysis, especially of very young protein-coding genes, is still largely lacking. Here we focus on three loci that have evolved from previously intergenic sequences in the house mouse (Mus musculus) and are not present in its closest relatives. We have obtained knockouts and analyzed their phenotypes, including a deep transcriptomic analysis, based on a dedicated power analysis. We show that the transcriptional networks are significantly disturbed in the knockouts and that all three genes have effects on phenotypes that are related to their expression patterns. This includes behavioral effects, skeletal differences and the regulation of the reproduction cycle in females. Substitution analysis suggests that all three genes have directly obtained an activity, without new adaptive substitutions. Our findings support the hypothesis that de novo genes can quickly adopt functions without extensive adaptation. Impact statement New protein-coding genes emerging out of non-coding sequences can become directly functional without signatures of adaptive protein changes


Introduction 33
The evolution of new genes through duplication-divergence processes is well understood ( Reinhardt et al., 2013). However, in each of these cases, the genes were already relatively 50 old, especially when taking the short generation times of these organisms into account. The most details 51 for a very recent de novo evolved gene are so far available for Pldi in mice, which emerged 2.5-3.5 52 million years ago. In this case the knockout was shown to affect sperm motility and testis weight. But 53 Pldi codes for a long non-coding RNA, not for a protein (Heinen, Staubach, Haming, & Tautz, 2009). 54 Here, we focus on protein coding genes that have emerged less than 1.5 million years ago in the lineage 55 towards the house mouse (Mus musculus). 56 Expression of these genes is found throughout all tissues analyzed, with notable differences. Testis and 109 brain express the highest fraction, while the digestive system and liver express the lowest fraction ( Figure  110 1A). Expression levels of these genes are generally lower than those of other genes (FPKM medians: 0.63 111 vs. 8.18; two-tailed Wilcoxon rank sum test, P-Value < 2.2 × 10 -16 ; Figure 1C). Most overall molecular 112 patterns are similar to previous findings (Neme & Tautz, 2013; Schmitz, Ullrich, & Bornberg-Bauer, 113 2018). They have fewer exons (medians: 2 vs. 7; two-tailed Wilcoxon rank sum test, P-Value < 2.2 × 10 -114 16 ) and fewer coding exons than other genes (medians: 1 vs. 6; two-tailed Wilcoxon rank sum test, P-115 Value < 2.2 × 10 -16 ). The lengths of their proteins are shorter than those of other proteins (medians: 125 116 vs. 397; two tailed Wilcoxon rank sum test, P-Value < 2.2 × 10 -16 ). However, their proteins are predicted 117 to be less disordered than other proteins (medians: 0.20 vs. 0.27; two-tailed Wilcoxon rank sum test, P-118 Value = 0.0024; Figure 1D) and equally hydrophobic to other proteins (medians: 0.56 vs. 0.57; two-tailed 119 Wilcoxon rank sum test, P-Value = 0.52; Figure 1E). Note that the two sets of values show a broad 120 distribution. 121 values for the three genes studied here (see Table 1) are indicated in the three violin plots. 134

Genes for functional analyses 136
We selected three genes from the above list for in-depth analyses, including knockouts, transcriptomic 137 studies and phenotyping (Table 1). For convenience we will call these genes in the following "Unnamed 138 de novo genes -Udng", i.e., Udng1, Udng2, and Udng3, but note that we propose new formal names in 139 the discussion. The criteria for selecting these three genes were as follows: (i) they have clear 140 transcriptional expression evidence, (ii) have at least two exons, (iii) their translation is supported by 141 ribosome profiling and/or proteomic evidence and (iv) they are specific to the M. musculus lineage, i.e., 142 have emerged less than 1.5 million years ago (see below). Further, they cover also a range from low to 143 high intrinsic structural disorder scores and hydrophobicities, as well as lower to higher expression levels 144 Udng1 shows a relatively high expression (up to FPKM 3) in multiple tissues, with the highest in brain 150 tissues at different stages as well as in embryonic limbs ( Figure 1B; Figure 1-figure supplement 1). 151 Udng2 shows on average a lower expression (up to FPKM 0.6), also mostly in brain tissues at different 152 stages ( Figure 1B; Figure 1-figure supplement 1). Udng3 is only expressed in two tissues, the ovary of 8 153 weeks old females (FPKM 0.135), as well as the subcutaneous adipose tissue of 8 weeks old animals 154 (FPKM 0.115) ( Figure 1B). Given that the ovary is a very small organ, with closely attached tissues, such 155 as oviduct and gonadal fat pad, there could be contamination between these different tissue types. Hence, 156 we were interested whether there is specificity for one of them. We used RT-PCR for the respective 157 carefully prepared tissue samples for Udng3 and a control gene (Uba1) and found that Udng3 is not 158 expressed in the ovary, but predominantly in the oviduct with only a weak signal from the adjacent fat 159 pad (Table 1-supplement 1). 160 161 Evolutionary emergence of the three candidate genes 162 We used whole genome sequencing data (Harr et al., 2016) and Sanger sequencing data of PCR fragments 163 from mouse populations, subspecies and related species to trace the emergence of the ORFs for the three 164 candidate genes. We found the respective genomic regions covering the ORFs in all species analyzed, 165 which include the wood mouse Apodemus that has split from the Mus lineage about 10 million years ago. 166 However, in these more distant species, the reading frames are interrupted by early stop codons and/or 167 non-frame indels. Full reading frames were only found in populations and subspecies of M. musculus, but show on average more substitutions, if evolution was accelerated due to positive selection after the 202 acquisition of the ORF. However, we find that this is not the case, the observed number of substitutions is 203 very similar between both groups (Table 2). However, we noted that Udng3 shows more substitutions for 204 both groups. To obtain an estimate for the expected number of substitutions, we have used the average 205 distances between the taxa derived from whole genome comparisons. These should reflect approximately 206 the neutral rates, given that most of the genome is not expected to be subject to evolutionary constraints. 207 The results are also provided in Table 2 (the full matrix of pairwise differences is included in Figure 2 -208 figure supplement 2). We find that Udng1 and Udng2 evolve at the expected average rate while Udng3 is 209 indeed faster than expected. Still, when testing observed versus expected values between each group for 210 each locus, we find that none of them is significant (Table 2). Hence, in spite of the region specific rate

Generation of gene knockouts and power analysis 221
For the further functional characterization of the three genes, we obtained knockout lines. Udng1 and 222 Udng2 represent constructs in which all or most of the ORFs were substituted by lacZ, Udng3 was 223 generated by creating a frame shift in the ORF through CRISPR/Cas9 mutagenesis. All three lines were 224 homozygous viable and showed only subtle phenotypes (further details below). We were therefore 225 interested in studying their impact on the transcriptional network in the tissues in which they are 226 predominantly expressed. Given the recent evolution of the genes, one would expect only a small 227 influence. Hence, we first did a power analysis to get an estimate on how deeply we can trace changes in 228 the networks. 229

230
Several conditions have to be considered for such a power analysis. When using RNA-Seq read count 231 (fragment count for paired-end sequencing) data, we assume (1) read counts follow a negative binomial 232 distribution; (2) all samples are sequenced at the same depth; (3) significance level after Bonferroni 233 adjusted is 0.05 and in total 15,000 genes are tested, i.e., the significance level before adjustment is 3.3 × 234 10 -6 . The power to detect a differentially expressed gene can then be estimated by the given (1) sample 235 size, (2) fold change between knockouts and wildtypes, (3) average read count, and (4) dispersion, which 236 is the measurement of biological and technical variance considering the effect of mean read count ( Figure  237 3A). Based on this a priori analysis, we used at least 10 biological replicates of knockouts and wildtypes, 238 performed deep sequencing and minimized variance by using standardized rearing conditions for the mice, 239 as well as standardized and parallel preparation and sequencing procedures. Under these conditions, it is 240 expected to be possible to detect significant differences even when the fold-changes are as low as 1.05 to 241 1.25. We found that these expectations fitted well with our real data described below ( Figure 3B   In the light of these controls, we conclude that the effects shown for the knockouts in the following can 281 indeed be ascribed to the knockouts themselves, rather than a confounding factor. We describe the results 282 for each gene in turn. 2). Interestingly, we found only one differentially expressed gene between females, Udng1 itself (DESeq2, 298 adjusted P-Value ≤ 0.01). This can be ascribed to a higher dispersion in the female samples ( Figure 3C), 299 which results in a loss of power. The reason for the higher dispersion in females in these samples is 300 currently unclear. Functional enrichment analysis of the 1,718 differentially expressed genes (except for 301 Udng1 itself) in males revealed 501 distinct Gene Ontology functional terms and 137 distinct pathways 302 (KOBAS, corrected P-Value ≤ 0.05; Table 1 -supplement 2). 303 RNA was also obtained from 10 to 14 12.5-day embryos of the four sex (female or male) and genotype 305 (homozygous knockout or wildtype) combinations. On average, we could map 67.1 million unique reads 306 per sample (range from 36.9 to 92.7 million reads; Figure  The relatively high expression of Udng1 in the CNS and the RNA-Seq results of the heads of postnatal 320 pups indicate that it may have an effect on the behavior of the mice. We performed three standardized 321 behavioral tests: elevated plus maze, open field, and novel object to test this possibility. We found a 322 significant difference for the open field test with respect to total distance moved (nested ranks test, P-323 Value = 0.0023; Table 3; full data in Table 3 -supplement 1). 324

325
Given that Udng1 is also expressed in limbs, we asked whether there would also be differences in limb 326 morphology. We scanned the skeletons of the respective wildtype and knockout mice and analyzed their 327 bone lengths, following the procedures described in (Skrabar, Turner, Pallares, Harr, & Tautz, 2018). We 328 found that the knockout mice had significantly longer metatarsals (two-tailed Wilcoxon rank sum test, P-Value = 0.020) and significantly shorter metacarpals (two-tailed Wilcoxon rank sum test, P-Value = 330 0.043), and in tendency also longer tibias (Table 3; full data in Table 3 -supplement 1) 331 332 This raises the question whether the limb length phenotype could cause the "distance moved" phenotype 342 in the open field test (see above). However, given that "distance moved" was also recorded in the novel 343 object test and showed no significant difference between WT and KO (see also discussion), we do not 344 consider the small differences in limb length elements as factors that would impair movement. Hence, it is 345 more likely that these phenotypes are independent of each other and relate to the different expression 346 aspects in limbs and brains. 347 The RNA-Seq results of the heads of postnatal pups indicate that Udng2 may be involved in mouse 367 behavior too. We performed the same four behavioral tests as for Udng1. We found significant effects in 368 the elevated plus maze test (Table 3 and Table 3 -supplement 1), but note that only fewer animals were 369 available for the other tests. We found that knockout males stayed shorter in the center (nested ranks test,  (Table 3).

Udng3 knockout effect on the transcriptome 375
The Udng3 knockout line was generated using CRISPR/Cas9 mutagenesis in a laboratory strain that is 376 hierarchical clustering methods, which allowed to distinguish three major clusters ( Figures 4A and 4B). 391 To confirm that these correspond to three different phases of the estrous cycle, we analyzed the 392 expression of three known cycle dependent genes in the respective clusters, progesterone receptor (Pgr) 393 and estrogen receptors (Esr1 and Gper1). We found that these genes change indeed in the expected 394 directions, both in the wildtype as well as the knockout animals ( Figures 4C-E). Based on this finding, we 395 performed the differential expression analysis on the three clusters separately. We found 21 differentially 396 expressed genes in cluster 1 (DESeq2, adjusted P-Value ≤ 0.01; fold changes range from 0.75 to 1.59; 397 Given that the Dcpp genes are more highly expressed in Udng3 knockouts, one could predict a higher 409 implantation frequency of embryos, as it has been shown through experimental manipulation of Dcpp 410 levels (Lee et al., 2006). We assessed the litters of pairs that were produced during our breeding 411 experiments and found that the first litters from homozygous knockout females were produced after the 412 same time as those from wildtype or heterozygous females (medians: 23 vs. 22 days; Table 3 -413 supplement 1), while we found that the second litters from homozygous knockout females were produced 414 faster than those from wildtype or heterozygous females (medians: 23 vs. 38 days;

Functional de novo gene emergence 459
It has long been assumed that the emergence of function out of non-coding DNA regions must be rare, 460 and if it occurs, the resulting genes would be far away from assuming a function. Our results do not 461 support these assumptions. It is easy to find many well supported transcripts that could be considered to 462 be true de novo genes. And three out of three chosen such genes can be shown to have functions. Hence, 463 it would seem likely that most of the candidate genes in our curated list contribute aspects to the 464 phenotype. Further, the fact that we neither observe patterns of ongoing positive selection, nor 465 specifically accelerated evolution around these genes, suggests that they did not need additional We note also that an experiment that has expressed random peptides in plant (Arabidopsis) had a very 478 high success rate of identifying associated phenotypes (Bao, Clancy, Carvalho, Elliott, & Folta, 2017). 479 One of the peptides that were functionally studied by these authors mediates an early flowering phenotype, 480 which would self-evidently be a possible function for an ecological adaptation. 481 482 483

Transcriptome changes and phenotypes 485
The fact that we see the disturbance of a whole transcriptomic network in the knockouts should of course 486 not be interpreted to mean that the new genes interact directly with all of these other genes. We expect 487 that even a single or a few interactions with other genes that are already part of a network could trigger 488 this. Since our experimental design allowed a very high sensitivity to detect this, we were able to see the 489 disturbance of many further interacting genes. We emphasize that the power of our analysis is much 490 higher than in most transcriptomic studies, i.e., we can see effects that would otherwise not be noted. 491 For Udng2 / Lzhz the disturbed network has some functional coherence (extracellular matrix or cell 492 motility functions), while the Udng1 / Swk knockout results in rather broad effects. The fact that much 493 fewer gene expression changes are seen for Udng3 / Shj can be explained by the reduced power that we 494 had in this experiment, due to the need to separate the data into three clusters. Similarly, the differences 495 between females and males in the postnatal samples may be entirely due to different dispersions, rather 496 than to sex-specific effects. But this question will need further study. 497 498

Phenotype changes 499
None of the three knockout lines showed an overt phenotype, but we considered this also as a priori 500 unlikely, given that a de novo evolved gene is expected to be only added to an existing network of genes. 501 However, given the observed transcriptome changes, we were encouraged to apply a small set of 502 phenotypic tests, relating to the respective major expression patterns of the genes. However, we consider 503 the results from these tests only as preliminary at this stage. The behavioral tests in particular could be 504 influenced by a variety of factors and would need repetition in much larger numbers. For example, the 505 fact that "total distance" moved was measured in two behavioral tests (open field and novel object tests), 506 but showed a significant difference in only one of the tests for Udng1 / Swk suggests a higher complexity. 507 But at least the tendency was the same in both tests (shorter distance in knockouts). Still, we decided to 508 not extend these tests for a larger number of Udng2 mice. 509 For Udng3 / Shj we identified a possible direct link between the identified phenotype of a shorter 510 gestation length in the knockouts and the transcriptomic changes. We found that the expression level of 511 all three copies of Dcpp genes in C57BL/6N mice are enhanced in the Udng3 / Shj knockout animals. 512 Dcpp expression is induced in the oviduct by pre-implantation embryos and is then secreted into the 513 oviduct. This in turn stimulates the further maturation of the embryos and eventually the implantation 514

Genome-wide identification of de novo genes 542
We modified previous phylostratigraphy and synteny-based methods to identify Mus-specific de novo 543 protein-coding genes from intergenic regions. We started with mouse proteins annotated in Ensembl 544 we first used NCBI BLASTP (2.5.0+) to align low complexity region masked mouse protein sequences to 548 rat protein sequences annotated in Ensembl (Version 80) and filtered out the mouse sequences having hits 549 with E-values smaller than 110 -7 . This removes all conserved genes. Next we used NCBI BLASTP 550 (2.5.0+) to align the remaining low complexity region masked sequences to NCBI nr protein sequences 551 with E-values smaller than 110 -3 according to (Neme & Tautz, 2013). The genes remaining after these 553 filtering steps are the candidates for the de novo evolved genes. In order to deal also with proteins having 554 low complexity regions, we further applied a synteny-based strategy on the rest proteins by taking 555 advantage of the Chain annotation from Comparative Genomics of UCSC Genome Browser 556 ("http://genome.ucsc.edu/") (Kent et al., 2002). We filtered out the proteins encoded on unassembled 557 scaffolds because their chromosome information is not compatible between Ensembl and UCSC 558 annotations. We only compared rat and human proteins with mouse proteins because their genomes are 559 well assembled and genes are well annotated. We performed the same procedures on rat and human data 560 separately, and used "mm10.rn5.all.chain" and "rn5ToRn6.over.chain" from UCSC and gene annotation 561 from Ensembl (Version 80) for rat, and "mm10.hg38.all.chain" from UCSC and gene annotation from 562   performed the same procedures on PRIDE and PeptideAtlas data separately following the method 605 described in (Xie et al., 2012). In brief, if the whole sequence of a peptide was identical to one fragment 606 of the tested de novo protein sequence, and had at least two amino acids difference compared to all the 607 fragments of other protein sequences in the mouse genome, the peptide was considered to be convincing 608 evidence for the translational expression of the respective de novo protein.        Specifically, we used the est_power function, and set parameters w (ratio of normalization factors 686 between two groups) as 1, alpha (significance level) as 3.3 × 10 -6 . Then we traversed all 98,670 possible 687 combinations of N (sample size) from 3 to 13, rho (fold change) from 1.05 to 1.5, lambda0 (read count) 688 from 4 to 65,536, and phi0 (dispersion) from 0.00025 to 1.024 to calculate the power values. 689 To calculate the power of each gene in each of our real RNA-Seq datasets, we also used the est_power 690 function with the parameters w as 1, alpha as 3.3 × 10 -6 , and n as the real sample size, and rho 691 (2 |log2FoldChange| ), lambda0 (baseMean), and phi0 (dispersion) estimated by DESeq2 (1.14.1) (

RNA-Seq and data analysis 719
The heads of postnatal 0.5-day Udng1 and Udng2 pups, the 12.5-day Udng1 embryos, and the oviducts of 720 10-11 weeks old Udng3 females were carefully collected and immediately frozen with liquid nitrogen. 721 Then, for all these samples, total RNAs were purified using QIAGEN RNeasy Microarray Tissue Mini 722 Kit (Catalog no. 73304). All RNA samples, including the total RNAs purified from the transfected MEF

Behavioral tests 768
The following behavioral tests were performed on the Udng1 and Udng2 mice used in this study: elevated 769 plus maze test, open field test and novel object test. All tests were recorded on video using a VK-13165 770 Eneo camera mounted directly above the experimental set-up and behaviors were measured using 771 VideoMot2 (TSE Systems). All tests were filmed in the same room under similar lighting conditions (less 772 than 200 lux). All lights faced the ceiling in order to avoid any glare or reflections within the test arenas. 773 For the elevated plus maze we used an arena that was designed for testing wild mice. It was constructed 774 as two perpendicular arms using PVC plastic and acrylic glass, and was 80 cm above ground. The dark 775 arms of the maze were made with grey PVC plastic sides, with a white PVC plastic bottom. The dark 776 arms were 50 cm long, 10 cm wide and 40 cm deep. Open arms had same dimensions, except that the 777 walls were made of acrylic glass instead of grey plastic. For testing, each mouse was placed at the center 778 of the arena at the beginning of the test using a transparent plastic transfer pipe. Mice were filmed inside 779 the test arena for 5 minutes (Holmes, Parmigiani, Ferrari, Palanza, & Rodgers, 2000). VideoMot2 (TSE 780 Systems) was used to measure the time which the mouse spent in the dark arm, the light arm, and the 781 center of the maze. After each experiment, the test arena was cleaned with 30% ethanol. 782 The open field arena was made of white PVC plastic and measured 60 x 60 cm, and the walls were 60 cm 783 high. The arena was placed directly beneath a security camera and measurements were taken using 784 VideoMot2 (TSE Systems). At the beginning of the experiment, the mouse was placed at the center of the 785 arena using a transparent plastic transfer pipe. Each mouse was filmed for 5 minutes. The novel object test was carried out in the same arena as the open field test. The arena was placed 790 directly beneath a security camera and measurements were taken using VideoMot2 (TSE Systems). At the 791 beginning of the experiment, the mouse was placed at the center of the arena using a transparent plastic 792 transfer pipe along with a toy made of colored building blocks (Lego). Each mouse was filmed for 5 793 minutes. Measurements taken during the novel object test included the latency to investigate the novel 794 object, the number of visits to the novel object, and the distance travel during the experiment. The number 795 of visits to the novel object was accessed based on visits to an area of 7.5 cm around the novel object 796 (Yuen et al., 2016). After each experiment, the test arena and novel object were cleaned with 30% ethanol. 797 All the measured Udng1 and Udng2 mice are adult males. They were genotyped in advance, matched 798 between knockouts and wildtypes. The genotypes were then masked to the experimenter. Their ages were 799 from 11 to 17 weeks old for Udng1 and from 15 to 25 weeks old for Udng2. Each mouse stayed alone in 800 the cage in a room with only male mice at least two weeks before measurements. 40