Abstract
Organ development is regulated by complex interactions of multiple regulatory pathways. These pathways (Wnt, Tgfβ, Fgf, Hh, Eda, Notch) are becoming increasingly well known, with many identified genes having well-characterized effects on the phenotype. We classify genes required for normal organogenesis into different categories that range from essential to subtle modification of the phenotype. We focus on mouse tooth development, in which over 70 genes are known to be required for normal odontogenesis. These genes were classified into progression, shape, and tissue categories based on whether their null mutations cause early developmental arrests, altered morphologies, or hard tissue defects, respectively. Collectively, we call these the developmental keystone genes. Additionally, we identified 100 developmental genes with no phenotypic effects on molars when null mutated, thereby providing the means to contrast expression dynamics between keystone and non-keystone genes. Transcriptome profiling using microarray and RNAseq analyses of patterning stage mouse molars shows elevated expression levels for progression and shape genes, the former category showing the most significant upregulation. Single-cell RNAseq analyses reveal that even though the size of the expression domain, measured in number of cells, is the main driver of organ-level expression, the progression genes show high cell-level transcript abundances. In contrast, a high proportion of the shape genes are secreted ligands that are found to be expressed in fewer cells than their receptors and intracellular components. Overall, we postulate that genes essential for the progression of organ patterning are characterized by high levels of expression, whereas fine-tuning of the pattern is more dependent on spatially restricted production of ligands.
The combination of phenotypically defined gene categories and transcriptomes allows the characterization of the expression dynamics underlying different aspects of organogenesis.
Introduction
Much of the functional evidence for the roles of developmental genes comes from natural mutants or experiments in which the activity of a gene is altered. Most often these experiments involve deactivation, or null mutations, where the production of a specific gene product is prevented altogether. In the cases where development of an organism is arrested altogether, the specific gene is considered to be absolutely required or essential for development (1, 2). Through a large number of experiments in different organisms, an increasingly nuanced view of developmental regulation has emerged showing that some genes appear to be absolutely required, whereas others may cause milder effects on the phenotype (3, 4). Yet, there are a large number of genes that, despite being dynamically regulated during individual organ development, have no detectable phenotypic effect when null mutated.
Within the framework of distinct phenotypic outcomes of gene deactivation, one can argue for a gradation from developmentally ‘more essential’ to ‘less essential’ genes. Collectively, these can be considered to be analogous to the keystone species concept used in ecology (5). These genes, which can be called ‘developmental keystone genes’, are not necessarily essential for development. Rather, compared to all the genes, developmental keystone genes exert a disproportional effect on the phenotype. To be operational, developmental keystone genes are defined within an organ (or individual) of interest. This organ focus also means that genes that have no effects in one organ may be critical for the development of another organ. Therefore, in the context of evolution, both the organ-specific regulation and protein function would be the targets of natural selection on keystone genes.
As large-scale analyses of transcriptomes result in expression profiles of thousands of genes, it is now possible to address whether there might be any systematic differences between the regulation of keystone and other genes during organogenesis. Here we address such differences using the mammalian tooth. The development of the mouse molar is especially well characterized, with over 70 genes that are known to be individually required for normal tooth development (6, 7). The detailed effects of null mutations of these genes are also well characterized, ranging from a complete developmental arrest to relatively mild modifications of morphology, or defects in the mineralized hard tissue (6, 7).
Classification of developmental keystone genes
Here our main focus is on a critical step in tooth development, namely the formation of the cap stage tooth germ (Fig. 1). At this stage the patterning of the tooth crown begins, and the effects of experimental modifications in several signaling pathways first manifest themselves around this time of development (8). To classify the studied genes, we divide them into four different categories. The first category is the progression category, containing genes that cause a developmental arrest of the tooth when null mutated (Fig. 1, genes with references in Appendix S1). The second set of genes belongs to the shape category and they alter the morphology of the tooth when null mutated. Unlike in the progression category, many of the tooth modifications caused by shape gene mutations are subtle and leave the teeth functional, hence these genes are not strictly essential for tooth development. The third category is the tissue category and null mutations in these genes cause defects in the tooth hard tissues, enamel and dentine. Both the progression and shape categories include genes that are required for normal cap-stage formation. In contrast, the tissue category is principally related to the formation of extracellular matrix and these genes are known to be needed later in development (9). Because there is more than a five-day delay from the cap stage to matrix secretion in the mouse molar, here we considered the tissue category as a control for the first two categories. Additionally, we compiled a second control set of developmental genes that, while expressed during tooth development, are reported to lack phenotypic effects when null mutated (Table S1). This ‘non-keystone’ or dispensable category is defined purely within our operational framework of identifiable phenotypic effects and we do not imply that these genes are necessarily unimportant even within the context of tooth development.
Many genes function in concert and the effects of their deletion only manifest when they are mutated in combinations (also known as synthetic mutations). We identified five such redundant pairs of paralogous genes and a single gene whose null phenotype surfaces in the heterozygous background of its paralogue. Altogether these 11 genes were tabulated separately as a double category. In the progression, shape, tissue, and dispensable categories we tabulated 15, 28, 27, and 100 genes, respectively (Fig. 1, Appendix S1, Table S1). While still limited, these genes should represent a robust classification of validated experimental effects. We note that these groupings do not exclude the possibility that a progression gene, for example, can also be required for normal hard tissue formation. Therefore, the keystone gene categories can be considered to reflect the temporal order in which they are first required during odontogenesis.
In addition to the categories studied here, there are genes required for the initiation of tooth development, of which many are also potentially involved in tooth renewal. Because the phenotypic effect of these initiation genes on tooth development precedes the visible morphogenesis, and the phenotype might include complete lack of cells of the odontogenic lineage, we excluded these genes from our analyses. Similarly, we excluded genes preventing tooth eruption with no specific effect on the tooth itself (Appendix S1).
To examine the keystone genes in the context of whole transcriptomes, we compared their expression levels with those of all the developmental-process genes (GO:0032502, ref. 10), as well as with all the other protein coding genes of the mouse genome.
Results
Progression and shape category genes show elevated expression at the onset of tooth patterning
For a robust readout of gene expression profiles, we first obtained gene expression levels using both microarray and RNAseq techniques from E13 (bud stage) and E14 (cap stage) mouse molars (Materials and Methods). From dissected tooth germs we obtained five microarray and seven RNAseq replicates for both developmental stages. The results show that especially the progression category genes are highly expressed during E13 compared to the control gene sets (tissue, dispensable, and developmental-process categories, p values range from 0.0003 to 0.0416 for RNAseq and microarray experiments, tested using random resampling, for details and all the tests, see Materials and Methods, Fig. 2, Tables S2, S3). Comparable differences are observed in E14 molars (p values range from 0.0000 to 0.05, Fig. 2, Tables S2, S3).
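The random resampling comparison mentioned above can be illustrated with a minimal sketch. It assumes a one-sided test on the median, with gene sets of the same size as the category drawn without replacement from a control set; the function name `resampling_p_value`, the number of iterations, and these choices are illustrative assumptions, not the exact procedure documented in Materials and Methods.

```python
import numpy as np

def resampling_p_value(category_expr, control_expr, n_iter=10000, seed=0):
    """Fraction of random control gene sets whose median expression is at
    least as high as the median of the category of interest.

    Assumptions (hypothetical, for illustration): one-sided test on the
    median, sampling without replacement, sample size equal to the
    number of genes in the category.
    """
    rng = np.random.default_rng(seed)
    category_expr = np.asarray(category_expr, dtype=float)
    control_expr = np.asarray(control_expr, dtype=float)
    observed = np.median(category_expr)
    k = category_expr.size
    hits = 0
    for _ in range(n_iter):
        # Draw a random gene set of the same size from the control genes
        sample = rng.choice(control_expr, size=k, replace=False)
        if np.median(sample) >= observed:
            hits += 1
    return hits / n_iter
```

With this formulation a p value of 0.0000 simply means that none of the random gene sets reached the observed median.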
In general, the expression differences between progression and tissue categories appear greater than between progression and dispensable categories (p values range from 0.0027 to 0.0416 and 0.0055 to 0.05, respectively, Table S3), suggesting that some of the genes in the dispensable category may still play a functional role in tooth development. In our data we have 11 genes that cause a developmental arrest of the tooth when double mutated (Appendix S1). The expression level of this double-mutant category shows incipient upregulation compared to that of the developmental-process category (p values range from 0.0322 to 0.1796, Table S3), but not when compared to the tissue or dispensable categories (p values range from 0.0931 to 0.5007, Table S3). Therefore, it is plausible, based on the comparable expression levels between double and dispensable categories, that many of the genes in the dispensable category may cause phenotypic effects when mutated in pairs.
Even though the shape category expression levels are lower than those of the progression category (Fig. 2), at least the E14 microarray data suggest elevated expression levels relative to all the control categories (p values range from 0.0002 to 0.0383, Table S3). The moderately elevated levels of expression by the shape category genes could indicate that they are required slightly later in development, or that the most robust upregulation happens for genes that are critical for the progression of development. The latter option seems to be supported by an RNAseq analysis of the E16 molar, showing only slight upregulation of shape category genes in the bell stage (Table S3).
One complication of our expression level analyses is that these have been done at the whole organ level. Because many of the genes regulating tooth development are known to have spatially complex expression patterns within the tooth, cell-level examinations are required to decompose the patterns within the tissue.
Single-cell RNAseq reveals cell-level patterns of keystone genes
Tooth development is punctuated by iteratively forming epithelial signaling centers, the enamel knots. The first, primary enamel knot, is active in E14 molar and at this stage many genes are known to have complex expression patterns. Some progression category genes have been reported to be expressed in the enamel knot, whereas others have mesenchymal or complex combinatorial expression patterns (8, 9). To quantify these expression levels at the cell-level, we performed a single-cell RNAseq (scRNAseq) on E14 molars (Material and Methods). We focused on capturing a representative sample of cells by dissociating the tooth germs without cell sorting (n = 4). After data filtering, 7000 to 8811 cells per tooth were retained for the analyses, providing 30930 aggregated cells for a relatively good proxy of the E14 tooth (Material and Methods).
First we examined whether the scRNAseq produces comparable expression levels to our previous analyses. For the comparisons, the gene count values from the cells were summed up and treated as bulk RNAseq data (Fig. 3A, Material and Methods). We analyzed the expression levels of different gene categories as in the bulk data (Fig. 2) and the results show a general agreement between the experiments (Fig. 2, 3B). As in the previous analyses (Table S3), the progression category shows the highest expression levels compared to the control gene sets (p values range from 0.0076 to 0.0309, Table S3). Although the mean expression of the shape category is intermediate between progression and control gene sets, the scRNAseq shape category is not significantly upregulated in the randomization tests (p values range from 0.7825 to 0.9971). This pattern reflects the bulk RNAseq analyses while the microarray data showed stronger upregulation (Fig. 2), suggesting a potentially discernible but subtle upregulation of shape category genes.
Unlike the bulk transcriptome data, the scRNAseq data can be used to quantify the effect of expression domain size. The importance of expression domain size is evident in the scRNAseq data when we calculate the number of cells that express each gene (Material and Methods). The data show that the overall tissue-level gene expression is highly correlated with the cell population size (Fig. 4A). In other words, the size of the expression domain is the key driver of expression levels measured at the whole tissue level.
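Summing single-cell counts into a bulk-like profile can be sketched as follows. This is a minimal illustration assuming a cells-by-genes matrix of UMI counts; the function names are hypothetical, and the counts-per-million scaling is a simple stand-in for illustration only, because the actual normalization was done with DESeq2 together with the bulk samples (Material and Methods).

```python
import numpy as np

def pseudobulk(counts):
    """Sum UMI counts over all cells (rows) to get one total per gene (column)."""
    counts = np.asarray(counts)
    return counts.sum(axis=0)

def counts_per_million(gene_totals):
    """Scale per-gene totals to counts per million.

    Stand-in normalization for illustration; not the DESeq2 procedure
    used in the study.
    """
    gene_totals = np.asarray(gene_totals, dtype=float)
    return gene_totals / gene_totals.sum() * 1e6

# Synthetic toy matrix: 2 cells x 3 genes
toy_counts = np.array([[1, 0, 3],
                       [2, 0, 1]])
toy_totals = pseudobulk(toy_counts)
toy_cpm = counts_per_million(toy_totals)
```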
To examine the cell level patterns further, we calculated the mean transcript abundances for each gene for the cells that express that gene (see Material and Methods). This metric approximates the cell-level upregulation of a particular gene, and is thus independent of the size of the expression domain. We calculated the transcript abundance values for each gene in each cell that expresses progression, shape, tissue, double, and dispensable gene categories. The resulting mean transcript abundances were contrasted to that of the dispensable category (Material and Methods). The results show that the average transcript abundance is high in the progression category whereas the other categories show roughly comparable transcript abundances (Fig. 4B). Considering that the progression category genes have highly heterogeneous expression patterns (e.g., Fig. 4C), their high cell-level transcript abundance (Fig. 4B) is suggestive about the critical role of these genes at the cell level. That is, progression category genes are not only highly expressed at the tissue level because they have broad expression domains, but rather because they are upregulated in individual cells irrespective of domain identity or size. These results suggest that high cell-level transcript abundance is a characteristic feature of genes essential for the progression of tooth patterning. We note that although the dispensable category has several genes showing comparable expression levels with that of the progression category genes at the tissue level (Fig. 2), their cell-level transcript abundances are predominantly low (Fig. 4B).
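The two cell-level quantities used here, the expression domain size and the mean transcript abundance among expressing cells, can be sketched as below. The sketch assumes a cells-by-genes matrix of non-negative normalized expression values; the function names and the toy matrix are hypothetical.

```python
import numpy as np

def expression_domain_size(norm_expr):
    """Number of cells with non-zero expression, per gene (column)."""
    return (np.asarray(norm_expr) > 0).sum(axis=0)

def mean_abundance_in_expressing_cells(norm_expr):
    """Per-gene sum of expression divided by the number of expressing cells.

    Genes that are not detected in any cell get NaN.
    """
    norm_expr = np.asarray(norm_expr, dtype=float)
    n_expressing = (norm_expr > 0).sum(axis=0)
    totals = norm_expr.sum(axis=0)
    result = np.full(norm_expr.shape[1], np.nan)
    np.divide(totals, n_expressing, out=result, where=n_expressing > 0)
    return result

# Synthetic toy data: 3 cells x 3 genes.
# Gene 0 is expressed strongly in one cell, gene 1 weakly in all cells,
# gene 2 is not detected.
toy = np.array([[0.0, 2.0, 0.0],
                [4.0, 2.0, 0.0],
                [0.0, 2.0, 0.0]])
domain = expression_domain_size(toy)
abundance = mean_abundance_in_expressing_cells(toy)
```

In the toy matrix the broadly expressed gene has the larger tissue-level total, while the gene restricted to a single cell has the higher abundance per expressing cell, which is the distinction the two metrics are meant to separate.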
Next we examined more closely the differences between progression and shape category genes, and to what extent the upregulation of the keystone genes reflects the overall expression of the corresponding pathways.
Keystone gene upregulation in the context of their pathways
In our data the developmental-process genes appear to have slightly elevated expression levels compared to the other protein coding genes (Figs 2, 3B), suggesting an expected and general recruitment of the pathways required for organogenesis. To place the progression and shape category genes into the specific context of their corresponding pathways, we investigated whether the pathways implicated in tooth development show elevated expression levels. Six pathways, Fgf, Wnt, Tgfβ, Hedgehog (Hh), Notch, and Ectodysplasin (Eda), contain the majority of progression and shape genes (Materials and Methods). First we used the RNAseq of E14 stage molars to test whether these pathways show elevated expression levels. We manually identified 272 genes belonging to the six pathways (Materials and Methods, Table S4). Comparison of the median expression levels of the six-pathway genes with the developmental-process genes shows that the pathway genes are a highly upregulated set of genes (Fig. 5A, p < 0.0001, random resampling). This difference suggests that the experimentally identified progression and shape genes might be highly expressed partly because they belong to the developmentally upregulated pathways. To specifically test this possibility, we contrasted the expression levels of the progression and shape genes to the genes of their corresponding signaling families.
The 15 progression category genes belong to four signaling families (Wnt, Tgfβ, Fgf, Hh) with 221 genes in our tabulations. Even though these pathways are generally upregulated in the E14 tooth, the median expression level of the progression category is still further elevated (Fig. 5B, p < 0.0001). In contrast, the analyses for the 28 shape category genes and their corresponding pathways (272 genes from Wnt, Tgfβ, Fgf, Hh, Eda, Notch) show comparable expression levels (Fig. 5C, p = 0.5919). Whereas this contrasting pattern between progression and shape genes within their pathways may explain the subtle upregulation of the shape category (Fig. 2), the difference warrants a closer look. Examination of the two gene categories reveals that, compared to the progression category genes, a relatively large proportion of the shape category genes are ligands (36% of shape genes compared to 20% of progression genes, Appendix S1). In our scRNAseq data, ligands generally show smaller expression domains than other genes (roughly by half, Fig. 5D, E), and the low expression of the shape category genes seems to be at least in part driven by the ligands (Fig. 5C, Table S5).
Overall, the upregulation of the keystone genes within their pathways appears to be influenced by the kind of proteins they encode. In this context it is noteworthy that patterning of tooth shape requires spatial regulation of secondary enamel knots and cusps, providing a plausible explanation for the high proportion of genes encoding diffusing ligands in the shape category.
Discussion
Identification and mechanistic characterization of developmentally essential or important genes has motivated a considerable research effort (e.g. 1–4, 6). One general realization has been that despite the large number of genes being dynamically expressed during organogenesis, only a subset appears to have discernible effects on the phenotype. This parallels the keystone species concept used in ecological research (5). Keystone species, which may include relatively few species in a community, are thought to have a disproportionally large influence on their environment. Similarly, developmental keystone genes have disproportionally large effects on the phenotypic outcome of their system. Here we considered keystone genes strictly within a developmental system, and these genes, or more accurately their protein products, can be understood as keystone resources (11) that are required for normal development. We note that whereas keystone genes can also be considered within the context of their effects on ecosystems (12), here we limit the explanatory level to a specific organ system. In our case, the ‘ecosystem’ has been a developing mammalian tooth.
We took advantage of the in-depth knowledge on the details of the phenotypic effects of various developmental genes (Appendix S1). This allowed us to classify genes into different categories that reflect their functional role during organogenesis. Furthermore, a multitude of studies have made it possible to identify a dispensable or non-keystone gene category. Obviously, as in ecological data, our category groupings can be considered a work in progress as new genes and reclassifications are bound to refine the patterns. Nevertheless, our analyses should provide some robust inferences.
Most notably, genes that are individually required for the progression of mouse molar development were found to be highly expressed (Figs 2, 3). These genes were highly expressed even within their pathways (Fig. 5A,B) and had markedly high cell-level transcript abundances (Fig. 4B). The high expression level of these progression category genes may well signify their absolute requirement during the cap stage of tooth development. Indeed, it is typically by this stage that a developmental arrest happens when many of the progression genes are null mutated. Interestingly, mice heterozygous for the null-mutated progression genes appear to have normal teeth (Appendix S1). One hypothesis to be explored is that the high cell-level transcript abundance of the progression category is a form of haplosufficiency in which the developmental system is buffered against mutations affecting one allele. Another possibility, which has some experimental support (13), is that there are regulatory feedbacks that boost gene expression to compensate for a null allele. In contrast to the progression category, gene pairs arresting tooth development as double mutants have relatively low expression levels, perhaps suggesting that many genes in the dispensable category could be redundant to each other.
Evolutionarily, because all the studied progression and shape category genes are involved in the development of multiple organ systems, our results may point to cis-regulatory differences that specifically promote the expression of these genes in an organ specific manner. Consequently, species that are less reliant on teeth (e.g., some seals) or have rudimentary teeth (e.g., baleen whales) can be predicted to have lowered expression levels of the progression genes. At an organism level, our gene categories should not be considered as indicative of having simple effects on individual fitness. For example, in our specific case of teeth, defects on enamel may be more costly for the mammalian parent than, for example, a dominant null mutation causing an arrest of tooth development with comparable defects on other organ systems.
Considering the numerous genes expressed in a developing organ system, our results point to the potential to use cell-level expression levels to identify other genes critical for organogenesis. Here the single-cell transcriptomes provided a more nuanced view into the spatial patterns of the different gene categories than the tissue level transcriptomes alone. In our tabulation over a third of the shape category genes were ligands. Tooth shape patterning involves spatial placement of signaling centers that in turn direct the growth and folding of the tissue (8). The involvement of several secreted ligands in this patterning process, and consequently in the shape category, is likely to reflect the requirement of the developmental machinery to produce functional cusp patterns. These cusp patterns are also a major target of natural selection because evolutionary diversity of mammalian teeth is largely made from different configurations of cusps. At the same time, partly due to ligands having generally more restricted expression domains compared to receptors and intracellular proteins, the shape category expression levels were found to be generally lower than that of the progression category. That ligands tend to have smaller expression domains whereas receptors have broader expression domains for tissue competence has been recognized in many individual studies (e.g. refs 14 and 15, and partly in 16), but our analyses suggest that this is a general principle detectable in large-scale transcriptome data. This pattern is also compatible with the classic concepts of tissue competence and evocators or signals produced by organizers (17). Nevertheless, it remains to be explored how the low signal-competence ratio emerges from highly heterogeneous expression domains of various genes. In our data, at least the ligand Eda is expressed in a larger number of cells than its receptor Edar, suggesting that there are individual exceptions to the general pattern. 
Another potentially interesting observation is that Sostdc1 and Fst, both secreted sequesters or inhibitors of signaling, were among the most broadly expressed of the ligands. Thus, at least some of the exceptions to the low signal-competence ratio may be modulators of tissue competence.
In conclusion, genes critical for tooth patterning are highly expressed, but the level of upregulation is influenced by the kind of proteins they encode. Combining phenotypic classifications of the roles of genes in development with transcriptomes enables new ways to integrate experimental data with development. With advances in the analyses of transcriptomes and gene regulation, it will be possible to explore experimental data from other organs and species to test and identify system level principles of organogenesis.
Material and Methods
Classification of genes
We performed an extensive literature review of genes that are expressed in the developing tooth. The genes were divided into categories based on the effect that their null mutation has on the development of the first mandibular molar in the mouse. Full null-mutant mouse information was used whenever available. Because many developmental genes function in multiple organs and stages during development, full null mutants of several genes are lethal before tooth development even begins. Therefore, we also used information based on the tooth phenotypes of conditional mutant mice (in eight cases, Appendix S1). The effect of conditional mutants can be milder and in our data we have four shape genes that could potentially be in the progression category. To test the overall robustness of the patterns, we also analyzed the expression levels after combining the progression and shape category genes, and the pattern of overall upregulation remained largely the same.
Genes that are individually indispensable for normal tooth development were classified as keystone genes. Conversely, genes whose loss-of-function mutations have no effect on tooth development were classified as non-keystone genes. The tabulation of these dispensable genes was done from published reports and by inspecting published figures and data. Keystone genes were further divided into four categories based on the type of effect their loss of function has on tooth development. The progression category is defined by a null mutant phenotype that is a developmental arrest of the tooth. The shape category has the genes whose null mutations alter the morphology (or shape) of the tooth. The tissue category contains the genes whose null mutations cause defects in the tooth hard tissues. Only the progression genes can be considered essential for organ development per se, whereas the effects of many of the shape category genes can be quite subtle.
We created a manually curated list of genes in the six key pathways (Wnt, Tgfβ, Fgf, Hh, Eda, Notch) based on review publications and allocated the keystone genes into these pathways where appropriate. The keystone genes, non-keystone genes, and the pathway genes were also classified as ‘ligand’ (signal), ‘receptor’, ‘intracellular molecule’, ‘transcription factor’ or ‘other’. Because these kinds of classifications are not always trivial, as some biological molecules have multiple functions in the cell, we used the inferred primary role in teeth. The developmental-process genes with GO term “GO:0032502” and experimental evidence codes were obtained from the R package “org.Mm.eg.db” (18). Only curated RefSeq genes were used in this study. All tabulations are in Appendix S1 and Tables S1 and S4.
Ethics statement
All mouse studies were approved and carried out in accordance with the guidelines of the Finnish national animal experimentation board under licenses KEK16-021, ESAVI/2984/04.10.07/2014 and ESAV/2363/04.10.07/2017.
Dissection of teeth
Wild type tooth germs were dissected from embryonic stages corresponding to E13, E14 and E16 molar development. For bulk and single-cell RNAseq we used C57BL/6JOlaHsd mice, and for microarray we used NMRI mice. A minimal amount of surrounding tissue was left around the tooth germ, while making sure that the tooth was not damaged in the process. The tissue was immediately stored in RNAlater (Qiagen GmbH, Hilden, Germany) for RNAseq or in TRI Reagent (Merck, Darmstadt, Germany) at −80°C for microarray. For microarray, a few tooth germs were pooled for each sample and five biological replicates were made. For RNAseq, each tooth was handled individually and seven biological replicates were made. Numbers of left and right teeth were balanced.
RNA extraction
The tooth germ was homogenised in TRI Reagent (Merck, Darmstadt, Germany) using a Precellys 24 homogenizer (Bertin Instruments, Montigny-le-Bretonneux, France). The RNA was extracted by the guanidinium thiocyanate-phenol-chloroform method and then further purified with the RNeasy Plus micro kit (Qiagen GmbH, Hilden, Germany) according to the manufacturer’s instructions. The RNA quality was assessed for some samples with a 2100 Bioanalyzer (Agilent, Santa Clara, CA) and all the RIN values were above 9. The purity of RNA was analysed with a Nanodrop microvolume spectrophotometer (ThermoFisher Scientific, Waltham, USA). RNA concentration was measured with a Qubit 3.0 Fluorometer (ThermoFisher Scientific, Waltham, USA). The cDNA libraries were prepared with the Ovation Mouse Universal RNAseq System (Tecan, Zürich, Switzerland).
Bulk RNA expression analysis
Gene expression levels were measured using both microarray (Affymetrix Mouse Exon Array 1.0, GPL6096) and RNAseq (platform GPL19057, Illumina NextSeq 500). The microarray gene signals were normalized with the aroma.affymetrix package (19) using the Brainarray custom CDF (Version 23, released on Aug 12, 2019) (20). The RNAseq reads (84 bp) were evaluated and low-quality reads were filtered out using FastQC (21), AfterQC (22) and Trimmomatic (23). This resulted in an average of 63 million reads per sample. The retained reads were then aligned with STAR (24) to the mouse genome (GRCm38/mm10, Ensembl release 90, August 2017), and reads were counted for each gene with HTSeq (25). On average, 85% of reads were uniquely mapped to the genome.
Single cell RNA sequencing
Each tooth was processed individually in the single-cell dissociation. The tooth germ was treated with 0.1 mg/ml liberase (Roche, Basel, Switzerland) in Dulbecco’s solution for 15 min at 28°C with shaking at 300 rpm, followed by gentle pipetting to detach the mesenchymal cells. Then the tissue preparation was treated with TrypLE Select (Life Technologies, Waltham, USA) for 15 min at 28°C with shaking at 300 rpm, followed by gentle pipetting to detach the epithelial cells. The cells were washed once in PBS with 0.04% BSA and resuspended in 50 µl PBS with 0.04% BSA. We used the Chromium single cell 3’ library & gel bead Kit v3 (10x Genomics, Pleasanton, USA). In short, all samples and reagents were prepared and loaded into the chip. Then, the Chromium controller was used for droplet generation. Reverse transcription was conducted in the droplets. cDNA was recovered through demulsification and bead purification. Pre-amplified cDNA was further subjected to library preparation. Libraries were sequenced on an Illumina Novaseq 6000 (Illumina, San Diego, USA). All the sequencing data are available in GEO under the accession number (<accession numbers go here>).
Data analysis
For scRNAseq, 10x Genomics Cell Ranger v3.0.1 pipelines were used for data processing and analysis. The “cellranger mkfastq” command was used to produce fastq files and “cellranger count” to perform alignment, filtering and UMI counting. Alignment was done against the mouse genome GRCm38/mm10. The resulting individual count data were then aggregated with “cellranger aggr”. The filtered aggregated feature-barcode matrix was then checked for quality and normalized using the R package Seurat (26). Only cells with ≥ 20 genes and genes expressed in at least 3 cells were considered for the downstream analyses. For a robust set of cells for the expression level calculations, we limited the analyses to 30930 cells that had transcripts from 3000 to 9000 genes (7000 to 180000 unique molecular identifiers) with less than 10% of the transcripts being mitochondrial. For comparison with bulk RNAseq data (Fig. 3), single-cell data were normalized with DESeq2 (27) together with the corresponding bulk RNAseq samples, and median expression levels were plotted. The average cell-level expression (Fig. 4B) of a gene X was calculated as

$$\bar{N}_X = \frac{\sum_{k} N_{Xk}}{\left|\{k : N_{Xk} > 0\}\right|}$$

where $N_{Xk}$ is the normalized expression of gene X in cell k and the denominator is the count of cells with non-zero reads. All statistical tests corresponding to Tables S3 and S4 were performed using the R package “rcompanion” (28) and custom R scripts.
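The cell-level quality thresholds described above can be expressed as a simple boolean filter. The following is a minimal NumPy sketch rather than the Seurat-based workflow used in the study; the function name `filter_cells`, the inclusive treatment of the gene and UMI bounds, and the representation of mitochondrial genes as a boolean mask are assumptions made for illustration.

```python
import numpy as np

def filter_cells(counts, is_mito, min_genes=3000, max_genes=9000,
                 min_umis=7000, max_umis=180000, max_mito_frac=0.10):
    """Return a boolean mask of cells (rows) passing the quality thresholds.

    counts  : cells x genes matrix of UMI counts
    is_mito : boolean vector marking mitochondrial genes (columns)
    Bounds on genes and UMIs are treated as inclusive (an assumption);
    the mitochondrial fraction must be strictly below max_mito_frac.
    """
    counts = np.asarray(counts)
    is_mito = np.asarray(is_mito, dtype=bool)
    n_genes = (counts > 0).sum(axis=1)      # detected genes per cell
    n_umis = counts.sum(axis=1)             # total UMIs per cell
    mito_umis = counts[:, is_mito].sum(axis=1)
    mito_frac = np.divide(mito_umis, n_umis,
                          out=np.zeros(len(n_umis), dtype=float),
                          where=n_umis > 0)
    return ((n_genes >= min_genes) & (n_genes <= max_genes) &
            (n_umis >= min_umis) & (n_umis <= max_umis) &
            (mito_frac < max_mito_frac))
```

The default arguments mirror the thresholds stated in the text; the mask can be used to subset the count matrix before computing expression levels.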
Acknowledgements
We thank I. Salazar-Ciudad and S. F. Gilbert for advice and the members of the Center of Excellence in Experimental and Computational Developmental Biology Research for discussions. We thank A. Viherä for technical assistance. We thank P. Auvinen, L. Paulin and P. Laamanen at DNA Sequencing and Genomics Laboratory for bulk RNA sequencing. We thank J. Lahtela at FIMM Single Cell Analytics for single cell RNA sequencing. Financial support was provided by the Academy of Finland, and Jane and Aatos Erkko Foundation.