Species-specific deployment of Runx2 isoforms and differential regulation of target genes during avian jaw development and evolution

Developmental regulation of bone formation in the jaw skeleton is essential to species-specific adaptation. The jaws are derived from neural crest mesenchyme (NCM), a progenitor population that directs skeletal patterning by exerting temporal and spatial control over molecular and cellular programs for osteogenesis. One important NCM-mediated gene is Runx2, which is a transcription factor required for osteoblast differentiation. RUNX2 protein binds many target genes involved in the deposition and resorption of bone. To determine the extent to which changes in Runx2 structure, function, and expression underlie the evolution of the jaw skeleton, we compare Runx2 across vertebrates and within birds. Runx2 contains two alternative promoters, tandem repeats of glutamine and alanine with variable lengths in different species, a conserved DNA-binding domain, an exon that is alternatively spliced, as well as two possible C-termini. Such alternative splicing produces eight potential isoforms that show distinct stage- and species-specific patterns in the jaw primordia of chick, quail and duck embryos. We also find that certain isoforms are strongly induced by TGFβ signaling whereas others are not. Overexpressing Runx2 isoforms in NCM reveals that some are transcriptionally activating, while others are repressive. But context appears to be relevant since species-specific polymorphisms in the promoter of target genes like Mmp13, can modulate the effects of different isoforms. Overall, our study indicates that the structure and species-specific deployment of Runx2 isoforms affect the transcriptional activity of target genes in ways that may have played a generative and regulatory role in the evolution of the avian jaw skeleton.


ABSTRACT
Developmental regulation of bone formation in the jaw skeleton is essential to speciesspecific adaptation. The jaws are derived from neural crest mesenchyme (NCM), a progenitor population that directs skeletal patterning by exerting temporal and spatial control over molecular and cellular programs for osteogenesis. One important NCMmediated gene is Runx2, which is a transcription factor required for osteoblast differentiation. RUNX2 protein binds many target genes involved in the deposition and resorption of bone. To determine the extent to which changes in Runx2 structure, function, and expression underlie the evolution of the jaw skeleton, we compare Runx2 across vertebrates and within birds. Runx2 contains two alternative promoters, tandem repeats of glutamine and alanine with variable lengths in different species, a conserved DNA-binding domain, an exon that is alternatively spliced, as well as two possible Ctermini. Such alternative splicing produces eight potential isoforms that show distinct stage-and species-specific patterns in the jaw primordia of chick, quail and duck embryos. We also find that certain isoforms are strongly induced by TGFβ signaling whereas others are not. Overexpressing Runx2 isoforms in NCM reveals that some are transcriptionally activating, while others are repressive. But context appears to be relevant since species-specific polymorphisms in the promoter of target genes like Mmp13, can modulate the effects of different isoforms. Overall, our study indicates that the structure and species-specific deployment of Runx2 isoforms affect the transcriptional activity of target genes in ways that may have played a generative and regulatory role in the evolution of the avian jaw skeleton.

INTRODUCTION
The craniofacial skeleton is one of the most highly adapted structures of vertebrates. Its anatomy often evolves in ways that seem perfectly suited to meet functional and ecological demands. To understand how the craniofacial skeleton becomes modified during evolution, we have been assaying for species-specific differences in molecular and cellular programs underlying skeletogenesis. Experimental work spanning almost a century has shown that neural crest mesenchyme (NCM), which is the precursor population that gives rise to the cartilages and bones of the jaws and face, is the primary source of species-specific pattern (Harrison, 1935;de Beer, 1947;Hörstadius, 1950;Harrison, 1969;Hall and Hörstadius, 1988;Noden and Schneider, 2006;Fish and Schneider, 2014b;Schneider, 2018b;Woronowicz and Schneider, 2019). For example, embryonic transplants of presumptive NCM from quail or chick into duck produce short quail-and chick-like beaks and jaw skeletons on duck hosts, and long duck-like bills on quail hosts Tucker and Lumsden, 2004;Schneider, 2005;Eames and Schneider, 2008;Solem et al., 2011;Woronowicz et al., 2018). To accomplish such a complex and profound task, NCM exerts spatial and temporal control over the expression of genes that mediate the induction, deposition, and resorption of bone (Merrill et al., 2008;Hall et al., 2014;Ealba et al., 2015;Schneider, 2015). One gene in particular, Runt related transcription factor 2 (Runx2), shows intriguing species-specific patterns with quail expressing higher endogenous levels than duck, coincident with their smaller jaw size. Runx2 is considered a master regulator of osteogenesis as evidenced by its effects on osteoblasts and the loss of bone in Runx2 mutant mice (Ducy et al., 1997;Komori et al., 1997;Otto et al., 1997;Karsenty et al., 1999;Ducy, 2000;Teplyuk et al., 2008;Komori, 2010b;Adhami et al., 2015;Vimalraj et al., 2015;Takarada et al., 2016;Shirai et al., 2019). Over-expressing Runx2 prematurely or at higher levels can reduce the size of the jaw skeleton (Eames et al., 2004;Hall et al., 2014). Exactly how higher levels of Runx2 expression relate to smaller jaw size is unclear, but presumably mechanisms involve the differential activation and/or repression of target genes.
To address this issue, we first examine the architecture of Runx2 at a high level across vertebrates and map the major transitions in Runx2 gene evolution. Our goal is to distinguish between domains that are variable or conserved. Then we perform a series of in vitro and in vivo functional analyses of Runx2 primarily using a three-taxon comparison in which two taxa are more closely related than either are to a third, since this is considered a robust strategy for making parsimonious inferences about evolution (Nelson and Platnick, 1991;Mavrodiev et al., 2019;Rineau et al., 2020). We exploit duck, chick, and quail in particular, because their jaw skeletons differ in size (from relatively large to small), their genomes are sequenced and annotated, and their embryos can be easily matched using the Hamburger and Hamilton (HH) staging system (Hamburger and Hamilton, 1951;Koecke, 1958;Padgett and Ivey, 1960;Zacchei, 1961;Hamilton, 1965;Yamashita and Sohal, 1987;Le Douarin et al., 1996;Ricklefs and Starck, 1998;Starck and Ricklefs, 1998;Nakane and Tsudzuki, 1999;Schneider and Helms, 2003;Stern, 2005;Lwigale and Schneider, 2008;Sauka-Spengler and Barembaum, 2008;Jheon and Schneider, 2009;Ainsworth et al., 2010;Mitgutsch et al., 2011;Fish and Schneider, 2014a;Gammill et al., 2019;Chu et al., 2020;Smith et al., 2020). As Galliformes, chick and quail are separated by approximately 50 million years, and they diverged from a common ancestor with Anseriformes, which include duck, around 100 million years ago (Pereira and Baker, 2006;Hackett et al., 2008;Kan et al., 2010).
Our comparative analyses reveal that RUNX2 is extremely conserved across vertebrates with the exception of the QA repeats and second C-terminus. Such conservation suggests that instead of major modifications within individual domains, spatiotemporal changes to the levels of Runx2 expression and/or alternative splicing, could serve as a primary mechanism for skeletal evolution by differentially regulating target genes. In support of this hypothesis, our spatial and temporal analyses of Runx2 gene and protein expression in the developing jaw primordia of chick, quail, and duck uncover species-specific differences at embryonic stages when the bony skeleton is undergoing deposition and resorption. Moreover, by sequencing Runx2 mRNA transcripts present in the jaw primordia during these stages, we detect the presence of eight Runx2 isoforms. A subsequent RNA isoform sequencing (Iso-Seq) experiment reveals species-specific differences in the proportions and spatial localization of these isoforms, which we then show can vary over time and become differentially expressed in response to recombinant (r) TGFβ1. To test the ability of these isoforms to regulate target genes, we perform overexpression experiments in vitro and in ovo and find that some isoforms are activating while others are repressive, and their effects vary among species. As a proof-of-principle, we focus on the regulation of the Mmp13 promoter by RUNX2. We test the functional consequences of two single nucleotide polymorphisms (SNPs) directly downstream of the RUNX2 binding element within the Mmp13 promoter that distinguish quail and chick from duck. Switching these SNPs between species alters Mmp13 promoter activity presumably by affecting interactions with RUNX2 isoforms. Overall, our studies demonstrate that alterative splicing of Runx2 is a key feature of avian jaw development and that the combinatorial, spatial, temporal, and/or species-specific expression of Runx2 isoforms by NCM differentially regulates downstream target genes involved in the deposition and resorption of bone, and likely affects jaw anatomy during evolution.

Genomic Analysis and Sequencing of Runx2
The National Center for Biotechnology Information (NCBI) GenBank database was queried using "runx2" as a search term resulting in 215 sequences from species with an annotated Runx2 gene. Each sequence was aligned to human, mouse, and chick Runx2 sequences as well as Runx2 motif sequences to determine if the full length Runx2 sequence was present. Incomplete sequences were excluded from further analysis.
FastTree (Price et al., 2010) was used in Geneious (Version 2020.2.4, Geneious Prime, Auckland, New Zealand) to generate an unrooted tree of all of the full length RUNX2 amino acid (AA) sequences via the BLOSUM62 cost matrix (Henikoff and Henikoff, 1992).
To identify sequences of the Runx2 promoters used by chick, quail, and duck, 5' rapid amplification of cDNA ends (RACE) reactions were carried out using template switching oligos (Matz, 2003;Picelli et al., 2014;Bagnoli et al., 2018). In short, reverse transcription was performed using the Maxima H Minus First Strand cDNA Synthesis Kit (K1651, Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer's directions with 2 μg of total RNA, 100 pmol of Runx2 primer, and 50 pmol of template switching oligo.
The cDNA synthesis reaction was carried out at 50ºC for 30 min, 55ºC for 20 min, cycled five times for 2 min at 55ºC and 2 min at 42ºC, cycled five times for 2 min at 60ºC and 2 min at 42ºC, cycled 10 times for 65ºC for 2 min and 2 min at 42ºC, and 85ºC for 5 min.
Amplification of cDNA ends was carried out using step-out PCR as previously described (Matz et al., 2003) with the Q5 Hot Start High-Fidelity DNA Polymerase (M0493L, NEB, Ipswich, MA).
To sequence Runx2 isoforms, full length cDNA synthesis from RNA was carried out using Maxima H-reverse transcriptase following the manufacturer's directions with 2 μg of total RNA and 100 pmol of d(T)20 VN primer. The cDNA synthesis reaction was conducted at 50°C for 30 min, 55°C for 10 min, 60°C for 10 min, 65°C for 10 min, and 85°C for 5 min.
To amplify each isoform, primers were designed targeting promoter 1 (P1) and promoter 2 (P2) sequences identified from the 5' RACE and C-terminus 1 (C1) and C-terminus 2 (C2). All primers are listed in Supplemental Table 1. Amplification was carried out using Q5 Hot Start High-Fidelity DNA Polymerase following the manufacturer's directions adding 1 μl of cDNA per 100 μl cycling 30 times at 98°C for 10 sec, 65°C for 20 sec, and 72°C for 2 min. Amplicons were purified using GeneJET PCR Purification Kit (K0702, Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer's directions and cloned using CloneJET PCR Cloning Kit (K1231, Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer's directions. At least 10 colonies were sequenced for each primer pair.

The Use of Avian Embryos
For all experiments, we adhered to accepted practices for the humane treatment of avian embryos as described in S3.4.4 of the AVMA Guidelines for the Euthanasia of Animals: 2013Edition (Leary et al., 2013. Fertilized eggs of chicken (Gallus gallus), Japanese quail (Coturnix coturnix japonica), and white Pekin duck (Anas platyrhynchos domestica) were purchased from AA Lab Eggs (Westminster, CA) and incubated at 37.5°C in a humidified chamber (GQF Hova-Bator 1588, Savannah, GA) until they reached embryonic stages appropriate for analyses. Embryos were matched at equivalent stages the using the Hamburger and Hamilton (HH) staging system, a well-established standard that utilizes an approach based on external morphological characters, that is independent of body size and incubation time, and that can be adapted to other avian species such as quail and duck (Hamburger and Hamilton, 1951;Hamilton, 1965;Ricklefs and Starck, 1998;Starck and Ricklefs, 1998;Schneider and Helms, 2003;Lwigale and Schneider, 2008;Jheon and Schneider, 2009;Ainsworth et al., 2010;Mitgutsch et al., 2011;Fish and Schneider, 2014a;Smith et al., 2015).

RNA Extractions and Quantitative PCR
Lower jaws were dissected from chick, quail, and duck embryos and total RNA was extracted using the RNeasy Plus Mini Kit (74136, Qiagen, Hilden, Germany) following the manufacturer's protocol. Lower jaws were resuspended in 600 μl of RTL plus buffer supplemented with 1% β-mercaptoethanol (M3148-100ML, MilliporeSigma, Burlington, MA, USA) and Reagent DX (19088, Qiagen, Hilden, Germany). HH31 and HH34 lower jaws were processed in a Bead Fisher Scientific,Waltham,MA,USA) at 5 m/s for 30 s with 1.4 mm ceramic beads Fisher Scientific, Waltham, MA, USA). HH37 and older lower jaws were homogenized at 5 m/s for 60 s with 2.8 mm ceramic beads Fisher Scientific,Waltham,MA,USA).
Primer sets are listed in Supplemental Table 1. Each sample was assayed in technical duplicate.

RNA Isoform Sequencing, Data Alignment, and Normalization
cDNA was synthesized using NEBNext Single Cell/Low Input cDNA Synthesis & Amplification Module (E6421, NEB, USA) following PacBio Iso-Seq template preparation guidelines. cDNA was barcoded using the PCR barcoding expansion 1-96 kit (EXP-PBC096, Oxford Nanopore Technologies, Oxford, UK). cDNA samples were pooled and a sequencing library was prepared using the ligation sequencing kit (Cat # SQK-LSK110, ONT, UK). 50 fmol of the final library was loaded onto each of the two PromethION flowcells (v R9.4.1) and the run was performed for 72 hours. Basecalling and demultiplexing was performed live on the PromethION compute module with basecaller version ont-guppy-for-minkow 4.0.11.
To determine the number of Runx2 isoforms, Nanopore sequenced reads were aligned to the runt domain of each respective species using minimap2 (Li, 2018) in Geneious Prime (Prime, 2019). Aligned reads were then aligned to Runx2 mRNA sequence using Geneious Mapper at the highest sensitivity with the maximum gap size increased to 1000.
Aligned reads that did not unambiguously include either P1 or P2 sequence, presence or absence of exon 5, and either C1 or C2 sequence were manually removed. The remaining aligned reads were then counted to determine the number reads for each Runx2 isoform and the proportion relative to other isoforms. Given the unique challenges of normalizing RNA-seq data among different species (Zhou et al., 2019), we assessed relative expression by normalizing the number of reads for each Runx2 isoform by total reads per sample.

Generation of Runx2 Isoform Constructs
To generate Runx2 overexpression constructs, full length cDNA was synthesized using Maxima H-reverse transcriptase (K1651, Thermo Fisher Scientific, Waltham, MA, USA) following the manufacturer's protocol with 2 μg of total HH37 chick, quail, or duck jaw RNA and 100 pmol of d(T)20 VN primer. The cDNA synthesis reaction was carried out at 50ºC for 30 min, 55ºC for 10 min, 60ºC for 10 min, 65ºC for 10 min, and 85ºC for 5 min.
All constructs were verified by sequencing and midi-prepped for electroporation using PureLink Fast Low-Endotoxin Midi Kit (A36227, Invitrogen, Carlsbad, CA, USA).

Electroporation
In ovo electroporations were performed as described previously . volt, with 50 ms spaces), followed by five square pulses (50-ms long, 10 volt, with 50 ms spaces), were administered (CUY21EDITII Next Generation Electroporator, BEX CO, Ltd, Tokyo, Japan) as done previously to allow unilateral entry of the DNA into the presumptive NCM destined for the mandibular arch (Creuzet et al., 2002;Krull, 2004;McLennan and Kulesa, 2007;Hall et al., 2014). The contralateral (un-electroporated) side served as an internal control. Embryos were allowed to develop until HH34 and then were treated in ovo with a single dose of 3.75 µg of doxycycline (dox) hyclate (446060250, Acros

Mmp13 Promoter Sequencing and Generation of Mmp13 Promoter Constructs
Previously, we identified RUNX2 binding sites in the Mmp13 promoter  using the JASPAR 2020 database, which contains transcription factor-binding profiles stored as position frequency matrices (Fornes et al., 2020). Briefly, to map transcription factor-binding sites onto the Mmp13 promoter sequences of chick, quail, and duck we used TFBSTools (Tan and Lenhard, 2016), which is an R package (Team, 2013). For the proximal region of the Mmp13 promoter (i.e., -184 bp for chick and quail, and -181 bp for duck) all vertebrate transcription factors were included in the analysis. Position frequency matrices were converted to position weighted matrices by setting pseudocounts to 0.8 (Nishida et al., 2009), and background frequencies of nucleotides to 0.25. The minimum threshold score was set to 95% for the -184/181 bp promoter region and 90% for the -2 kb promoter region.
In a previous study we generated luciferase constructs containing regions of the Mmp13 promoter . Briefly, Mmp13 promoter sequences for chick, quail, and duck were amplified by PCR using Q5 Hot Start High-Fidelity DNA Polymerase (M0493L, NEB, Ipswich, MA, USA). pGL3 was digested with HindIII-HF (R3104S, NEB, Ipswich, MA, USA) and XhoI (R0146S, NEB, Ipswich, MA, USA). The amplified Mmp13 promoter sequences and digested pGL3 were purified using GeneJET PCR Purification Kit and cloned using NEBuilder HiFi DNA Assembly Master Mix (E2621L, NEB, Ipswich, MA, USA). Mutations in Mmp13 promoter SNPs were generated through site-specific mutagenesis PCR with the mutations in the primers (Ho et al., 1989). All constructs were verified by sequencing and midi-prepped for transfection using PureLink Fast Low-Endotoxin Midi Kit (A36227, Invitrogen, Carlsbad, CA, USA).

Statistics and Image Processing
Data are represented as a mean ± standard error of the mean (SEM). Statistical significance was determined through two-tailed ANOVA adjusted for multiple comparisons using the Bonferoni (Prism, Version 9.1.0, GraphPad, San Diego, CA). A Student's t-test was used for comparisons between Runx2 isoform overexpression and contralateral controls. A Benjamini-Hochberg procedure was used for luciferase experiments comparing Runx2 isoform overexpression to empty vector controls. For in ovo data, "n" refers to the total number of embryos analyzed per group. For in vitro data, "n" refers to the total number of individual wells analyzed per group, with each experiment replicated at least three times. In all figures, p ≤ 0.05 is considered statistically significant, although some statistical comparisons reached significance below p ≤ 0.01, p ≤ 0.001, or p ≤ 0.0001 where noted. Formal power analyses were not conducted, and group size "n" is denoted in the figure legends. Images of tissue sections and embryos were adjusted in Adobe Photoshop 2020 (version 21.2.2) to normalize for exposure, brightness, contrast, saturation, and color balance across samples. Figures were assembled in Adobe Illustrator 2020 (Version 24.2.3).

The Overall Structure of Runx2 is Highly Conserved Across Vertebrates
We compiled a list of annotated Runx2 sequences on NCBI and find 226 species. We then compared these with NCBI sequences for human, mouse, chick, and zebrafish, as well as sequences for domains of Runx2 to determine if the NCBI sequences included the complete Runx2 gene. Around 55% (123 out of 226) of the species examined with an annotated Runx2 gene on NCBI appear to be incomplete. Nearly all of the incomplete sequences are missing N-terminal exons and some sequences only contain a single Cterminal exon. We expect that most if not all of the incomplete sequences in NCBI are related to sequencing, assembly, and bioinformatics issues rather than representing actual Runx2 gene evolution. Most Runx2 truncations appear to occur near the QA repeat, which may reflect the difficulty of sequencing and generating contigs of repetitive sequences. We also find that most of the incomplete sequences contain gaps or highly repetitive sequences directly upstream of the truncated Runx2 sequences. Additionally, every sequence analyzed that does not contain either P1 or P2 promoter sequences also lacks a QA repeat. Moreover, the duck and quail Runx2 sequences on NCBI appear to be incomplete and inaccurate. The duck Runx2 sequence on NCBI lacks both P1 and P2 promoters and the QA repeat. By running 5' RACE reactions on duck mRNA, we confirmed the presence of both P1 and P2 promoters as well as the QA repeat in the duck Runx2 gene. The quail Runx2 sequence on NCBI lacks the P1 promoter and also contains an alternative noncanonical promoter. By running 5' RACE reactions on quail cDNA we confirmed the presence of the P1 and P2 promoters but did not detect the presence of an alternative noncanonical promoter.
We compared full length RUNX2 for chondrichthyian (n = 1), actinopterygian (n = 25), sarcopterygian (n = 1), amphibian (n = 5), reptilian (n = 11), avian (n = 13), and mammalian (n = 46) taxa as listed in Supplemental Table 2. We find that most of the RUNX2 domains are conserved as shown in a similarity matrix of AA sequences ( Figure   1B). However, the QA domain encoded by exon 2 and C-terminus 2 (C2) encoded by exon 9 are so variable that we did not include them in the similarity matrix. We find that the structure and splice points for all exons of Runx2 are highly conserved among the taxa analyzed. Exon 1 encodes for 19 AA, exon 3 encodes for 52 AA, exon 4 encodes for 35 AA, exon 5 encodes for 58 AA (except for actinopterygians, which have 59 AA), exon 6 encodes for 54 AA (except for chondrichtyians, which have 55 AA), and exon 8 encodes for 159 to 169 AA. Exon 7, which is specific to mammals, encodes for 22 AA. In terms of pairwise identity of AA for individual domains across vertebrates, the most conserved is the RHD, which is encoded by part of exon 2 (with 98.3% identity), all of exon 3 (with 97.8% identity), and all of exon 4 (with 98.9% identity). Conservation of the RHD is even higher within amniotes and within each major clade. This is not surprising given that the RHD is an ancient 128 AA DNA binding domain that arose early in metazoan evolution and defines the RUNX protein family (Kagoshima et al., 1993;Rennert et al., 2003;Levanon and Groner, 2004;Coffman, 2009;Robertson et al., 2009;Chuang et al., 2013;Ito et al., 2015;Mevel et al., 2019). Interestingly in reptiles (including birds), we find that the RHD is modified by one AA, which goes from phenylalanine (F) to leucine (L). Overall, amniotes have higher conservation than vertebrates as a whole, and birds have the highest levels of conservation with seven domains in the similarity matrix showing 100% pairwise identity of AA, and two domains (alternatively spliced exon 5 and exon 8, which encodes C1) showing more than 99% identity. Consensus sequences for the major vertebrate lineages are shown in Supplemental Table 3.

RUNX2 Has Undergone Key Evolutionary Changes in Four Domains
Using parsimony, we mapped the shared derived changes in RUNX2 structure on the major vertebrate lineages. We find that the actinopterygian, amphibian, reptilian, and mammalian taxa each cluster into distinct monophyletic groups (Figure 2A). RUNX2 shows the most divergence in the actinopterygian lineage, which can be split into two branches (not shown) with one cluster comprised entirely of teleosts and the other containing four species that do not form a natural group (i.e., spotted gar, red-bellied piranha, channel catfish, and one of two Runx2 copies found in zebrafish). Perhaps the whole genome duplication that occurred in the teleost lineage (Glasauer and Neuhauss, 2014) has led to greater evolutionary divergence in Runx2 sequence. Only a single copy of Runx2 was identified for the teleosts we examined except for zebrafish, which is known to retain both copies (Flores et al., 2004;Eames et al., 2012). Therefore, one copy of Runx2 was either lost among the majority of teleosts, which has been the common fate for many genes following the teleost-specific genome duplication (Pasquier et al., 2017), or simply cannot be identified at present using the available bioinformatic data. Overall, our comparative analysis of RUNX2 reveals key evolutionary events that occurred primarily in four domains: a domain containing QA repeats encoded by exon 2, a domain containing a pentapeptide at the start of the canonical C-terminus encoded by exon 8, a domain consisting of a novel C-terminus encoded by exon 9 in amniotes, and a domain encoded by an alternatively spliced exon 7 in mammals ( Figure 2A).
For the first domain, we hypothesize that the common ancestor of actinopterygians (rayfinned fish) and sarcopterygians (lobe-finned fish) likely evolved a few poly A repeats encoded by exon 2 and this distinguishes them from chondrichthyans (cartilaginous fish).
The common ancestor of sarcopterygians and tetrapods (amphibians, reptiles (including birds), and mammals) likely had a QA domain with an additional Q near the poly A repeats. The common ancestor of amphibians and amniotes (reptiles, birds, and mammals) likely had an expansion in the length of the poly Q repeat. We find that in amphibia the poly Q repeat is interspersed with serine, proline, and/or histidine while amniotes primarily contain a homomeric poly Q repeat. In the squamate reptiles (snakes and lizards), the poly Q repeat is interspersed with histidine, and further in snakes, the poly A repeat is interspersed with valines. In the passerine birds (songbirds), a glutamic acid spacer between the poly Q and poly A repeats is changed to aspartic acid. The common ancestor of mammals likely had an expansion in the length of the poly A repeat.
While the functional significance of these evolutionary changes in AA composition is not well understood, the expansion of the QA domain corresponds with the emergence of novel traits in vertebrates (Newton and Pask, 2020) and further, the ratios of Q to A appear to affect RUNX2 transcriptional activity and correlate with facial length in some mammals (Fondon and Garner, 2004;Sears et al., 2007;Morrison et al., 2013;Mastushita et al., 2015;Ritzman et al., 2017;Ferraz et al., 2018) but not others (Pointer et al., 2012;Newton et al., 2017).
With regard to the second domain, which contains a pentapeptide at the start of the canonical C-terminus encoded by exon 8, the common ancestor of sarcopterygians and tetrapods likely underwent a change from GSSDL to GASDL. Then GASDL changed to GASEL in the common ancestor of amphibians and amniotes. In previous studies, the GASEL pentapeptide motif has been shown to be an activating domain of RUNX2 and is sufficient to drive increased transcription (Thirunavukkarasu et al., 1998). None of the fish we examined contain the exact GASEL sequence; however, there is a similar and predominant actinopterygian motif of GSSDL, which is found in 18 of 25 taxa analyzed.
Sequences from two medaka and one copy of the zebrafish RUNX2 contain a GSTDL motif while the other copy of the zebrafish RUNX2 contains GTSEL. The channel catfish and red-bellied piranha have a GTSDL motif. Surprisingly, the shark, spotted gar, and coelacanth all contain the same pentapeptide sequence of GASDL, which is only a single AA different than GASEL. With the exception of one copy of the zebrafish Runx2 and the GASDL, the other actinopterygian sequences are two AA different than GASEL. What remains unknown is whether GASDL is the ancestral motif or if this arose independently several times. One amphioxus (Stricker et al., 2003), one hagfish (Hecht et al., 2008), and three Japanese lamprey Runx genes have been identified; however, which lamprey Runx genes match with the gnathostome Runx genes or if they even match on a one-toone basis is unclear (Nah et al., 2014). The lamprey Runx genes contain ASSEL, GGSEL, and DLQGV at the motif site, all of which are at least two AA different than GASDL indicating that the GASDL sequence may not be ancestral. All of the amphibian, reptilian, avian, and mammalian sequences contain the exact same GASEL pentapeptide motif.
In relation to the third domain, the common ancestor of amniotes likely evolved a novel C-terminus. This additional alternatively spliced C-terminal domain was originally discovered in mouse, humans, and rat, and designated as exon 6.1 (Terry et al., 2004).
We label this domain as exon 9 and find homologous sequence only among amniotes ( Figure 2B). Nucleotide and protein BLAST queries using chicken and mouse exon 9 sequences against the NCBI database of actinopterygian and amphibian sequences did not result in any hits for Runx2. Compared to the rest of the RUNX2 sequence (with the possible exception of the QA repeat), C2, which is encoded by exon 9, is the least conserved (Supplemental Table 4). For example, chicken and mouse RUNX2 AA sequences are 96.4% identical when excluding the region of exon 2 that encodes the QA repeat, exon 7, and exon 9, but are only 36.3% identical when comparing the AA sequences encoded by exon 9. Moreover, chicken and duck are 99.6% identical when excluding the region of exon 2 that encodes the QA repeat and exon 9 but are only 79.7% identical when comparing the AA sequences encoded by exon 9.
For the fourth domain, the common ancestor of mammals likely evolved a novel alternatively spliced exon 7. All of the mammal Runx2 sequences analyzed (n = 46) appear to contain exon 7 but we were unable to find similar sequences after running nucleotide and protein BLAST queries using human or mouse exon 7 sequences against the NCBI database of actinopterygian, amphibian, reptilian, or avian sequences. The alternative splicing of exon 7 in mammals (along with exon 5) is known to cause changes in nuclear localization of RUNX2 protein and has different functional effects on target gene expression (Makita et al., 2008). We further confirmed that exons 5 and 7 are alternatively spliced in the developing jaw primordia of mammals by examining Runx2 transcripts in a publicly available RNA-seq dataset (Len Pennacchio, 2017) from mouse at embryonic day 13.5 (data not shown). The splicing out of exon 5 in mammals appears to be coincident with the splicing out of exon 7, and the absence of exon 7 has been shown to reduce the activating potential of RUNX2 (Terry et al., 2004;Makita et al., 2008).
We have found that exon 5 is also alternatively spliced in birds but whether this is the same for other non-amniote taxa remains unclear. Despite these evolutionary changes in RUNX2 structure, the extent of RUNX2 conservation among amniotes is striking and clearly evidenced by the fact that each exon encodes the exact same number of AA in chick, quail, duck, mouse, and human ( Figure 2B). We only find major differences in exon 2, which encodes for the QA repeats that vary in length depending on the taxon, and exon 9, which encodes for the alternatively spliced C-terminus.

Runx2 Expression is Coincident with Bone Deposition and is Species-Specific
Given, the high degree of conservation in RUNX2 structure across vertebrates, we investigated the extent to which developmental changes in spatial and temporal patterns of expression, and/or the deployment of isoforms may contribute to the evolution of the jaw skeleton. We focused on birds that show species-specific differences in jaw size and shape. Our prior work involving transplants between quail and duck embryos has demonstrated that the timing of induction and levels of Runx2 expression during osteogenesis in the jaw are mediated by NCM (Merrill et al., 2008;Hall et al., 2014).
To confirm that areas of bone deposition co-localize with RUNX2 expression in the lower jaw, we stained the osteoid matrix and assayed for RUNX2 protein using IHC on adjacent sections of HH37 quail ( Figure 3A and 3B) and duck ( Figure 3C and 3D). We find that osteoid coincides with RUNX2 protein distribution in proximal regions of the quail ( Figure   3E and 3F) and duck ( Figure 3G and 3H) lower jaws such as in the angular bone; and in the more distal dentary bone of the quail (Figure 3I and 3J) and duck ( Figure 3K and 3L). We did not observe any remarkable spatial or specific-specific differences in RUNX2 protein localization between quail and duck.
However, when we examine Runx2 mRNA levels among chick, quail, and duck from HH31 to HH40, we do observe species-specific differences in expression when using qPCR ( Figure 3M). We quantified expression in chick, quail, and duck jaws across several stages of development, and compared expression relative to HH31 for each species. In chick, we observe no significant change in total Runx2 levels until HH40, where expression decreases by 2.8-fold (p ≤ 0.05). In quail, total Runx2 expression increases 2.9-fold at HH37 (p ≤ 0.003) and 2-fold HH40 (p ≤ 0.05). At HH37, quail has 3.2-fold higher Runx2 expression compared to duck (p ≤ 0.0009). In duck, total Runx2 decreases by 2.2-fold at HH37 (p ≤ 0.05) but increases at HH40 to levels like that observed at HH31. These data are consistent with our prior observations that quail and duck express equivalent levels of Runx2 at early stages (i.e., HH24 to HH32) but that quail expresses more than twice the amount of Runx2 at later stages (i.e., HH36 to HH38) of jaw development (Hall et al., 2014).
We quantified RUNX2 protein by western blot (WB) and do not detect any significant species-specific differences but do observe a steady increase in RUNX2 levels from HH31 to HH40 for chick, quail, and duck ( Figure 3N). WB reveal that RUNX2 protein increases 3.2-fold in chick (p ≤ 0.04), 2.4-fold in quail (p ≤ 0.03), and 1.3-fold in duck (p ≤ 0.05) at HH37; and 4-fold in chick (p ≤ 0.04), 2.6-fold in quail (p ≤ 0.03), and 2.7-fold in duck (p ≤ 0.05) at HH40 with no significant differences observed among the species ( Figure 3O). Thus, although quail has increased Runx2 mRNA expression at HH37, we do not observe significant differences at the protein level either by IHC or WB. The differences between what we observe on the mRNA and protein levels suggest there may be post-transcriptional regulation of Runx2 and/or post-translational regulation of RUNX2 occurring within each species. Such regulation on the gene and protein levels has been well documented for Runx2 in a range of other contexts (Kim et al., 2003;Shen et al., 2006;Rajgopal et al., 2007;Hassan et al., 2010;Ito et al., 2015;Vimalraj et al., 2015;Narayanan et al., 2019;Kim et al., 2020). Alternatively, the discrepancies among the IHC, WB, and qPCR data may be due to limitations in sensitivity of IHC as non-quantitative and WB as semi-quantitative techniques for measuring expression.
To determine the extent to which these eight Runx2 isoforms are differentially expressed in chick, quail, and duck, we performed an RNA Iso-Seq experiment using HH37 lower jaws, which again is the stage when we observe the greatest amount of species-specific divergence in the levels of Runx2 mRNA ( Figure 3M). Also, to assess if these isoforms are spatially regulated in the developing jaw primordia, we divided the jaw samples into distal (containing the dentary bone) and proximal (containing the angular and other bones) portions ( Figure 3B and 3D). When examining the absolute proportion of Runx2 isoforms present ( Figure 4B), we find that for the proximal region of chick lower jaws, 2A1 is the highest at 54%, followed by 1A1 at 22%, 2B1 at 15%, 1B1 at 5%, and the remaining isoforms totaling approximately 4%. Similar percentages are found in distal chick lower jaws, with 2A1 being the highest at 51%, followed by 1A1 at 24%, 2B1 at 14%, 1B1 at 8%, and the remaining isoforms totaling approximately 3%. In contrast, for proximal quail lower jaws, 1A1 is the highest at 46%, followed by 2A1 at 28%, 1B1 at 15%, 1A2 at 3.9%, and the remaining isoforms totaling approximately 7%. Similar percentages are found in distal quail lower jaws, with 1A1 being the highest at 41%, followed by 2A1 at 25%, 1B1 at 20%, 1B2 at 3.8%, and the remaining isoforms totaling approximately 10%. We observe similar results for proximal duck lower jaws, with 1A1 being the highest at 46%, followed by 2A1 at 32%, and 1B1 at 12%, and the remaining isoforms totaling approximately 10%. Similar percentages are found in the distal duck lower jaws for 1A1 at 49% and 2A1 at 25%. However, duck 1B1 is at 9% compared to quail 1B1 at 20%, and duck 1B2 is at 7% compared to quail 1B2 at 2.8%. The remaining isoforms in duck total approximately 10%. Thus, in terms of absolute proportions, more than half of the Runx2 expressed in the chick lower jaw (both proximal and distal regions) is the 2A1 isoform, whereas the predominant isoform in quail and duck lower jaws (both proximal and distal regions) is the 1A1 isoform. Also, we find that in the distal lower jaw, 1B1 and 1B2 isoforms in quail are inversely proportional to 1B1 and 1B2 isoforms in duck.
To determine if the relative expression of Runx2 isoforms varies by spatial distribution, and/or by species we normalized the number of reads for each Runx2 isoform by total reads per sample. We find that in terms of overall relative expression, chick distal lower jaws show the highest levels of Runx2, followed by chick proximal, quail proximal, quail distal, duck distal, and duck proximal ( Figure 4C). We do observe some differences in spatial expression of Runx2 isoforms between proximal and distal lower jaws. For example, chick has 2.1 times more 1A1 and 1B1; and around 1.7 times more 2A1 and 2B1 in the distal lower jaw than in the proximal lower jaw. When comparing expression among species, we observe that chick 2A1 and 1A1 show the highest levels of expression. We observe higher expression of 2B1 in chick than quail but similar expression for the other isoforms. We observe higher expression of 1A1, 1B1, and 2A1 in quail than duck but similar expression for the other isoforms. Overall, we observe that at HH37, chick lower jaws have the highest Runx2 isoform expression with 2A1, 1A1, and 2B1 being the most prevalent, followed by quail where 1A1, 2A1, and 1B1 are most prevalent, and with duck having the lowest expression with 1A1, 2A1, and 1B1 being the most prevalent isoforms. Chick deploy more Runx2 isoforms utilizing P2 while quail and duck deploy isoforms utilizing P1. This species-specific and spatial distribution of Runx2 isoforms that we observe could potentially provide local developmental control over the regulation of target genes involved in the deposition and resorption of bone and lead to differential growth of the jaw skeleton. Similarly, other studies have proposed that distinct zones of bone remodeling can direct the size and shape of the developing jaw (Enlow et al., 1975;Moore, 1981;Radlanski and Klarkowski, 2001;Radlanski et al., 2004;Ealba et al., 2015).
Expression of isoforms defined by the AS-domain increases 2-fold in chick by HH37 (p ≤ 0.05) ( Figure 5D). AS-expression increases 2.5-fold in quail by HH37 (p ≤ 0.03) and is significantly higher than that observed in duck (p ≤ 0.04). AS-expression is significantly higher in duck, compared to chick (p ≤ 0.03) and quail (p ≤ 0.01) at HH31, but decreases 2.3-fold by HH37 (p ≤ 0.05) and 2.8-fold by HH40 (p ≤ 0.03). Expression of isoforms containing the C1 domain remains constant among all three species, although expression is significantly higher in duck than quail at HH31 (p ≤ 0.002) ( Figure 5E). Expression of isoforms containing the C2 domain remains constant among all three species, although expression is significantly higher in duck than chick (p ≤ 0.02) and quail (p ≤ 0.006) at HH31 ( Figure 5F). Overall, we observe changes in Runx2 domain expression during embryonic development of chick, quail, and duck. Specifically in quail at HH37, we observe an increase in P1, P2, AS+, and AS-expression, and observe a higher expression of AS+ and AS-in quail compared to duck at HH37. The change in Runx2 isoform expression during development could result in differential transcriptional activity and regulation of target genes, which could lead to species-specific regulation of phenotypic outcomes. To investigate mechanisms through which the differential expression of Runx2 isoforms could be regulated in the context of avian jaw development, we interrogated the potential role of TGFβ signaling.

Runx2 Isoforms are Differentially Expressed in Response to TGFβ Signaling
Runx2 is a known target of signaling by the TGFβ superfamily (Lee et al., 2000;Alliston et al., 2001;Lee et al., 2002;Ito and Miyazono, 2003;Kang et al., 2005;Derynck et al., 2008). Thus, we tested if TGFβ signaling can differentially regulate expression of individual Runx2 isoforms, and if this expression is species-specific. We harvested jaw primordia from chick, quail, and duck at HH34 and treated them in culture with rTGFβ1.
Twenty-four hours after treatment with rTGFβ1, we observe a 1.9-fold increase in isoforms utilizing the P1 domain in chick (p ≤ 0.001), a 1.5-fold increase for quail (p ≤ 0.02), and no change for duck relative to controls ( Figure 5G). Moreover, expression of P1 is elevated 2.2-fold in chick (p ≤ 0.0001) and 2.3-fold in quail (p ≤ 0.0001) compared to duck. P2 is not affected by rTGFβ1 for any species, although quail is 2.5-fold higher than duck (p ≤ 0.0001) overall. Isoforms containing the AS+ domain increase 1.4-fold in chick (p ≤ 0.04) and 1.6-fold in quail (p ≤ 0.05), whereas duck decreases 3.6-fold (p ≤ 0.05). Jaw primordia treated with rTGFβ1 showed 23-fold higher AS+ expression in chick (p ≤ 0.0001) and 25-fold higher in quail (p ≤ 0.0001) compared to duck. Isoforms defined by the AS-domain increase 1.4-fold in chick (p ≤ 0.05) and 1.8-fold in quail (p ≤ 0.001); we observed no change in duck. AS-isoforms are 24-fold higher in chick (p ≤ 0.0001) and 24-fold in quail (p ≤ 0.0001) relative to duck. Isoforms containing the C1 domain increase by 1.7-fold in chick (p ≤ 0.05) and 2.8-fold in (p ≤ 0.0002) with no observed change in duck. Following rTGFβ1 treatment, C1 isoforms increase by 2.3-fold in chick (p ≤ 0.005) and 2.8-fold in quail (p ≤ 0.0002) relative to duck. Treatment with rTGFβ1 does not change C2 expression in chick, quail, or duck. By taking the differential response of each Runx2 domain into account, we conclude that isoforms 1A1 and 1B1 are regulated by TGFβ signaling in chick and quail since we observe a differential response in the P1, AS+, AS-, and C1 domains. In duck however, we do not observe any TGFβ-mediated effects on 1A1, 2A1, 1A2, or 2A2, and based on the response of isoforms containing the AS+ domain, we find that TGFβ signaling likely has a repressive effect on the expression of 1A1, 2A1, 1A2, 2A2. Such findings are consistent with our prior work demonstrating that chick and quail have elevated TGFβ signaling relative to duck during these same stages of jaw development, that duck is inherently less sensitive to TGFβ signaling than chick or quail, and that TGFβ signaling can repress target genes in duck but activate them in chick and quail .
Additionally, these species-specific differences may also reflect the ability of TGFβ signaling to regulate splicing factors and alternative splicing events that promote the expression of isoforms involved in tissue remodeling and a broad range of normal and abnormal cellular behaviors (Zhao and Young, 1995

Runx2 Isoforms Can Activate or Repress the Expression of Target Genes
To determine the extent to which each isoform of Runx2 affects the transcription of target genes we performed in vitro and in ovo overexpression experiments and measured transcriptional activity using qPCR analyses of gene expression and/or luciferase assays.

For our in vitro experiments, we first tested if chick and duck fibroblast cell lines express
Runx2 isoforms by using primers to amplify all of the isoforms containing the P1, P2, AS+, AS-, C1, and C2 domains. When we measure the total amount of Runx2 mRNA we find that chick cells express 2-fold higher levels (p ≤ 0.0001) than duck cells (Figure 6A), which is consistent with what we observe in the jaw primordia of chick versus duck at HH37 ( Figure 3M). Similarly, chick cells show 10-fold higher expression (p ≤ 0.0001) of P1 domains, 30-fold higher expression (p ≤ 0.0001) of P2 domains, and 3-fold higher expression (p ≤ 0.0001) of C2 isoforms compared to that observed in duck cells. In both chick and duck cells, we observe more or less equivalent expression levels for isoforms containing the AS+ and C1 domains and find that the AS-isoforms are expressed at low levels. Thus, chick and duck cell lines exhibit species-specific differences in endogenous levels of total Runx2 and of Runx2 isoforms.

Overall, our in vitro isoform overexpression experiments demonstrate that in general 1A1
is the most transcriptionally activating Runx2 isoform while 1B1 and 2B2 are the most repressive based on the response of the OSE2 binding element and the differential expression of target genes. However, the effects on some genes do vary such as in the case of 2A1 activating OSE2 but repressing everything else except Mmp2; 1B2 activating Mmp2; or 1B1 activating Mmp13 but repressing OSE2, Tgfβr1, Ocn, and Col1a1. This gene-specific variation indicates that context is important and that other types of coincident regulation may be contributing to the effects of different isoforms.

RUNX2 Isoforms Differentially Regulate Target Genes During Jaw Development
To understand the regulatory effects of different functional domains of Runx2 within the context of jaw development, we overexpressed the 1A1 and 2B2 isoforms in quail and duck embryos. We focused on these two isoforms because they are the most distinct from one another with each defined by either an alternative promoter (P1 or P2), the presence or absence of the domain encoded by exon 5 (AS+ or AS-), or an alternative C-terminus (C1 or C2) ( Figure 4A). To restrict overexpression to late stages of jaw development, we utilized a tool we previously developed that combines a piggyBac transposon system, which integrates into the host genome and expresses GFP, and a dox-inducible promoter, which drives overexpression of RFP and a gene of interest . We unilaterally electroporated presumptive NCM at HH8.5 so that the non-electroporated contralateral side could serve as an internal control ( Figure 7A), treated embryos with dox at HH34 (Figure 7B), and then examined the extent of isoform overexpression at HH37 by screening embryos for RFP using epifluorescent or confocal microscopy ( Figure   7C). Unilateral overexpression of total Runx2 mRNA was confirmed by qPCR. We observe a 4.5-fold (p ≤ 0.01) increase of 1A1 in quail and a 4.5-fold (p ≤ 0.02) increase in duck, and a 4.4-fold (p ≤ 0.01) increase of 2B2 in quail and a 7.3-fold (p ≤ 0.003) increase in duck ( Figure 7D).
With regard to the regulation of RUNX2 target genes and relative to contralateral controls, 1A1 overexpression in quail increases Tgfβr1 by 2.4-fold (p ≤ 0.006) but does not affect Tgfβr1 in duck ( Figure 7E). 2B2 overexpression does not affect Tgfβr1 in quail but increases Tgfβr1 in duck by 2.1-fold (p ≤ 0.04). 1A1 overexpression increases Ocn by 3fold in both quail (p ≤ 0.001) and duck (p ≤ 0.002) whereas 2B2 overexpression does not affect Ocn in quail but decreases Ocn by 1.9-fold (p ≤ 0.05) in duck ( Figure 7F). 1A1 overexpression increases Col1a1 by 3.6-fold in quail (p ≤ 0.0001) and 3.3-fold in duck (p ≤ 0.03) whereas 2B2 overexpression decreases Col1a1 by 2.9-fold in quail (p ≤ 0.03) and 2.5-fold in duck (p ≤ 0.05) ( Figure 7G). 1A1 overexpression increases Mmp2 by 2.6-fold in quail (p ≤ 0.0001) but does not affect Mmp2 in duck, whereas 2B2 overexpression decreases Mmp2 by 3.1-fold in quail (p ≤ 0.04) and 2.9-fold in duck (p ≤ 0.04) ( Figure   7H). 1A1 increases Mmp13 by 6.6-fold in quail (p ≤ 0.0001) but does not change Mmp13 in duck whereas 2B2 overexpression does not alter Mmp13 in quail but decreases Mmp13 by 7.1-fold in duck (p ≤ 0.002) ( Figure 7I). In sum, we find that the 1A1 isoform transcriptionally activates all five target genes examined in quail (i.e., Tgfβr1, Ocn, Col1a1, Mmp2, and Mmp13) but only the two osteogenic genes in duck (i.e., Ocn and Col1a1). The 2B2 isoform represses two genes in quail (i.e., Col1a1 and Mmp2) whereas 2B2 represses four genes in duck (i.e., Ocn, Col1a1, Mmp2, and Mmp13) and activates one (i.e, Tgfβr1). Such species-specific differences in the ways 1A1 versus 2B2 feed into the regulatory landscape suggest that developmental context helps determine the response of target genes to RUNX2 isoforms, that the alternatively spliced domains of RUNX2 (i.e., P1, P2, AS+, AS-, C1, and C2) have distinct functional effects, and that interactions at the level of promoters and enhancers may play an essential role in Runx2mediated programs for bone deposition and resorption.

Runx2 Isoforms Differentially Regulate the Mmp13 Promoter
As a proof-of-principle to understand mechanisms through which RUNX2 isoforms interact with target genes, we focused on the Mmp13 promoter. The Mmp13 promoter in particular, is well characterized and known to be regulated by RUNX2 (Pendas et al., 1997;Wang et al., 2004;Chen et al., 2012a;Meyer et al., 2016;Takahashi et al., 2017;Young et al., 2019;Smith et al., 2020). Additionally, results from both our in vitro and in ovo experiments demonstrate that Mmp13 is among the most differentially regulated genes of the ones we examined (in terms of fold change) and shows significant and dramatic species-specific differences in its response to the same isoforms. For example, we find that 1A1 can activate Mmp13 in quail but not duck, or 2B2 can repress Mmp13 in duck but not quail ( Figure 7I). In a prior study, we cloned and characterized the first -2 kb of the Mmp13 promoters of chick, quail, and duck and identified all of the RUNX2 binding elements . While three RUNX2 binding elements are present in the Mmp13 promoters of all three species, they are found in different locations except for one common site in the most proximal region ( Figure 8A).

SNPs by a RUNX2 Binding Element Affect Isoform-Mediated Activity of Mmp13
To determine the extent to which the RUNX2 binding elements account for the differential and species-specific response of Mmp13 to RUNX2 isoform overexpression, we attached luciferase to a -184 bp fragment of the proximal Mmp13 promoter containing only one RUNX2 binding element in chick and quail, and a -181 bp fragment containing only one RUNX2 binding element in duck. In a prior study we identified two single nucleotide polymorphisms (SNPs) that distinguish chick and quail (i.e., "AG") from duck (i.e., "CA") located -160 bp upstream of the Mmp13 translational start site and immediately adjacent to a RUNX2 binding element within the -184/-181 bp fragments  ( Figure 9A). To test if these SNPs affect the differential and species-specific response of the Mmp13 promoter to various RUNX2 isoforms, we also generated constructs in which we incorporated the chick/quail "AG" SNPs into the duck promoter and the duck "CA" SNPs into the chick/quail promoter.
We find that in chick cells and relative to control -184 bp fragments alone, overexpressing 2A1 induces chick Mmp13 promoter activity by 1.5-fold (p ≤ 0.02) and quail Mmp13 promoter activity by 2.8-fold (p ≤ 0.0001) ( Figure 9B). Overexpressing 2A1 has no measurable effect on the duck Mmp13 promoter. In contrast, switching duck "CA" SNPs into the chick and quail promoter represses the activation observed with the control Mmp13 promoter by 1.5-fold (p ≤ 0.0001) but this repression is not affected by 2A1 overexpression. Switching chick and quail "AG" SNPs into the duck Mmp13 promoter increases activity by 1.5-fold (p ≤ 0.003) relative to the control duck -181 bp fragment alone, and overexpressing 2A1 induces activity by 1.9-fold (p ≤ 0.0001), which is similar to what is observed for the control chick and quail Mmp13 promoters, and which is 1.3fold (p ≤ 0.0001) higher than the control duck promoter with the chick and quail "AG" SNPs. We observe equivalent trends in duck cells although 2A1 overexpression activates the duck Mmp13 promoter by 2.7-fold (p ≤ 0.0001) and by 1.7-fold (p ≤ 0.004) when including the chick and quail "AG" SNPs ( Figure 9C). We find that in chick cells and relative to control -184 bp fragments alone, overexpressing 1B1 reduces both chick and quail Mmp13 promoter activity by 2.0-fold (p ≤ 0.0001) ( Figure 9D). Switching duck "CA" SNPs into the chick Mmp13 promoter reduces activity by 2.3-fold (p ≤ 0.0001) both without and with 1B1 overexpression. Similarly switching duck "CA" SNPs into the quail Mmp13 promoter reduces activity by 6.9-fold (p ≤ 0.0001) both without and with 1B1 overexpression. In contrast, switching chick and quail "AG" SNPs into the duck Mmp13 promoter increases activity by 1.9-fold (p ≤ 0.0001) relative to the control duck -181 bp fragment alone, but overexpressing 1B1 reduces expression by 1.9-fold (p ≤ 0.0001) relative to the control duck -181 bp fragment with the "AG" SNPs. We observe equivalent trends in duck cells ( Figure 9E). We also performed similar experiments with 1B2 and 2B2 and find that overexpressing both isoforms represses chick and quail Mmp13 promoter activity but not duck Mmp13 promoter activity. We do observe a change in activity in the duck promoter when 2B2 is overexpressed and the SNPs are switched (Supplemental Figure 1).
Thus, even though chick, quail, and duck Mmp13 promoters contain the same total number of RUNX2 binding elements, species-specific SNPs adjacent to the most proximal RUNX2 binding element appear to be important for interactions with RUNX2 isoforms. Switching chick and quail "AG" SNPs into the duck Mmp13 promoter upregulates activity, whereas switching the duck "CA" SNPs decreases chick or quail promoter activity, and these effects can be mitigated or enhanced depending on which Runx2 isoform is overexpressed. For instance, 2A1 normally activates the chick and quail Mmp13 promoter but not duck. Overexpressing 2A1 with a duck promoter that contains the chick and quail "AG" SNPs, enables 2A1 to become an activating isoform. Conversely, 1B1 normally represses the chick and quail Mmp13 promoter but activates the duck Mmp13 promoter. Overexpressing 1B1 with a duck promoter that contains the chick and quail "AG" SNPs, enables 1B1 to become a repressive isoform. Such results suggest that species-specific regulation of Mmp13 depends upon the way different RUNX2 isoforms interact with the RUNX2 binding element and adjacent SNPs.

Runx2 Contains Multiple Domains that Direct its Transcriptional Activity
Although the overall structure of Runx2 has remained highly conserved across vertebrates, and especially within amniotes, our study suggests that the stage-and species-specific deployment of Runx2 isoforms by NCM regulates the transcriptional activity of target genes in ways that likely contribute to the evolution of the jaw skeleton.
Presumably this is because each isoform is composed of distinct functional domains that allow RUNX2 protein to be transcriptionally activating and/or repressive depending on the nature of interactions with target genes. In addition, we find that species-specific polymorphisms in the Mmp13 promoter also influence the regulatory effects of various isoforms and we expect that this would be the same for many other target genes. We imagine that these regulatory effects can be modulated even further by altering the context such as where, when, and how RUNX2 protein becomes localized; the cell state or cell type in which RUNX2 protein is expressed; and the ways RUNX2 protein becomes modified post-translationally (Schneider, 2018a). Ultimately, much of this context can be defined by the inclusion or exclusion of relevant functional elements in each Runx2 isoform such as P1 or P2, a nuclear localization signal (NLS), nuclear matrix-targeting signal (NMTS), and various pentapeptide motifs ( Figure 1A).
For example, expression from the P1 promoter is associated with mesenchymal cells, pre-osteoblasts, chondrocyte precursors, and endochondral ossification while expression from P2 is associated with osteoblasts, hypertrophic chondrocytes, intramembranous ossification, and non-osteogenic expression (Park et al., 2001;Choi et al., 2002;Xiao et al., 2004;Li and Xiao, 2007;Jeong et al., 2008;Okura et al., 2014). In our study, we observe species-specific differences in expression of P1 ( Figure 5A) but not P2 ( Figure   5B) among chick, quail, and duck. Perhaps this is related to the fact that the lower jaw of birds ossifies exclusively through intramembranous ossification except for the proximal most region of Meckel's cartilage, which undergoes endochondral ossification but only at stages later than what we examined (Starck, 1989;Helms and Schneider, 2003;Eames et al., 2004;Mitgutsch et al., 2011;Ealba et al., 2015;Anthwal et al., 2017;Svandova et al., 2020). Other studies on activation and repression by isoforms utilizing P1 versus P2 indicate that both have similar transcriptional potentials (Banerjee et al., 2001). Our results do not support these conclusions since we find the 1A1 isoform activates Tgfβr1, Ocn, Mmp2, and Mmp13 mRNA expression more than 2A1 (Figure 6D to 6H). However, we find that 2A1 is a transcriptional activator of the OSE2 binding site and of the Mmp13 promoter ( Figure 6C and 6E, Figure 8B to 8E, Figure 9B to 9C). In contrast, another study has found that isoforms utilizing P2 can activate the Ocn promoter more than P1, but only when using a higher dose of plasmid for overexpression (Xiao et al., 1999).
Isoforms utilizing P1 have also been found to mediate the endothelial-mesenchymal transition (EMT) within the developing heart, which is regulated by TGFβ2 (Tavares et al., 2018), and P2 appears to be associated with the development and differentiation of the tooth germ, whereas P1 is not (Kobayashi et al., 2006). We find that P1 but not P2 is induced by TGFβ1. In a prior study we observed that TGFβ signaling is more active in the lower jaws of quail than duck during later stages of development , which supports our observations here that quail show higher 1A1 isoform expression than duck ( Figure 4C). Similarly, other experiments have found that CCAAT/Enhancer Binding-Protein β (C/EBPβ) and Specificity Protein 1 (SP1) upregulate P1 expression (Zhang et al., 2009;Henriquez et al., 2011); less is known about P2 regulation.
Additional elements critical for RUNX2 function include a classic NLS (Figure 1A), which was discovered based on its homology to the C-Myc NLS and which partially overlaps alternatively spliced exon 5 and the RHD at its C-terminus (Thirunavukkarasu et al., 1998). A second NLS is found at the N-terminus of the RHD, and mutations in either NLS result in disrupted RUNX2 localization and cleidocranial dysplasia (CCD) (Kim et al., 2006;Hansen et al., 2011). Our results also indicate that the NLS encoded by exon 5 may play an important role, since isoforms that include exon 5 appear to be more transcriptional activating (1A1, 2A1) than isoforms with exon 5 spliced out (1B1, 2B2) ( Figure 6 and Figure 9). Another essential element is the NMTS located in C1, which forms a loop-turn-loop structure (Zeng et al., 1997;Tang et al., 1999) that interacts directly with SMADS and directs them to the nuclear matrix (Zaidi et al., 2002). The NMTS domain is required for proper nuclear localization of RUNX2, which normally appears punctate but becomes diffuse following a loss of the NMTS (Zaidi et al., 2001;Harrington et al., 2002). Moreover, disruptions to the structure of the NMTS reduce RUNX2 activity and inhibit osteogenic differentiation Zaidi et al., 2006), and missense mutations in the NMTS cause CCD (Kim et al., 2006). Finally, C1 also contains two pentapeptide motifs, GASEL and VWRPY. The GASEL motif is found at the start of exon 8 and the VWRPY motif is found at the end followed directly by a stop codon. The GASEL domain is sufficient to induce transcriptional activation whereas the VWRPY motif interacts with transcriptional co-repressors (Aronson et al., 1997;Walrad et al., 2010;Kim et al., 2020). In contrast, C2 remains poorly characterized (Terry et al., 2004) and we find C2 to be highly variable across the species we examined (Supplemental Table 4).

Alternative Splicing Serves as a Key Evolutionary Developmental Mechanism
Alternative splicing is an evolutionarily conserved process yielding mRNA isoforms that can encode a diversity of protein variants with distinct activity, stability, localization, and function (Black, 2003;Matlin et al., 2005). Alternative splicing events are often differentially regulated across tissues and during development, as well as among individuals and populations, suggesting that individual isoforms may play distinct spatial or temporal roles. Estimates for human tissues indicate that up to 92-95% of multi-exon genes express multiple isoforms (Pan et al., 2008;Wang et al., 2008). Variation in isoforms appears to evolve more rapidly than actual changes in gene expression among vertebrates and therefore provides a powerful mechanism for expanding the regulatory and functional repertoire of genes (Barbosa-Morais et al., 2012;Merkin et al., 2012;Reyes et al., 2013). For example, comparisons between embryonic and adult tissues in the brain have revealed that 31% of genes are differentially regulated by alternative splicing even though their total expression levels remain unchanged (Dillman et al., 2013).
Increased levels and complexity of alternative splicing especially in higher vertebrates can underlie phenotypic change (Kim et al., 2007;Nilsen and Graveley, 2010;Chen et al., 2012c) and drive rapid adaptive radiation such as in the jaws of cichlid fish (Singh et al., 2017). Alternative splicing is critical for normal development of the craniofacial complex and dysregulation of splicing factors can disrupt patterning (Bélanger et al., 2018;Hooper et al., 2020;Lee et al., 2020;Wood et al., 2020). Multiple craniofacial disorders have spliceosomopathies as their primary etiology (Bernier et al., 2012;Lehalle et al., 2015;Merkuri and Fish, 2019). For example, deletion of splicing factors in NCM can cause cleft palate and a disrupted craniofacial skeleton (Cibi et al., 2019;Beauchamp et al., 2020). Such work emphasizes the increasing importance of understanding the role of alternative splicing during embryonic development in the context of both disease and evolution. As a case in point, our study demonstrates how a clinically relevant transcription factor like Runx2, which consists of domains that are highly conserved across vertebrates, can diversify its regulatory abilities and functional effects by deploying different combinations of activating and repressive isoforms in time and space throughout development, and by modulating this deployment among species during evolution. Office of the Director S10 OD021664 to R.A.S.

COMPETING INTERESTS
The authors declare no competing or financial interests.

DATA AVAILABILITY
All datasets and constructs will be made publicly available at the time of publication.

Figure 2. Evolution of RUNX2 domains across vertebrates. A) Shared derived
changes in RUNX2 structure can be mapped on the major vertebrate lineages. Based on parsimony, the common ancestor of actinopterygian (ray-finned fish) and sarcopterygians (lobe-finned fish) likely had a QA domain with some poly A repeats encoded by exon 2 as indicated by the "a" hash mark at the base of the ingroup from chondrichthyans ("cartilaginous fish"). The common ancestor of sarcopterygians and tetrapods (amphibians, reptiles (including birds), and mammals) likely had a QA domain with an additional Q near the poly A repeats encoded by exon 2 and a change in the GSSDL pentapeptide to GASDL encoded by exon 8 as indicated by the "b,c" hash mark. The common ancestor of amphibians and amniotes (reptiles and mammals) likely had an expansion in the length of the poly Q repeat and the pentapeptide GASDL changed to GASEL as indicated by the "d,e" hash mark. In amphibia the poly Q repeat is interspersed with serine, proline, and/or histidine (as indicated by the "f" hash mark) while amniotes primarily contain a homomeric poly Q repeat. The common ancestor of amniotes likely had a novel C-terminus as indicated by the "g" hash mark. The common ancestor of mammals likely had an expansion in the length of the poly A repeat and a novel alternatively spliced exon 7 as indicated by the "h,i" hash mark. In reptiles (including birds), the RHD is modified by one AA, which goes from phenylalanine to leucine (as indicated by the "j" hash mark). In the squamate reptiles (snakes and lizards), the poly Q repeat is interspersed with histidine (as indicated by the "k" hash mark). In snakes, the poly A repeat is interspersed with valines (as indicated by the "l" hash mark). In the passeriform birds (songbirds), a glutamic acid spacer between the poly Q and poly A repeats is changed to aspartic acid as indicated by the "m" hash mark. B) Schematic showing the number of amino acids (AA) encoded by each exon of Runx2 for chick, quail, duck, mouse, and human. Runx2 contains two alternative promoters (P1 and P2), repeats of glutamine (Q) and alanine (A), which vary in length depending on the taxon; nuclear localization signals (NLS); a runt homology domain (RHD); an alternatively spliced (AS) exon 5 in all taxa as well exon 7, which is specific to mammals; and two alternative Ctermini (C1 and C2). C1 (exon 8) encodes pentapeptide domains (GASEL and VWRPY) and a nuclear matrix targeting signal (NMTS). C2 (exon 9) is highly variable.  In quail and duck, the Runx2 1A1 isoform is most prominently expressed, followed by 2A1, 1B1, and 2B1. 1A2, 2A2, and 2B2 have low or no expression in quail and duck. C) The relative expression of each Runx2 isoform normalized to total reads shows that overall Runx2 expression highest in the chick distal portion, followed by chick proximal, quail proximal, quail distal, and duck proximal and distal. 2A1 is most prominently expressed in the chick distal region, followed by 1A1, 2B1, and 1B1. In quail and duck, 1A1 is most prominently expressed, followed by 2A1, 1B1, and 2B1. 1A2, 2A2, and 2B2 have low or no expression in quail and duck. The 1A1, 1B1, and 2A1 are more expressed in quail than duck. Other isoforms are similar among the species. Chick expresses more isoforms utilizing P2, while quail and chick express more isoforms with P1.
A) P1 increases 3.3-fold in chick (blue, n = 6) and 3.5-fold in quail (red square, n = 9) from HH31 to HH37; chick decreases at HH40. P1 does not change from HH31 to HH34 in duck (yellow triangle, n = 9) but is higher compared to quail at HH34. B) P2 decreases 5.5-fold in chick at HH40 and increases 2-fold in quail at HH37. Duck levels do not change over time and no differences are observed among species. C) AS+ remains unchanged but decreases in chick by HH40. AS+ increases 3-fold in quail from HH31 to HH37 and is elevated compared to chick and duck at HH37. AS+ decreases 2.2-fold in duck from HH34 to HH40. D) AS-increases 2-fold in chick and 2.5-fold in quail at HH37, and quail levels are higher than duck at HH37. AS-decreases 2.8-fold in chick at HH40; and decreases 2.3-fold in duck at HH37, but levels are higher in duck compared to quail and chick at HH31. E) C1 does not change over time in any species. F) C2 is unchanged in chick and quail over time. C2 decreases 2-fold in duck at HH37 and is higher in duck compared to chick and quail at HH31. * denotes significant difference (p ≤ 0.05) from HH31 within each group, # denotes significant difference (p ≤ 0.05) between quail and duck at same stage, † denotes significant difference (p ≤ 0.05) between chick and quail, ‡ denotes significant difference (p ≤ 0.05) between chick and duck. G) Chick (blue), quail (red), and duck (yellow) lower jaws treated in culture with 25 ng/ml rTGFβ1 for 24 hours. rTGFβ1 (gray) induces isoforms utilizing P1 1.9-fold in chick and 1.5-fold quail; AS+ 1.4-fold in chick and 1.6-fold in quail; AS-1.4-fold in chick and 1.8-fold in quail; and C1 1.7fold in chick and 2.8-fold in quail. rTGFβ1 induces C2 1.3-fold in chick but has no effect in quail or duck. No changes are observed for any domain in duck following treatment with rTGFβ1. Differences between quail and duck in response to rTGFβ1 treatment are observed for P1, P2, AS+, AS-, and C1, but not C2. *denotes significance at p ≤ 0.05 compared to vehicle control, # denotes significance at p ≤ 0.05 between quail and duck at the same stage, † denotes significance at p ≤ 0.05 between chick and quail, ‡ denotes significance at p ≤ 0.05 between chick and duck.

Figure 7. Transcriptional activity of Runx2 isoforms 1A1 versus 2B2 in ovo. A)
Presumptive NCM was electroporated unilaterally in quail and duck embryos as HH8.5 with species-matched 1A1 or 2B2 overexpression constructs. B) At HH34, embryos were treated with dox for 48 hours. C) At HH37, specimens were screened by epifluorescence for positive unilateral RFP signal (arrows). D) Total Runx2 following overexpression of 1A1 (gray) and 2B2 (black) in quail (red) and duck (yellow) lower jaws as measured by qPCR. Runx2 is 4.5-fold higher in quail and duck following 1A1 overexpression compared to the contralateral side; and 4.5-fold higher in quail and 7.3-fold higher in duck following 2B2 overexpression. E) Tgfβr1 is 2.4-fold higher in quail following 1A1 overexpression, and 2.1-fold higher in duck following 2B2 overexpression. F) Ocn is 3-fold higher in quail and duck following 1A1 overexpression, and 1.9-fold lower in duck following 2B2 overexpression. G) Col1a1 is 3.6-fold higher in quail and 3.3-fold in duck following 1A1 overexpression, and 2.9-fold lower in quail and 2.5-fold lower in duck following 2B2 overexpression. H) Mmp2 is 2.6-fold higher in quail following 1A1 overexpression, and 3.1-fold lower in quail and 2.9-fold lower in duck following 2B2 overexpression. I) Mmp13 is 6.6-fold higher in quail following 1A1 overexpression and 7.1-fold lower in duck following 2B2 overexpression. Compared to contralateral control side, * denotes significance at p ≤ 0.05, ** denotes significance at p ≤ 0.01, and *** denotes significance at p ≤ 0.001. n = 7 to 11 for each treatment group. promoter activity by 2-fold. Compared to control, * denotes significance at p ≤ 0.05, ** denotes significance at p ≤ 0.01, and *** denotes significance at p ≤ 0.001. n = 18 to 36 for each treatment group. Protein-1 (SP1), and RUNX2. However, chick and quail contain two SNPs ("AG") adjacent to the RUNX2 binding element that differ from the duck ("CA"). B) Chick cells transfected with chick (blue) or quail (red) Mmp13 promoters containing either a -184 bp control fragment including the chick/quail "AG" SNPs; a -184 bp fragment including the chick/quail "AG" SNPs plus a Runx2 2A1 over-expression (OE) construct; a -184 bp fragment including the duck "CA" SNPs; or a -184 bp fragment including the duck "CA" SNPs plus a 2A1 OE construct. Chick cells were also transfected with the duck (yellow) Mmp13 promoter containing either a -181 bp control fragment including the duck "CA" SNPs; a -181 bp fragment including the duck "CA" SNPs plus a 2A1 OE construct; a -181 bp fragment including the chick/quail "AG" SNPs; or a -181 bp fragment including the chick/quail "AG" SNPs plus a 2A1 OE construct. In chick cells, 2A1 OE increases activity of the chick Mmp13 promoter by 1.5-fold and quail promoter by 2.8-fold. Switching duck "CA" SNPs into chick and quail promoters decreases activity by 1.9-fold and by 1.3-fold with 2A1 OE. Switching chick/quail SNPs into the duck promoter leads to an increase in activity by 1.6-fold with 2A1 OE. C) In duck cells, 2A1 OE increases activity of the chick Mmp13 promoter by 1.5-fold and quail promoter by 1.7-fold. Switching the duck SNPs decreases activity in the chick promoter by 4.9-fold and quail by 9.7-fold even with 2A1 OE. 2A1 OE increases activity of the duck promoter by 2.7-fold and switching the "CA" SNPs to the duck promoter increases activity by 1.7-fold and 2.6-fold with 2A1 OE. D) In chick cells, 1B1 OE decreases activity of the chick and quail Mmp13 promoter by 2.0-fold.
Switching duck SNPs into the chick promoter decreases activity by 2.3-fold and by 6.9fold in the quail promoter. Switching chick/quail SNPs into the duck promoter leads to an increase in activity by 1.9-fold but when combined with 1B1 OE, duck promoter activity decreases. E) In duck cells, 1B1 OE decreases activity of the chick Mmp13 promoter by 5-fold and quail promoter by 1.6-fold. Switching duck SNPs into the chick promoter decreases activity by 10.3-fold and by 9.3-fold in the quail promoter and also with 1B1 OE. Switching chick/quail SNPs into the duck promoter increases activity by 1.6-fold but 1B1 OE decreases activity by 1.3-fold. Compared to the endogenous promoter within each species, * denotes significance at p ≤ 0.05, ** denotes significance at p ≤ 0.01, and *** denotes significance at p ≤ 0.001. # denotes significance at p ≤ 0.05 compared to control SNP switch. n = 18 to 36 for each treatment group.

isoforms 1B2 and 2B2. A) Chick cells transfected with chick (blue) or quail (red) Mmp13
promoters containing either a -184 bp control fragment including the chick/quail "AG" SNPs; a -184 bp fragment including the chick/quail "AG" SNPs plus a Runx2 1B2 overexpression (OE) construct; a -184 bp fragment including the duck "CA" SNPs; or a -184 bp fragment including the duck "CA" SNPs plus a 1B2 OE construct. Chick cells were also transfected with the duck (yellow) Mmp13 promoter containing either a -181 bp control fragment including the duck "CA" SNPs; a -181 bp fragment including the duck "CA" SNPs plus a 1B2 OE construct; a -181 bp fragment including the chick/quail "AG" SNPs; or a -181 bp fragment including the chick/quail "AG" SNPs plus a 1B2 OE construct. In chick cells, 1B2 OE decreases activity of the chick and quail Mmp13 promoters by 1.9-fold. Switching duck "CA" SNPs into the chick and quail promoters decreases activity by 1.7-fold and by 3.3-fold with 1B2 OE. Switching chick/quail SNPs into the duck promoter leads to an increase in activity by 1.5-fold both with and without 1B2 OE. B) In duck cells, 1B2 OE decreases activity of the chick Mmp13 promoter by 1.7-fold and quail promoter by 1.5-fold. Switching the duck SNPs into the chick and quail promoters decreases activity by 3.8-fold even with 1B2 OE. C) In chick cells, 2B2 OE decreases activity of the chick Mmp13 promoter by 2-fold and quail by 1.5-fold. Switching the duck SNPs into the chick promoter decreases activity by 2.5-fold and by 4.9-fold in the quail promoter. Switching chick/quail SNPs into the duck promoter leads to an increase in activity by 1.4-fold but when combined with 2B2 OE, duck promoter activity decreases. D) In duck cells, 2B2 OE decreases activity of the chick Mmp13 promoter by 2.0-fold. Switching duck SNPs into the chick promoter decreases activity by 2.5-fold and by 11.2-fold in the quail promoter and also with 2B2 OE. Switching chick/quail SNPs into the duck promoter increases activity by 1.5-fold. Compared to the endogenous promoter within each species, * denotes significance at p ≤ 0.05, ** denotes significance at p ≤ 0.01, and *** denotes significance at p ≤ 0.001. # denotes significance at p ≤ 0.05 compared to control SNP switch. n = 18 to 36 for each treatment group.