PT - JOURNAL ARTICLE
AU - Morris F. Maduro
TI - Evolutionary dynamics of the SKN-1 → MED → END-1,3 regulatory gene cascade in Caenorhabditis endoderm specification
AID - 10.1101/769760
DP - 2019 Jan 01
TA - bioRxiv
PG - 769760
4099 - http://biorxiv.org/content/early/2019/09/14/769760.short
4100 - http://biorxiv.org/content/early/2019/09/14/769760.full
AB - Gene regulatory networks (GRNs) with GATA factors are important in animal development, and the evolution of such networks is an important problem in the field. In the nematode Caenorhabditis elegans, the endoderm (gut) is generated from a single embryonic precursor, E. The gut is specified by an essential cascade of transcription factors in a GRN, with the maternal factor SKN-1 at the top, activating expression of the redundant med-1,2 divergent GATA factor genes, with the combination of all three contributing to activation of the paralogous end-3 and end-1 canonical GATA factor genes. In turn, these factors activate the GATA factor genes elt-2 and elt-7 to regulate intestinal fate. In this work, genome sequences from over two dozen species within the Caenorhabditis genus are used to identify putative orthologous genes encoding the MED and END-1,3 factors. The predictions are validated by comparison of gene structure, protein conservation, and putative cis-regulatory sites. The results show that all three factors occur together, but only within the Elegans supergroup of related species. While all three factors share similar DNA-binding domains, the MED factors are the most diverse as a group and exhibit unexpectedly high gene amplifications, while the END-1 orthologs are highly conserved and share additional extended regions of conservation not found in the other GATA factors. The MEME algorithm identified both known and previously unrecognized cis-regulatory motifs.
The results suggest that all three genes originated at the base of the Elegans supergroup and became fixed as an essential embryonic gene regulatory network with several conserved features, although each of the three factors is under different evolutionary constraints. Based on the results, a model for the origin and evolution of the network is proposed. The identified MED, END-3 and END-1 factors form a robust set defining an essential embryonic gene network that has been conserved for tens of millions of years and that will serve as a basis for future studies of GRN evolution.