Kinetic networks identify key regulatory nodes and transcription factor functions in early adipogenesis

Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, Virginia, United States of America Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, Massachusetts, United States of America Center for Cell Analysis and Modeling, University of Connecticut, Farmington, Connecticut, United States of America Department of Genetics and Genome Sciences, University of Connecticut, Farmington, Connecticut, United States of America


Introduction
Differential transcription factor (TF) activity defines cell identity and drives cellular responses to environmental stimuli by enforcing gene regulatory programs (Takahashi and Yamanaka 2006). Sequence-specific TFs bind to conserved motifs (Ptashne 1967) in regulatory elements (REs) within promoters and enhancers to regulate different mechanistic steps in transcription (Fuda et al. 2009). TFs recruit cofactors such as chromatin remodelers, acetyltransferases, methyltransferases, and general transcription machinery to REs. TFs are generally characterized as activators or repressors based upon their interaction partners, and recent studies more specifically describe TFs based upon their molecular function and which mechanistic steps they regulate (Danko et al. 2013;Duarte et al. 2016;Hah et al. 2011;Sathyan et al. 2019;Scholes et al. 2017). For example, pioneer transcription factors specialize in chromatin opening (Zaret and Carroll 2011). In addition to chromatin opening and RNA polymerase recruitment, many transcription steps are pre-cisely regulated, such as RNA polymerase pausing, elongation, and termination. RNA polymerase pauses ∼30-50 base pairs downstream of the transcription start site (TSS) (Rasmussen and Lis 1993;Rougvie and Lis 1988) and a vast majority of genes exhibit pausing (Core et al. 2008;Muse et al. 2007;Zeitlinger et al. 2007). Further modifications to the RNA polymerase complex triggers pause release and productive elongation (Marshall and Price 1995). Defining the steps regulated by TFs is necessary to understand how TFs coordinate with one another productively or antagonistically to regulate complex gene expression programs.
High-throughput sequencing has led to the development of hundreds of molecular genomics assays that query mechanistic events associated with transcription regulation. While each assay delivers a tremendous amount of information, each is limited in the biology that it measures. ChIPseq directly quantifies chromatin occupancy of proteins, but the assay is dependent upon availability of antibodies and limited to a single factor at a time. ATAC-seq and DNaseseq quantify chromatin accessibility, which is often used as a proxy measurement to infer RE activity (Boyle et al. 2008;Buenrostro et al. 2015). Typically, activator binding increases local chromatin accessibility and the presence of a cognate TF binding motif within a dynamic RE is used to infer TF binding without the need to perform individual ChIP-seq experiments for each factor (Siersbaek et al. 2011(Siersbaek et al. , 2014Wu et al. 1979). However, these assays do not directly inform on changes in transcription and RNA polymerase dynamics. Conversely, nascent transcription profiling with PRO-seq captures RNA polymerase density genome-wide at high spatial and temporal resolution (Kwak et al. 2013). PRO-seq is limited in its ability to identify potential upstream REs and regulatory factors. Only by combining multiple approaches can one fully capture the signaling dynamics driving transcription regulatory cascades.
Cell differentiation is a tightly regulated process involving many chromatin and transcriptional changes downstream of TF binding (Madsen et al. 2020;Rauch et al. 2019;Siersbaek et al. 2011;Thompson et al. 2016;Tsankov et al. 2015). Adipogenesis is a commonly used in vitro model of differentiation. Differentiation of immortalized 3T3-L1 mouse preadipocytes into mature adipocytes is a tractable and well-studied model of cell state transitions (Bernlohr et al. 1984;Green and Kehinde 1974). Mature adipocytes contribute to a multitude of metabolic processes by regulating energy balance, producing hormones, and providing structural and mechanical support (Rosen and Spiegelman 2006). Adipocyte hyperplasia downstream of increased adipogenesis is associated with pathogenesis of obesity, type 2 diabetes, and cardiovascular disease (Unamuno et al. 2018;Van Kruijsdijk et al. 2009). Adipogenic factors represent opportunities for intervention and possible mitigation of obesity-related sequelae (Ahmad et al. 2020;Ghaben and Scherer 2019). Prior studies have extensively characterized the TFs and gene expression changes required for adipogenesis (Lefterova and Lazar 2009;Rosen and MacDougald 2006;Siersbaek et al. 2012). However, previous work describing adipogenic signaling relied upon molecular readouts such as RNA-seq, which measures accumulation of mature mRNA molecules hours after transcriptional activation. Molecular events, such as TF binding, chromatin remodeling, and redistribution of RNA polymerase, occur on a time scale of seconds to minutes (Chen et al. 2014;Duarte et al. 2016;McNally et al. 2000). It is difficult to deconvolve mechanistic insights from RNA-seq data, which measures secondary and compensatory transcription as well as long-lived RNA species predating adipogenesis stimulation. Therefore, directly measuring nascent transcription with PRO-seq is better suited to measure rapid and potentially transient transcriptional events, such as those driving the first few hours of the adipogenesis cascade.
Transcriptional networks consist of multiple rapid waves of signaling through time with potential regulatory feedback and signal propagation through activation and repression of regulatory factors. Differentiating one wave from the next requires observations at multiple, closely spaced time points. In this study, we perform ATAC-seq and PRO-seq on 3T3-L1 cells at seven time points within the first four hours of adipogenesis. We incorporate accessibility and transcription changes into a multi-wave signaling network and identify TF families driving the regulatory cascade. We find that different factor families regulate distinct mechanistic steps in the transcription cycle and coordinate with one another to orchestrate the transcriptional response.

Results
TFs from at least 14 families modulate changes in chromatin accessibility in early adipogenesis. TFs bind promoters and enhancers to modify chromatin structure and influence transcription of nearby genes. To identify dynamic REs and potential regulatory TFs in early adipogenesis, we induced adipogenesis in 3T3-L1 mouse preadipocytes (see Methods), harvested samples at 8 time points, and performed genome-wide chromatin accessibility assays (ATAC-seq) ( Figure 1A). Chromatin accessibility is a molecular measurement used to infer TF binding and RE activity. We identified over 230,000 accessibility peaks and 13% change significantly over the 4 hour time course (Figure S1A). We clustered dynamic peaks based on kinetic profiles ( Figure S1C), which resulted in five general response classes ( Figure 1B). To identify candidate sequence-specific TFs that drive RE dynamics, we performed de novo motif analysis on dynamic peaks (Bailey et al. 2015). This approach yielded 14 potential TF family motifs including CEBP, TWIST, SP, KLF, GR, and AP1 ( Figure 1C & S1D). TF families comprise multiple proteins containing paralogous DNA binding domains that recognize very similar sequence motifs ( Figure 1C). We identified AP1, CEBP, and GR, which are known positive effectors of adipogenesis (Distel et al. 1987;Flodby et al. 1996;Freytag et al. 1994; Moitra et al. 1998;Ramji and Foka 2002;Rubin et al. 1978;Siersbaek et al. 2011;Steger et al. 2010;Tanaka et al. 1997;Wang et al. 1995;Yeh et al. 1995). Members of the KLF and SP families are known to be associated with both pro-adipogenic (Birsoy et al. 2008;Inuzuka et al. 1999;Li et al. 2005;Mori et al. 2005;Oishi et al. 2005;Pei et al. 2011) and anti-adipogenic functions (Banerjee et al. 2003;Kawamura et al. 2006;Sue et al. 2008;Tang et al. 1999). The TWIST family of TFs have previously unappreciated roles in adipogenesis, but have been shown to be important for differentiation of other mesenchymal cell types, such as osteoblasts (Bialek et al. 2004;Yousfi et al. 2001).
TF binding or dissociation from DNA leads to enrichment of cognate motifs in dynamic peaks. The biological functions of the TFs determine whether binding or dissociation results in increased or decreased accessibility. Binding of TFs that recruit activating cofactors, such as histone acetyltransferses or remodeling enzymes that eject nucleosomes, can increase accessibility; dissociation of these factors decreases chromatin accessibility. Likewise, binding and dissociation of factors that recruit deacetylases, repressive methyltransferases, or DNA methyltransferases can affect accessibility. We found that the majority of peaks containing CEBP, KLF, GR, or AP1 motifs increase accessibility, while peaks containing TWIST or SP motifs decrease accessibility ( Figure 1D). We performed the reciprocal analysis and plotted the density of motif instances relative to the summits of increased, decreased, and nondynamic peak classes to confirm the classification ( Figure 1E). AP1, GR, and CEBP motifs are strongly enriched around summits of increased peaks, while TWIST and SP motifs are enriched around summits of decreased peaks. SP and KLF families have paralogous DNA binding domains and recognize similar motif sequences; however, we confidently associate chromatin decondensation to KLF factors and chromatin condensation to SP factors ( Figure S1F). The SP family is composed of canonical activators, therefore SP TFs are likely dissociating from the chromatin to reduce accessibility (McKnight and Kingsbury 1982). Although we ascribe opening and closing functions to the KLF and SP families, it is impossible to determine the relative contribution of KLF and SP factors at any individual motif. We believe that the dual enrichment of KLF motifs at both increased and decreased peak summits is due to erroneous classification of SP-bound REs as KLF-bound REs. This complication is not limited to closely related motifs, as many dynamic peaks contain multiple factor binding motifs, making it difficult to isolate the contribution of individual factors. To address this complication, we plotted the changes in accessibility at peaks that contain only a single motif ( Figure 1F). This confirmed that the majority of isolated AP1, GR, CEBP, and KLF motif-containing peaks increase in accessibility, while TWIST and SP motif-containing peaks decrease. The biological interpretation of these results is that the adipogenic cocktail activates members of the AP1, GR, CEBP, and KLF TF families, leading to RE binding and chromatin decondensation. SP and TWIST motifs are associated with decreased accessibility. TFs can act as repressors by binding to chromatin and recruiting chromatin modifies such as deacetylases. Alternatively, dissociation of an activating TF can lead to gene repression. These results confirm the importance of several TF families and suggest that previously unappreciated TF families, such as TWIST, contribute to early adipogenesis.

SP, NRF, E2F6, KLF, and AP1 factors drive changes in bidirectional transcription at regulatory elements.
Coordinate TF binding ultimately results in the recruitment of RNA polymerases and initiation of transcription. In mammals, core promoters and enhancers often lack sequence information that consistently orient initiating RNA polymerases (Core et al. 2014). Therefore we sought to identify bidirectional transcription signatures as a complement to chromatin accessibility assays as a means to identify REs (Core et al. 2008;Danko et al. 2013;Seila Fig. 1. CEBP, TWIST, SP, KLF, GR, and AP1 TF families drive either increased or decreased chromatin accessibility in early adipogenesis. A) Preadipocyte fibroblast 3T3-L1 cells were treated with an adipogenesis cocktail and harvested at the indicated time points for ATAC-seq and PRO-seq experiments. B) Temporal classification of ATAC-seq peaks revealed five major dynamic classes. Each dynamic ATAC peak is a red or blue trace with the number of peaks in the class indicated in the lower right; the x-axis represents time and the y-axis indicates normalized accessibility. C) De novo motif analysis identified the top six DNA motifs enriched within dynamic peaks. The individual TFs listed in the wedge below the seqLogo recognize the respective DNA motifs. The heatmap quantifies the local protein sequence alignment of the DNA binding domains for the genes, as determined by the Smith-Waterman algorithm (Farrar 2007). D) Dynamic ATAC-seq peaks are classified by the presence of each DNA motif. The red bars represent the number of dynamic ATAC-seq peaks within the Immediate Increase, Transient Increase, and Gradual Increase categories; the blue bars correspond to the Transient Decrease and Gradual Decrease classes. E) Red, blue, and grey traces are composite motif densities relative to ATAC peak summits for the increased, decreased, and nondynamic peak classes. The y-axis quantifies the density of the indicated position-specific weight matrix and each motif instance is weighted by its conformity to a composite motif. F) Dynamic traces of peaks that exclusively contain the specified motif indicate that CEBP, GR, and AP1 associate with increasing accessibility; SP and TWIST associate with decreasing accessibility. Peak traces are colored as in panel (B). These conclusions are consistent with the reciprocal analysis from panel (E). et al. 2008). We captured the short lived divergent transcripts found at active REs with PRO-seq in parallel with the ATAC-seq adipogenesis time points ( Figure 1A). We used discriminative regulatory-element detection (dREG) to identify peaks of bidirectional transcription from our PROseq data (Wang et al. 2019). We identified over 180,000 dREG peaks (Figure 2A & B) and 18% change significantly over the time course ( Figure S2A). ATAC-seq and PROseq measure distinct but related biological phenomenon, therefore they identify different but overlapping sets of REs. Approximately 22% of dynamic dREG peaks overlap with dynamic ATAC-seq peaks, compared to 20% of dynamic ATAC-seq peaks in the inverse comparison. To further analyze the two classes of REs, we separated the dynamic dREG and ATAC-seq peaks into intragenic, intergenic, and promoter regions ( Figure 2C). Both methods effectively identify REs within promoters ( Figure S2B). We find PRO-seq more sensitively detects intragenic REs relative to other the other categories, while ATAC-seq efficiently detects intergenic REs. We sought to identify TFs that drive bidirectional transcription by further characterizing PROseq-defined REs.
We hypothesized that different sets of TF motifs are enriched within REs defined by ATAC-seq and PRO-seq. For instance, the cognate motifs of TFs that recruit initiation machinery may be preferentially enriched at dREG-defined REs. We performed de novo motif analysis on dynamic dREG peaks and found enrichment of AP1, SP, KLF, NRF, and E2F6 motifs ( Figure 2D). Of these, only the E2F6 motif was not also enriched in ATAC-seq peaks. We plotted motif density around the summits of either dynamic ATAC or dREG peaks to further differentiate ATAC and dREGdefined REs ( Figure 2E & Figure S2C). Of the motifs found de novo in dREG peaks, only E2F6, NRF, and SP were  The intragenic RE is only identified by its bidirectional PRO-seq signature while the upstream intergenic RE is only identified by ATAC-seq. C) Dynamic ATAC-seq and dREG-defined REs largely overlap in promoter regions. Intragenic regions are defined based on primary transcript annotation of PRO-seq data, promoters are between 150 bp upstream and 50 bp downstrean of TSSs, and intergenic regions are the remainder of the genome. D) Dynamic ATAC-seq peaks are enriched for a more diverse set of TF motifs than dynamic dREG peaks. E) Motif density distinguishes TFs associated with dynamic bidirectional transcription from those associated with dynamic accessibility. For example, TWIST and GR motifs are enriched within dynamic ATAC-seq peaks but are rarely found within dynamic dREG peaks. F) SP is only associated with bidirectional transcription at promoters and not distal REs. The top plot shows the average normalized PRO-seq signal for plus and minus strands around all 1,135,731 SP motif instances while the bottom plot displays all SP motifs excluding those in promoters (1,118,185). The distinct dual peak profile of bidirectional transcription collapses when only considering SP motifs outside promoters. G) Dynamic bidirectional transcription peaks found in promoters are stratified by the presence or absence of TF motifs. The left plot quantifies the total number of peaks and the right plot scales to the proportion of peaks in each category. The x-axis factor motif categories are defined by the presence or absence of ATAC-associated factors (AP1, CEBP, GR, KLF, and TWIST) and dREG-associated factors (SP, E2F6, and NRF). dREG-associated factor motifs are enriched in peaks that decrease bidirectional transcription, suggesting a link between SP, NRF, and E2F6 factors and an early and pervasive decrease in promoter initiation at their target genes. more enriched in dynamic dREG versus dynamic ATAC-seq peaks. We hypothesize that these three factor families regulate bidirectional transcription in early adipogenesis. The SP motif is found in over 25% of human and mouse promoters, making the SP motif the most enriched cis-regulatory element within promoters (Benner et al. 2013). To determine whether divergent transcription signatures found at SP motifs are dominated by SP factors within promoters, we plotted plus and minus strand nascent transcription at all SP motif instances ( Figure 2F top). Indeed, when SP motifs within promoters are removed from the composite input, divergent transcription peaks collapse ( Figure 2F bottom). We also observe this phenomenon with E2F6 and NRF motifs ( Figure S2D), implying that these factors and SP preferentially regulate divergent transcription at promoters. Next, we wanted to determine whether SP, NRF, and E2F6 motifs within promoters associate with increasing or decreasing divergent transcription. We find that bidirectional transcription tends to decrease in REs with dREG-enriched motifs as opposed to those without dREG-enriched motifs (Figure 2G). This further supports the previous conclusions that SP and NRF motifs associate with decreases in RE activity ( Figure 1E & Figure S1E). Although ATAC-seq and PROseq are both orthogonal methods to identify putative REs, we find that ATAC-seq is more sensitive at identifying distal regulatory TFs and PRO-seq primarily captures functional TFs within promoters.
Defining trans-edges. Genomic chromatin accessibility assays have revolutionized the detection of putative REs. We can determine the candidate functional TFs within the set of REs by searching for over-represented sequence motifs and determining the expression levels of TF family members. However, inferring TF binding from accessibility, motif, and expression data at any individual site remains a challenge (Guertin et al. 2012;Li et al. 2019). In addition to chromatin accessibility, expression of the TF, and presence of the TF's cognate motif, we leverage the change in accessibility over the time course to infer TF binding and dissociation events in adipogenesis. We term these changes in TF occupancy, which are directed linkages from TFs to REs, as trans-edges in our networks. We define the following rules for trans-edge inference: 1) The RE must first be defined as an ATAC-seq peak at any time point. 2) The binding motif of the upstream TF must be present in RE. 3) Chromatin accessibility must change significantly between two time points to infer binding or dissociation. 4) The direction of accessibility changes must match with the molecular function of the TF as defined in Figure 1. 5) Members of the TF family must be expressed at the appropriate time point. For example, the TF must be expressed at the later time point for binding and the earlier time point for dissociation. Mechanistically, TFs have short residency times on DNA and they are continually binding and dissociating from their sites in vivo (Chen et al. 2014;McNally et al. 2000). When we refer to binding and dissociation within the network, we are strictly referring to overall changes in occupancy at a genomic site within the popula-tion of cells.
The following examples highlight implementations of these rules. The Nr3c1 gene, which encodes GR, decreases expression immediately upon treatment ( Figure S3A). The rapid return to baseline accessibility at GR-bound REs is likely due to the decrease in GR expression as well as the degradation/dissociation of dexamethasone (Stavreva et al. 2015) ( Figure 1F). Therefore, we restrict edges attributed to GR to the first 40 minutes of the time course. In the case of SP, we find that Sp1, Sp3, and Sp4 are all repressed early in the time course ( Figure S3B). We hypothesize that the delayed accessibility decrease associated with SP motifs is due to transcriptional repression and natural turnover of the SP pool, which results in overall dissociation of SP on chromatin ( Figure 1F). We restrict trans-edges for SP to the later part of the time course. Conversely, we observe Twist2 gene activation early in the time course ( Figure S3C). Therefore, we predict that TWIST-associated repression is a result of increased TWIST binding and recruitment of negative cofactors. Twist2 expression levels have returned to baseline in mature adipocytes ( Figure S3D), suggesting that TWIST's effects are transient and can only be captured with an early, high-resolution time course. By focusing only on the REs that change accessibility and integrating with transcription data, we infer TF binding and dissociation events that drive adipogenesis.

Proximal changes in accessibility are tightly linked to transcription.
Chromatin accessibility positively correlates with local gene transcription. We confirmed this assertion by quantifying transcription of genes within 10 kilobases (kb) of dynamic ATAC-seq peak sets that exclusively increase or decrease accessibility ( Figure 3A). The majority of genes (63%) with one proximal increasing ATAC-seq peak are activated, likewise 68% of genes proximal to a single decreasing ATACseq peak are repressed. Genes near two or more increased accessibility peaks are much more likely to be associated with transcription activation, and vice versa ( Figure 3A). To further validate this association and explore the relationship between RE and target gene distance, we focused on all genes near one dynamic peak and stratified gene/peak pairs based on distance between the TSS and peak summit ( Figure S3E-G). The closer the peak and the gene, the more likely gene transcription and peak accessibility correlate in the same direction. This result indicates that proximal REs have a greater impact on gene expression than distal elements. Moreover, we plotted change in gene transcription against distance-scaled local accessibility changes and observed the expected positive correlation between transcription and accessibility at both early and late phases of the time course ( Figure S3H & I). These findings indicate that both accessibility dynamics and distance are important factors when considering the relationship between REs and genes.
We incorporated the distance between REs and genes as well as covariation in their accessibility and transcription to infer functional links, termed cis-edges, in our networks.
We define cis-edges as links between REs that directly regulate (i.e. cis) target gene transcription. For example, if GR binds a regulatory element within a gene's promoter and induces gene activation, we would draw a cis-edge between the RE and the gene. Within the network, we assign GR as an attribute to the edge. To confidently infer and annotate cis-edges, we must assess whether a class of TFs is associated with increasing or decreasing transcription. Since the distance between a RE and a gene influences the likelihood that accessibility and transcription will covary, we classified the function of a TF class within the context of adipogenesis by determining if peaks with a cognate TF motif are closer to activated or repressed gene classes. We expected that factor families associated with decreases in accessibility, like SP, would be closer to repressed genes on average. To test this hypothesis, we first categorized genes as significantly activated, repressed, or unchanged for each pairwise comparison within the time course. For example, Klf5 ( Figure 3B left panel) is one of a subset of 4225 genes immediately activated from 0 to 20 minutes ( Figure 3B right panel). Plotting the cumulative distribution function (CDF) for genes against the distance between the closest peak summit and the gene TSS shows that GR peaks tend to be closer to the 4225 activated genes compared to the repressed or unchanged genes ( Figure 3C). To estimate the maximum range that a factor can act, we plotted the difference between the repressed gene class CDF against the unchanged gene class CDF against distance between gene and peak ( Figure 3C inset). We find that the difference in the CDFs plateaus at around 114 kb, meaning that inferred GR binding events accumulate at the same rate for activated and unchanged genes at distances greater than 114 kb. Therefore, we conclude that GR can functionally activate gene expression from over 100 kb away. Immediately repressed genes like Stat1 are closer to SP peaks than control gene sets ( Figure 3D & E). This analysis indicates that SP acts very proximal to its target genes, with an actionable range of less than 2 kb (Figure 3E inset). This finding is consistent with our previous conclusions that decreased SP peaks are primarily found in promoters ( Figure 2F). While it is known that TFs can act proximally and distally to influence gene expression, these results provide distance constraints for specific factors.

Linking REs to target genes.
We incorporate these biological principles into logical rules to define cis-edge predictions within an adipogenesis network. First, a gene and regulatory element must be within 10 kb to infer a cis-edge. Second, RE accessibility and gene transcription must covary over the same time range. These two logical rules provisionally link REs and genes, then we employ additional rules that reflect the biology of individual TFs. For instance, the 10 kilobase distance metric is made more strict for factors like SP, for which the functional distance constraint as determined in Figure 3E is less than 10 kb. Temporal rules also influence edge predictions. For instance, GR-bound REs are only significantly closer to genes activated in comparison to the 0 minute time point such as the 20 vs. 0 comparison ( Figure 3C), meaning that genes activated later in the time course cannot be directly activated by GR binding in the network. Therefore, as with transedges, we only infer cis-edges between GR-bound REs and genes that change early in the time course. Incorporating these observations into our cis-edge rules we infer direct functional relationships between regulatory elements, bound TFs, and changes in target gene expression.

Constrained networks identify genes regulated combinatorially or by individual TF families.
Quantifying nascent transcription with PRO-seq maps the position and orientation of RNA polymerase with base-pair resolution. Nascent transcriptional profiling captures engaged RNA polymerase species throughout the genome, including intragenic features such as the proximal promoter and gene body. We can infer regulatory mechanisms of gene sets by quantifying relative changes in RNA polymerase density within the pause region and gene body. For instance, if the rate of RNA polymerase pause release increases between conditions, we expect that the signal in the pause region to decrease and the gene body signal to increase. Previous studies focus on biological systems where one TF dominates the response, such as ER, HSF, and NF-κB (Danko et al. 2013;Duarte et al. 2016;Hah et al. 2011). In these systems, the composite RNA polymerase signals at activated genes highlight differences in densities between pause and gene body compartments (Danko et al. 2013;Duarte et al. 2016;Hah et al. 2011;Sathyan et al. 2019). A complication in our system is that multiple TFs cooperate to drive transcription changes, making it difficult to identify the target steps (i.e. initiation, pause release) that TFs regulate. In order to address this complication, we identified genes that are predominantly regulated by a single TF in our network. We inferred TF binding events as trans edges and functional regulation of target genes as cis edges in our network. Genes and REs can be regulated or bound by either one or a combination of TFs. For example, we constructed a constrained network with RE and gene nodes downstream of individual TFs, including AP1. In this network, 1224 genes are solely activated by AP1 and 1847 genes activated by AP1 and at least one other factor ( Figure 4A). Most REs downstream of AP1, both individually and combinatorially bound, are not linked to any downstream genes (12608 v. 4829). This network highlights a paradigm in the transcription field that a minority of TF binding events lead to changes in gene expression (Spradling et al. 1975;Westwood et al. 1991). We constructed similar networks for GR ( Figure 4C Figure S4E). These networks illustrate the interconnectivity of gene regulation, while simultaneously identifying genes that are predominantly regulated by individual factors.
To extract mechanistic information from genes regulated by only one TF, we plotted composite RNA polymerase density from our PRO-seq data around pause peak summits at different time points for the isolated genes ( Figure 4B, Factors bind / dissociate from REs (purple circles) and regulate genes (blue squares). Colored arrows and numbers indicate the contribution of non-lead factors to RE activity. Combinatorially regulated REs are bound by the lead TF and either one or more of the other TFs. Composite PRO-seq signal is plotted relative to the promoter-proximal pause peak of (B) 1224 genes solely regulated by AP1, (D) 174 genes regulated by GR, and (F) 1127 genes regulated by SP. Inset violin plot illustrate the change in pause index for the gene set for the indicated time points. Each data point is a gene and all genes were input from the composite. D, F & S4B, D, F). The resulting traces show the characteristic pause peak centered around 0 followed by release of the RNA polymerase into the gene body. Examining RNA polymerase density traces of genes only activated by AP1 at 60 and 40 minutes show an increase in density in the pause region, suggesting increased RNA polymerase recruitment to AP1-activated genes ( Figure 4B). These time points were chosen because AP1 peaks are closest to genes activated between 60 and 40 minutes, suggesting that AP1 exerts the most transcriptional control during this time range. At first glance we see a similar result for GR when comparing traces from 0 and 20 minutes ( Figure 4D). However the situation becomes more complex when considering the ratio of pause density to gene body density, or pause index (PI). The PI for genes regulated solely by AP1 increases on average from 40 minutes to 60 minutes ( Figure 4B inset). Conversely, the PI for genes regulated solely by GR decreases on average between 0 and 20 minutes, suggesting GR primarily activates transcription by inducing pause release ( Figure 4D inset). The affected step is unique to the factor, as isolated AP1 genes don't exhibit the decrease in pause index from 0 to 20 minutes observed with isolated GR genes ( Figure S4G inset). Isolated GR genes show increases in pause index later in the time course, likely due to GR dissociation from the genome after the early phase of the time course and an associated decrease in pause release rate ( Figure S4H). The primary mechanism of SP and TWISTassociated gene repression is decreased RNA polymerase recruitment rate, as illustrated by the decrease in pause peak density and pause index ( Figure 4F & Figure S4F). While adipogenesis is a complex process involving many TFs, we are able to ascribe mechanistic functions to individual transcription factors with these networks.

Modeling changes in regulatory transcription steps.
We developed a mathematical model to further characterize how TFs target specific steps in the transcription cycle. Our model breaks up the gene unit into two compartments: a pause region and gene body region. PRO-seq directly measures RNA polymerase density within these regions for each gene. Within the model, rate constants corresponding to different transcription cycle steps, namely RNA polymerase recruitment / transcription initiation (k init ), premature termination (k pre ), pause release (k rel ), and elongation (k elong ) ( Figure 5A) determine RNA polymerase signal. We vary the rate constants for k init , k pre , and k rel over two orders of magnitude and vary k elong rate from 1500 to 6000 bases per minute (Ardehali and Jonkers and Lis 2015) to determine the effect on pause and gene body density and how the model compares to observed changes. We simplify the model by holding k elong constant between time points. Since k pre and k init are opposing rates in the model, we keep k pre constant between time points. We determine how changes in k init combined with k rel changes can account for the average density changes for the 174 isolated GRregulated genes from Figure 4D. We find that a 1.03-fold increase in recruitment / initiation and a 1.53-fold increase in pause release explain the changes in compartment occupancy between 0 and 20 minutes ( Figure 5B left). We generate a simulated composite based on these estimated changes in rate constants ( Figure 5C). Estimated pause residency time drops from 29 seconds to 19 seconds between 0 and 20 minutes as a result of the rate constant changes. Taking a similar approach, an 0.80-fold decrease in recruitment / initiation rate with no change in pause release rate produces observed changes in polymerase occupancy between 60 and 120 minutes for the 1127 isolated SP genes ( Figure 5B middle). This corresponds to a initiation / recruitment rate reduction from 15.1 to 12.1 polymerase molecules per minute ( Figure 5D). If SP factors normally stimulate initiation, then mass action would explain dissociation of SP factors upon transcriptional repression of SP genes. Previous studies link SP1 to transcriptional initiation through interaction with the TFIID general TF (Gill et al. 1994). The observed changes in RNA Polymerase composite profiles between 40 and 60 minutes at AP1 target genes is explained by a 1.3-fold increase in initiation rate and a 0.77-fold decrease in pause release rate (Figure 5B right). These relative changes in k init and k rel for AP1 targets do result in gene activation,  but it was unexpected that the profiles are explained by a decrease in k rel . Since composite profiles represent the average of all included genes, it is possible that the composite represents a diverse set of genes that are regulated by different AP1 family members. We speculate that we could gain a more clear insight if we were able to deconstruct the AP1 targets and identify gene targets of specific AP1 factors. The above analyses indicate that we can deconvolve complex transcriptional networks to identify gene targets of individual TFs and determine which steps in the transcription cycle each TF preferentially regulates.
TFs cooperate to bind REs and activate gene expression. AP1, CEBP, GR, and KLF bind REs either individually or in combination in order to activate expression. We classify REs based on the combination of factors that bind and drive accessibility changes. Likewise, we classify genes based on which TFs are immediately upstream in the network. Genes activated by the same combination of factors can be downstream of different classes of REs. We use genes activated by all 4 of AP1, CEBP, GR, and KLF to illustrate potential regulatory scenarios. These genes may be downstream of a single RE that binds all factors ( Figure 6A orange). Alternatively, the gene may be downstream of a pair of REs each binding 2 factors ( Figure 6A purple), 3 and 1 ( Figure  6A blue), or more complicated regulatory schemes (Figure 6A green). All 15 possible classes of REs contribute to activation of the 82 genes downstream of AP1, CEBP, GR, and KLF ( Figure 6B). The largest population of RE classes is isolated AP1 peaks with 8794, while peaks bound by all activating factors is the smallest with 74. The distribution of gene classes generally mirrors the distribution of RE classes, with isolated AP1 genes being the largest class. As discussed above, all factors activate more genes in combination than in isolation.  . C) Genes were sorted into categories based on proximity to AP1 motifs, proximity to GR motifs, and activation status. A significantly higher fraction of activated genes are GR-proximal among AP1-proximal genes compared to AP1-distal genes, suggesting the two factors coordinate to activate transcription. We found that the odds-ratio confidence intervals for the effect of AP1 on gene activation is conditionally dependent upon GR-bound status. D) The reciprocal analysis also indicated that AP1 and GR factors coordinate to activate transcription. E) We see an increase in proportion of genes proximal to TWIST peaks among repressed genes regardless of SP proximity, suggesting the two factors operate through different mechanisms. The odds-ratio confidence intervals for the effect of TWIST on gene repression is independent of SP status, thus repression and TWIST proximity are conditionally independent of SP status. F) Likewise, there is no significant difference in the odds of repression of SP-proximal genes with respect to TWIST proximity.
(1924 with AP1 v. 149 without). This finding, along with the high number of genes regulated by AP1, underscores the importance of the AP1 family in the network. While the bulk of negatively regulated genes are downstream of either SP or TWIST, approximately 20% are affected by both TWIST-mediated repression and SP-mediated attenuation ( Figure S5B).
We found that the relative change in transcription positively correlates with the number of immediate upstream activators in the network ( Figure S5D). Interestingly, there is not a significant relationship between magnitude of RE accessibility change and number of regulatory factors ( Figure  S5C). Normalizing transcriptional change by local accessibility change eliminates the observed correlation between transcription and number of regulatory factors ( Figure S5E). Therefore, transcription is positively correlated with total local accessibility change, regardless of the number of factors effecting that change. We conclude that if a gene is regulated in the network, the magnitude of expression change is independent from the number of upstream TFs.
Since the degree of activation and repression are unrelated to the number of upstream factors, we asked if having multiple TFs upstream in the network influences whether a gene is dynamic. To determine whether two TFs cooperate with one another we considered genes close to dynamic peaks with either a single TF motif or both TF motifs. We determine if the fraction of dynamic and nondynamic genes proximal to a TF is influenced by the presence of another TF. We define TF-proximal genes as genes that are close to dynamic ATAC peaks containing the TF motif. We find that there is no difference between the fraction of GR-proximal activated genes in the absence of AP1. However, there is an increase in the fraction of activated genes proximal to GR in the presence of AP1 ( Figure 6C). The reciprocal analysis shows that AP1 is a more effective activator in the presence of GR ( Figure 6D). These results support the model that AP1 and GR coordinate with one another to increase the likelihood of gene activation. The repressive factors TWIST and SP do not seem to work together in this way. The fraction of repressed genes proximal to TWIST increases regardless of the presence of SP ( Figure 6E). This suggests that TWIST functions largely independently of SP, supporting our hypothesis that the two TFs result in gene repression through unrelated mechanisms ( Figure S3B & C). Interestingly, we find a lower proportion of repressed genes proximal to SP motifs both in the presence and absence of TWIST ( Figure 6F). We speculate that these genes tolerate dissociation of SP and maintain their expression levels despite local decreases in chromatin accessibility. In support of this explanation, we find higher basal transcription and lower magnitude of repression in genes proximal to SP, suggesting these genes are more actively transcribed before loss of SP ( Figure S5F & G). These results highlight the complexity of gene regulatory control and how kinetic networks reveal coordinate and independent relationships between transcription factors.

Multi-wave networks incorporate molecular dynamics and kinetic information.
We further interrogate the early adipogenesis gene regulatory network by leveraging temporal information to infer multiple waves of accessibility and transcriptional changes throughout the time course. The importance of TFs can be inferred by the number of direct target genes (Figure 6) or the total number of connected downstream genes. The latter is captured by temporal multi-wave network depictions. We assembled a representative multi-wave deep network ( Figure 7A). The differentiation cocktail induces AP1 and GR binding to thousands of REs to activate thousands of genes; binding at 4 of these REs results in activation of the Twist2 gene ( Figure 7A & B). The resulting TWIST2 protein returns to the nucleus and binds hundreds of REs and represses its target genes. Among the hundreds of repressed TWIST2 target genes are the late (40+ minutes) acting factors Sp1 and Sp3 ( Figure 7C). The decreased occupancy of SP transcription factors from the genome leads to decreases in RE accessibility and attenuation of gene expression. Our multi-wave network reveals that SP1/3 dissociation and TWIST2 binding results in repression of Srf ( Figure 7D). We hypothesize that if we were to extend the time course, we would identify the SRF binding motif in REs decreasing in accessibility beyond 4 hours as result of attenuated transcription.
Many genes activated in the early phase of the time course are repressed later on, either through active repression or factor dissociation. We detect these negative feedback loops for each activating TF ( Figure S6A). About 63% of AP1, 74% of CEBP, and 80% of GR cis-edges are transient. Only 27% of KLF cis-edges are attenuated, suggesting that KLF-mediated activation is less transient and less dependent on the extracellular stimuli found in the adipogenic cocktail. Similarly, a minority of TWIST cisedges and no SP cis-edges are attenuated, indicating that SP and TWIST factors mediate sustained repression. A much smaller proportion of trans-edges are attenuated, implying that accessibility changes downstream of factor binding and dissociation are more stable ( Figure S6B) than changes in nascent transcription.
We find that regulatory potential for each TF varies greatly throughout the time course. AP1, CEBP, and GR activate the most genes during the initial phase of the time course, indicating that these TFs precipitate the initial wave of signaling during the first 20 minutes. Transcriptional activation of TWIST and KLF family genes leads to the next wave of signaling after 20 minutes ( Figure 7B and Figure  3B). We begin to detect changes in accessibility at KLF-and TWIST-bound REs as early as 20 minutes ( Figure S6D), however these presumptive binding events do not manifest as detectable changes in nascent transcription until 40 minutes ( Figure 7E). Although we had originally expected changes in accessibility and transcription to be observed concomitantly, these data show that we have the sensitivity to detect changes in RE accessibility before changes in transcription.

Sp1 Sp3
Srf 0-60' 0-40' 0-60' 0-120' 0-20' 0-20' 0-20' 0-20' 20-120' 0-120' 40-240' 20-120' 20-60' 40-240' 40-240' 40-120' 40-120' Figure 4. In addition, we added time interval attributes to respective edges to indicate when REs and genes are changing. B-D) UCSC genome browser shots indicate accessibility and peaks (top three tracks) as well as nascent transcription (next two tracks) dynamics between the indicated time points. E) Wedged bar plots quantify the regulatory kinetics across the time course for indicated factors. The x-axis intervals represent the time range in which the indicated number (y-axis) of genes are regulated (connected in the network) by the specified factor. Wedges between bars indicate carryover elements from previous time interval and the outer "wings" represent elements that are not included in the previous time interval. The top shaded blue wedges represent genes regulated by multiple factors; bottom wedges represent genes that are solely regulated by the indicated factor. F) A Twist2-centric network illustrates the high connectivity and influence of Twist2.
In addition to the TFs whose activity is stimulated by the adipogenesis cocktail, we identify transcriptionally regulated TF genes that are highly connected nodes within the network. The Twist2 gene is the most highly connected node and directly affects accessibility and transcription of thousands of downstream nodes by binding REs and repressing proximal genes ( Figure 7F). TWIST2 acts through intermediate factors, such as SP, AP1, GR, to repress thousands of additional genes. In the case of SP, TWIST2 mediated repression of Sp1 and Sp3, results in SP dissociation and activation attenuation of downstream genes. TWIST2-mediated repression of AP1 factors causes AP1 dissociation and attenuation of AP1-mediated activation. The cumulative result from both direct TWIST2 action and indirect dissociation / attenuation of TWIST2-targeted TF families affects accessibility at 12662 REs and 4574 genes. We believe that TWIST2 may have been overlooked as an important adi-pogenic TF because Twist2 is only transiently activated ( Figure S3C), but this kinetic network implicates TWIST2 as a critical intermediary in the adipogenesis cascade.

Discussion
Kinetic accessibility and nascent transcriptional profiling of developmental cascades can identify key regulatory nodes that may be transiently active, but are nonetheless necessary for proper cellular differentiation. We present an extremely rapid and precise capture of chromatin and transcription changes induced by an adipogenic cocktail. These changes represent the first few waves of differentiation signaling and precipitate the cellular transition process. RE accessibility and gene transcription change within minutes of initiating adipogenesis. By focusing only on dynamically accessible REs, we can infer TF binding and dissociation events that drive adipogenesis without performing hundreds of ge-nomic ChIP experiments. We find a multitude of enriched TF family motifs, many of which have been previously associated with adipogenic REs including AP1, GR, KLF, and CEBP (Siersbaek et al. 2014). We do not identify PPARγ, the master regulator of adipogenesis (Lefterova et al. 2014;Rosen et al. 2002), as a driver of early adipogenic signaling. This agrees with previous conclusions that PPARγ does not influence adipogenesis until several days into the process (Nielsen et al. 2008). Stable PPARγ activity is indispensable for adipogenesis and maintaining adipocyte identity, but other factors may be critically important and overlooked because their role is transient.
Our method implicates TWIST2 as a novel contributor to adipogenesis. The TWIST subfamily of bHLH TFs homoand heterodimerize with other bHLH proteins to affect gene expression. Although TWIST family factors all recognize the same DNA motif, different members can act as either activators or repressors. TWIST proteins can repress transcription by non-productive dimerization with TWIST family activators, competing with TWIST family activators for DNA motifs, or by recruiting chromatin condensers like HDACs to the genome (Bialek et al. 2004;Gong and Li 2002;Hamamori et al. 1999;Hayashi et al. 2007;Koh et al. 2009;Lee et al. 2003;Šošić et al. 2003). While multiple mechanisms may be at play in our system, we hypothesize that our observed repressive effects are downstream of increased TWIST2 binding. TWIST1 and TWIST2 negatively regulate multiple developmental pathways including myogenesis, osteogenesis, and myeloid differentiation (Bialek et al. 2004;Gong and Li 2002;Hebrok et al. 1994;Murray et al. 1992;Sharabi et al. 2008;Spicer et al. 1996). The role of the TWIST TF family in adipogenesis is less clear. While TWIST1 and TWIST2 are known regulators of mature adipose tissue homeostasis, TWIST1 does not affect adipogenesis (Dobrian 2012;Lee et al. 2003;Pan et al. 2009). Interestingly, homozygous Twist2 mutations cause Setleis syndrome, a disease characterized by facial lesions lacking subcutaneous fat (Tukel et al. 2010). Twist2 knockout mice develop such lesions and lack lipid droplets within the liver and brown fat tissue (Šošić et al. 2003;Tukel et al. 2010). Even with a well-studied system such as adipogenesis, these methods were able to identify Twist2 as a novel regulator of the differentiation cascade. Previous phenotyping of Twist2 -/mice and Setleis syndrome patients confirms its importance.
Our networks define genes sets that are predominantly regulated by a single TF. We can track changes in RNA polymerse density within the gene sets to identify the target regulatory steps of individual TFs. Stimulated pause release is an established cause of early gene activation in adipogenesis (Wang et al. 2021). We find that GR is largely responsible for the observed increase in pause release. GR is a well-established activator of gene expression (Vockley et al. 2016), often in combination with AP1 (Biddie et al. 2011). Other activating factors, including AP1, increase RNA polymerase recruitment to the gene. By acting on separate steps, GR and AP1 provide non-redundant stimuli to target genes. We find GR and AP1 are conditionally dependent upon one another in their potential to activate local genes. A recent study suggests that both AP1 and CEBP act as pioneer factors that prime the genome for GR-induced transcription activation (Wissink et al. 2021). We find all three of these factor families activate the initial wave of transcription changes, both in combination and in isolation.
We confidently differentiate primary, secondary, and tertiary transcriptional changes by examining multiple, closely spaced time points upon induced adipogenesis. ATAC-seq, ChIP-seq, or chromatin conformation assays alone can only suggest functional relationships between REs and genes (Lieberman-Aiden et al. 2009;Ren et al. 2000). Similarly PRO-seq and RNA-seq return transcription changes with little information regarding upstream regulation. We define individual regulatory relationships more directly by focusing only on the ATAC peaks and genes that are proximal and covary in their dynamics (i.e. significantly change over in the same direction the time course). Our bipartite directed graph networks are unique in the gene regulation field because each edge represents a functional interaction as opposed to an abstract relationship between linked nodes. Trans edges represent binding of TF proteins to cognate DNA elements and cis edges describe regulatory interactions between REs and target genes. These networks can define genes sets that are predominately regulated by a single TF and identify the target regulatory steps of the TF. Highly connected nodes in the network are candidate key regulatory hubs in the differentiation cascade. Moreover, these networks ascribe time attributes to each edge, so subgraphs that respect the flow of time are easily extracted from the larger graph. This integrative genomics approach to network construction can be applied to a multitude of cellular responses and transitions to uncover novel biology and new hypotheses.

ATAC-seq library preparation.
We prepared ATAC-seq libraries as previously described (Corces et al. 2017). We trypsinized and collected cells in serum-free growth media. We counted ∼5 x 10 4 cells per replicate and transferred them to 1.5 mL tubes. We centrifuged cells at 500 x g for 5 minutes at 4°C and resuspended the pellet in 50 µL cold lysis buffer (10mM Tris-HCl, 10mM NaCl, 3mM MgCl 2 , 0.1% NP-40, 0.1% Tween-20, 0.01% Digitonin, adjusted to pH 7.4) and incubated on ice for 3 minutes. We washed the samples with 1 mL cold wash buffer (10mM Tris-HCl, 10mM NaCl, 3mM MgCl 2 , 0.1% Tween-20). We centrifuged at 500 x g for 10 minutes at 4°C and resuspended cells in the transposition reaction mix (25 µL 2X TD buffer (Illumina), 2.5 µL TDE1 Tn5 transposase (Illumina), 16.5 µL PBS, 0.5 µL 1% Digitonin, 0.5 µL 10% Tween-20, 5 µL nuclease-free water) and incubated at 37°C for 30 minutes. We extracted DNA with the MinElute kit (Qiagen). We attached sequencing adapters to the transposed DNA fragments using the Nextera XT Index Kit (Illumina) and amplified libraries with 6 cycles of PCR. We performed PEG-mediated size fractionation (Lis 1980) on our libraries by mixing SPRIselect beads (Beckman) with our sample at a 0.55:1 ratio, then placing the reaction vessels on a magnetic stand. We transferred the right side selected sample to a new reaction vessel and added more beads for a final ratio of 1.8:1. We eluted the final size-selected sample into nuclease-free water.

ATAC-seq analyses.
We aligned reads to the mm10 mouse genome assembly with bowtie2, sorted output BAM files with samtools, and converted files to bigWig format with seqOutBias (Langmead and Salzberg 2012;Li et al. 2009;. We called accessibility peaks with MACS2 (Zhang et al. 2008). We sort reads into peaks using the bigWig R package and identify differentially accessible REs with DESeq2 (Love et al. 2014;Martins 2014). We cluster dynamic peaks into response groups using DEGreport (Pantano 2019). We performed de novo motif extraction on dynamic REs with MEME and used TOMTOM to match motifs to the HOMER, Jaspar, and Uniprobe TF binding motif databases (Bailey et al. 2015;Heinz et al. 2010;Hume et al. 2015;Khan et al. 2018). We use the bigWig package to assess motif enrichment around ATAC-seq peak summits (Martins 2014).

PRO-seq library preparation.
We performed cell permeabilization as previously described . We trypsinized and collected cells in 10 mL ice cold PBS followed by washing in 5 mL buffer W (10 mM Tris-HCl pH 7.5, 10 mM KCl, 150 mM sucrose, 5 mM MgCl 2 , 0.5 mM CaCl 2 , 0.5 mM DTT, 0.004 U/mL SUPERaseIN RNase inhibitor (Invitrogen), protease inhibitors (cOmplete, Roche)). We permeabilized cells by incubating with buffer P (10 mM Tris-HCl pH 7.5, KCl 10 mM, 250 mM sucrose , 5 mM MgCl 2 , 1 mM EGTA, 0.05% Tween-20, 0.1% NP40, 0.5 mM DTT, 0.004 units/mL SU-PERaseIN RNase inhibitor (Invitrogen), protease inhibitors (cOmplete, Roche)) for 3 minutes. We washed cells with 10 mL buffer W before transferring into 1.5 mL tubes using wide bore pipette tips. Finally, we resuspended cells in 500 µL buffer F (50 mM Tris-HCl pH 8, 5 mM MgCl 2 , 0.1 mM EDTA, 50% Glycerol and 0.5 mM DTT). After counting nuclei, we separated cells into 50 µL aliquots with ∼3-5 x 10 5 cells each. We snap froze our aliquots in liquid nitrogen and stored them at -80°C. We prepared PRO-seq libraries as previously described (Sathyan et al. 2019). We included a random eight base unique molecular identifier (UMI) at the 5 end of the adapter ligated to the 3 end of the nascent RNA. We did not perform any size selection in an attempt to not bias our libraries against short nascent RNAs.

PRO-seq analyses.
First we used cutadapt to remove adapters from our reads (Martin 2011). We used fqdedup and the 3 UMIs to deduplicate our libraries ). Next we removed UMIs and converted reads to their reverse complement with the FASTX-Toolkit (Gordon 2010). As with the ATAC-seq samples, we used bowtie2, samtools, and seqOutBias to align, sort, and convert reads to bigWig files respectively (Langmead and Salzberg 2012;Li et al. 2009;. We used primaryTranscriptAnnotation to adjust gene annotations based on our PRO-seq data (Anderson et al. 2020). We queried the bigWig files within the adjusted genomic coordinates with the bigWig R package and UCSC Genome Browser Utilities (Kent et al. 2010;Martins 2014). We identified differentially expressed genes with DESeq2 (Love et al. 2014). We used dREG to define peaks of bidirectional transcription from our bigWig files (Wang et al. 2019). As with the ATAC-seq samples, we identified overrepresented motifs in dREG-defined REs with MEME and TOMTOM (Bailey et al. 2015). We evaluate motif enrichment around peak summits and polymerase density in the gene body and pause region with the bigWig package (Martins 2014). We define the summit of the pause peak for genes by first identifying the point of maximum density within 1 kb of the TSS. We define the pause region as the 50 bp window around the summit.

DNA binding domain alignments.
DNA binding domains were extracted from the TFClass database (Wingender et al. 2018) and TF paralogs that were absent from the database were extracted from the NCBI protein database. DNA binding domains were aligned using FASTA (Pearson and Lipman 1988) and the following command: ssearch36 -s MD40 -m 8CBl. Although there are six DNA motifs, the TWIST and ZNF families of DNA binding domains recognize the same motif, despite their lack of evolutionary conservation.

Network construction.
The bipartite directional networks with gene and RE nodes were inferred using a data-driven rules-based approach. The first rule to infer trans-edges from TF families to individual REs is that the RE must contain the cognate motif for the TF family. The second is that the peak must be dynamically accessible over some part of the time course. The third is that at least one gene encoding a member of the TF family must be expressed and activated (or in the case of SP, repressed) over the same time range. We restrict trans-edges attributed to GR to the first 40 minutes of the time course for reasons discussed in the text. Similarly, we do not draw edges from SP before 40 minutes. Next, we drew cis-edges between REs and proximal genes based on a different rule set. First, REs need to be within 10 kb of gene bodies as defined by primary transcript annotation of our PRO-seq data. We used bedtools to find gene-RE pairs that satisfied this rule (Quinlan and Hall 2010). Next, the peak and the gene need to covary in accessibility and transcription during the same time range. For example, a gene must be activated at the same time as its local RE is increasing in accessibility. We refined the distance requirements by incorporating constraints from our CDF analysis. For each activating factor (AP1, CEBP, GR, KLF) we find a set of pairwise comparisons within the time course for which factor REs are significantly closer to activated than nondynamic genes. We find a similar set of comparisons for repressive factors (SP, TWIST). For a gene to be linked to a factor RE with a cis-edge, we require that the gene must be dynamic in at least one of the comparisons identified by the CDF analysis for that factor. In addition, our CDF analysis also identifies the maximum distance between a factor RE and a regulated gene for each comparison. The RE and the gene's TSS must be within the relevant distance threshold defined by the CDF.
All analysis details and code are available at https: //github.com/guertinlab/adipogenesis. Raw sequencing files and processed bigWig files are available from GEO accession record GSE133147 (PRO-seq) and GSE150492 (ATAC-seq).

Compartment modeling.
Detailed analysis and raw code is available at https: //github.com/guertinlab/modeling_PRO_composites. We calculated pause region signal by summing the PROseq signal over a 50 base pair window centered on the summit of the pause peak. The gene body RNA polymerase density was determined by averaging the PROseq signal over the region from the end of the pause window to the transcription termination site determined by primaryTranscriptAnnotation. We described RNA polymerase density dynamics in each compartment using differential equations that incorporate rates of transcription initiation, premature termination, pause release, and elongation. We determined how different rate constants would affect hypothetical pause region and gene body densities and the pause index using the pksensi package. We varied initiation, premature termination, and pause release rate constants from 0.01 to 1 molecules per second and varied elongation rate from 25 to 100 molecules per second and calculated compartment density at each step. We excluded any parameter estimates that result in more than 1 RNA polymerase molecule in the pause region at any given time. We selected all sets of parameters that resulted in the observed median composite pause index for all time points from Figure 4. Next, we sought to determine if changing the rate constants could explain observed changes in compartment density at different gene sets for the indicated time point comparisons. We varied rate constants from the early time point values to model RNA polymerase density ratio changes between time points. We varied the initiation and pause release constants from their initial values over a 5fold range in each direction and determined which sets of constants produced the observed changes in pause density ratio for each gene. We allowed the target ratio to vary by 5% compared to the observed. We plotted model RNA polymerase density curves by choosing the set of parameters with the elongation rate closest to a consensus rate of 2,500 bases / min (Ardehali and Lis 2009; Jonkers and Lis 2015), while still accurately reproducing the composite profiles densities within 5% of the original. In order to reproduce composite plots, we spread the pause density over a 50 base region and fit the density profiles to a waveform as previously described (Sathyan et al. 2019).   C) After clustering ATAC-seq peaks based on accessibility dynamics, we combined clusters into more inclusive response classes by cutting the dendrogram at the indicated height of 2. D) In addition to the motifs discussed in Figure 1, de novo motif analysis identified 8 factor families as potential regulators of ATAC-seq peak dynamics. These motifs are less enriched than those in Figure 1. E) We plotted enrichment of each motif family around peak summits as in Figure 1E. F) Composite motif extraction on all peaks with either SP or KLF motifs returns a KLF-like motif in the increased peaks and an SP-like motif in the decreased peaks, suggesting that KLF and SP factor families are associated with increased and decreased accessibility respectively. We scored each SP/KLF motif in a dynamic peak against an SP composite and a KLF composite and plotted the two scores against one another. Each data point is colored based on the dynamics of the peak. The dividing line separates points such that all motifs on the left are labeled as KLF motifs and all motifs on the right are labeled SP motifs. The line is drawn in such a way as to maximize the number of increased KLF peaks and decreased SP peaks. . dREG peak analysis reveals potential factors driving bidirectional transcription. A) We used a likelihood ratio test to define dREG peaks as dynamic or nondynamic using an adjusted p-value cutoff of 1 * 10 −5 . B) ATAC-seq and dREG-defined REs -including dynamic and nondynamic peaks -largely overlap in promoter regions. Genome regions are defined as described in Figure 2 legend. Overlapping categories are not equal between ATAC and dREG peaks because multiple ATAC peaks may overlap a single dREG peak and vice versa. C) Motif density plots identify CTCF, ELF, and TFAP2 as enriched in dREG peak summits over ATAC-seq peak summits. Since these factors were not found in the de novo analysis, we did not consider them dREG-enriched factors. D) Plotting PRO-seq signal around non-SP dREG factor motifs shows the bidirectional transcription phenotype but not as clearly as with the SP motifs.  genes, and C) the Twist2 gene. Nr3c1, Sp1, Sp3, and Sp4 all exhibit rapid early repression, while Twist2 is immediately and transiently activated. D) The UCSC genome browser shot shows that early activation of Twist2 gene has subsided by the time cells reach full maturity at 6 days. E-F) Box and whisker plots representing genes within 10 kb of either (E) a single increased peak or (F) a single decreased peak. Genes were split into four quartiles based on the distance between the peak summit and the TSS of the gene. Box and whisker plots on the left contain the smallest distances, while those on the right represent the largest. The y-value is the transcription of the genes over the first hour of the time course. We find that for genes near increased peaks, the closer the peak is to the gene the more likely gene transcription and peak accessibility dynamics will covary. We do not find a similar relationship for genes near decreased peaks. G) Box and whisker plots representing genes within 10 kb of either a single increased peak (red plot) or a single decreased peak (blue plot). The y-value represents the distance between the peak summit and the TSS of the gene. Genes near decreased peaks are generally closer to the peak summit than the increased peak-gene pairs. H-I) Scatter plot of genes dynamically transcribed between H) 60 and 0 minutes or I) 240 and 60 minutes. Each data point represents a gene. The x-value is total accessibility change over the time period scaled by the distance between the TSS and each peak. Positive x-values indicate increases in local accessibility. The y-value is change in transcription over the same time period. A gene with positive x and yvalues or negative x and y-values exhibits positive covariation between local accessibility and transcription dynamics. The majority of genes for both time point comparisons exhibit positive covariation, indicating a correlative relationship between accessibility and transcription. There is no correlation between normalized transcription and number of regulatory factors when controlling for changes in accessibility. F) Repressed genes proximal to SP motifs tend to exhibit a lower magnitude of transcriptional change. G) Repressed genes proximal to SP motifs tend to exhibit higher basal transcription. A) The percentage of each factor's attenuated cis-edges highlights the transient nature of gene expression changes in early adipogenesis. Early activators show the highest proportion of attenuated edges, meaning early activation is followed by a return to baseline expression. There are no attenuated SP cis-edges, meaning genes with decreased expression downstream of SP dissociation do not return to baseline within the time course. B) The percentage of each factor's attenuated trans-edges indicates that binding is more stable than gene expression changes. GR trans-edges are most likely to be attenuated. There are no attenuated SP trans-edges. C) TF genes are highly interconnected; arrows represent direct regulatory relationships between factors. Arrows emanate from regulatory factor family and point to target factor family. Solid arrows represent gene activation, blunt-ended arrows represent repression, and dashed arrows represent attenuation. The numbers indicate how many TF family members exhibit the indicated regulatory relationship. For example, the arrow from AP1 to KLF represents the 7 KLF family genes activated by AP1 factors. D) Wedged bar plots quantify the regulatory kinetics across the time course for indicated factors. The x-axis intervals represent the time range in which the specified factor regulates changes in accessibility of the indicated number of peaks (y-axis). Wedges between bars indicate carryover peaks from previous time interval and the outer "wings" represent peaks that are not included in the previous time interval. The top shaded purple wedges represent peaks regulated by multiple factors; bottom wedges represent peaks that are solely regulated by the indicated factor.