Regulatory signatures of drought response in stress resilient Sorghum bicolor

The effects of drought stress can be devastating to crop production worldwide. A grand challenge facing agriculture is the development of crop varieties with improved drought resilience through breeding or biotechnology. To accelerate this, a mechanistic understanding is needed of the regulatory networks underlying drought response pathways in crop genomes and the genetic elements that modulate them. In this study, we explore the regulatory landscape of sorghum [Sorghum bicolor (L.) Moench] in response to controlled-environment drought stress. Sorghum is a C4 cereal crop with innate drought resilience and is an untapped resource of allelic diversity. To define molecular signatures of drought response, we mapped genome-wide chromatin accessibility using an Assay for Transposase Accessible Chromatin by sequencing (ATAC-seq) and analyzed parallel transcriptional profiles in drought-stressed sorghum shoot and root tissues compared to well-watered controls. Drought-responsive changes in chromatin accessibility were largely found in proximal promoters of differentially expressed genes and also in distal regions of the genome. These data were integrated to infer gene network connections and cis-regulatory modules that underlie drought response in sorghum, including cross-talk among hormone and nutrient pathways and the transcription factors that control them. Our analyses provide drought-inducible regulatory modules in the sorghum genome that can be leveraged for fine-tuning responses to stress, mining diversity for advantageous alleles, and translating across species to ultimately improve productivity and sustainability in sorghum and closely related cereal crops.


Introduction
Crop loss due to drought stress can be devastating. Climate shifts are predicted to become increasingly more erratic, leading to water deficits and higher temperatures that result in expansion of drought-affected arable land (Stocker et al., 2013;Varshney et al., 2018b;Gupta et al., 2020). Development of drought resilient crop varieties is necessary to help mitigate this challenge and enhance global food security (Hu and Xiong, 2014;Reynolds et al., 2016). Sorghum [Sorghum bicolor (L.) Moench] is the fifth most widely grown cereal crop in the world and an important staple in semiarid and arid regions of sub-Saharan Africa and south Asia (Vietmeyer et al., 1996). Because of its innate resilience to drought, heat and low nutrient inputs, sorghum presents an excellent model for studying the molecular basis for abiotic stress tolerance (Tuinstra et al., 1997;Boyles et al., 2019). Sorghum also boasts extensive phenotypic and genetic diversity (Morris et al., 2013;Lasky et al., 2015), and its adaptation to a wide range of environments provides an untapped resource for gene discovery and an ideal target for accelerated crop improvement through genomics-enabled breeding or engineering. In addition to being a staple grain crop, sorghum has emerged as an attractive system for the development of dedicated bioenergy feedstocks on marginal soils with minimal water and nutrient inputs (Mullet et al., 2014;Brenton et al., 2016). Sorghum also occupies a key evolutionary node that bridges studies in annual diploid grasses such as Zea mays and Setaria, and its close perennial and polyploid relatives, Saccarum and Miscanthus. A comprehensive understanding of the genetic factors underlying stress resilience in sorghum and the effects of genotype on phenotype, will help pinpoint targets for its improvement as well as in other cereals through comparative genomics.
Drought response in plants is a combination of cellular, morphological, and physiological components (Hu and Xiong, 2014), some of which are mirrored in other abiotic stresses that also cause dehydration, such as cold and salt stress (Nakashima et al., 2014). Plant response to dehydration at the cellular level includes production of osmoprotectants such as proline and trehalose, and stress-protecting, hydrophilic proteins like Late Embryogenesis Abundant (LEA) proteins. LEA protein synthesis promotes desiccation tolerance in seeds, but their production is also induced in vegetative tissues in response to drought (Battaglia et al., 2008;Olvera-Carrillo et al., 2011). It is suggested that regulatory modules that control LEA gene expression and other aspects of seed desiccation tolerance have been re-deployed to enhance drought tolerance in plants (Lamaoui et al., 2018;Pardo et al., 2020). Overexpression of LEA genes has been shown to confer stress tolerance in various crops (Chandra Babu et al., 2004;Duan and Cai, 2012;Amara et al., 2013), as well as through natural variation in their promoters (Xiao et al., 2007).
A major integrator of drought response in plants is the isoprenoid phytohormone abscisic acid (ABA). ABA enhances adaptation to drought and other dehydration stresses by regulating a range of physiological processes including stomatal aperture dynamics and protein storage for osmoprotection (Yoshida et al., 2015;Sah et al., 2016). ABA biosynthesis is induced under drought stress (Qin and Zeevaart, 1999;Endo et al., 2008), and its rapid local synthesis is regulated in part via hydraulic signals from the root (Christmann et al., 2007(Christmann et al., , 2013. In the leaf, ABA-induced stomatal closing allows plants to control water loss by reducing transpiration, but this also limits gas exchange for photosynthesis (Munemasa et al., 2015). Therefore, root-to-shoot coordination of ABA signaling limits above-ground biomass accumulation while enhancing water uptake from the soil. While the ABA pathway itself has been a major target of stress response mitigation, water use and photosynthetic efficiency are key traits for enhancing drought tolerance in crops (Reynolds et al., 2016;Bailey-Serres et al., 2019).
Drought tolerant genotypes tend to be associated with morphological traits like stay-green, decreased canopy leaf area, decreased stomatal density, increased leaf thickness or folding, and enhanced root system architecture; and physiological traits such as reduced transpiration rate and stomatal conductance, CO2 assimilation rate and canopy temperature depression (Hu and Xiong, 2014). Variation in morpho-physiological characters that contribute to abiotic stress tolerance is modulated at the molecular level by the spatiotemporal activity of transcription factors (TFs) (Joshi et al., 2016;Mittal et al., 2018). Various TF families, such as dehydration-responsive element-binding factor (DREB), basic leucine zipper (bZIP), and NAM-ATAF-CUC2 (NAC), have been implicated in regulating drought responses (Nakashima et al., 2014;Takasaki et al., 2015;Wang et al., 2019). The ABA-responsive element-binding factors (ABRE/ABFs) are bZIP TFs that drive ABA-dependent responses through positive relay of ABA perception by SNF1-related kinase 2s (SnRK2s) (Fujita et al., 2013;Yoshida et al., 2015). Numerous studies have demonstrated improved drought tolerance in crops by overexpressing or inducing stress-related TFs (Hu and Xiong, 2014;Lamaoui et al., 2018). A detailed understanding of their regulation and how they are integrated within a larger molecular network governing whole plant growth and development will enable fine-tuning of plant response to stress, while maintaining productivity.
TFs influence target gene expression through binding of non-coding, cis-regulatory elements (CREs) in the genome. CREs typically occur in combinatorial arrangements in proximal promoters or distal regulators, and the coordination of chromatin accessibility and spatiotemporal expression of regulatory TFs orchestrates plant growth and development (Swift and Coruzzi, 2017;Brkljacic and Grotewold, 2017). When TF complexes bind DNA, they displace nucleosomes and the genomic region around their binding site becomes more accessible. Genome-wide maps of chromatin accessibility can therefore be used as a proxy for defining functional regions of DNA. Several methods have been used to generate accessibility maps in plant genomes including nuclease-based assays such as DNase I hypersensitivity and micrococcal nuclease (MNase) (Vera et al., 2014;Oka et al., 2017;Zhao et al., 2020), and an Assay for Transposase Accessible Chromatin (ATAC), which uses a Tn5 transposase that inserts into accessible regions of DNA (Buenrostro et al., 2015). Studies based on these methods have revealed dynamic shifts in chromatin accessibility during development or in response to stimuli (Sullivan et al., 2014;Reynoso et al., 2019), as well as conserved and unique regulatory signatures across species (Maher et al., 2018;Lu et al., 2019;Han et al., 2020). Accessible chromatin has been associated with heritable variation (Rodgers-Melnick et al., 2016;Parvathaneni et al.) and can guide construction of gene regulatory networks through inferences of TF-CRE interactions (Sullivan et al., 2014).
A knowledgebase of the genes and regulatory sequences that underlie stress responses in plants will enable predictions on gene regulation that can help improve crops. In this study, we explore the drought-responsive genome in sorghum, and define gene regulatory modules that underlie its stress resilience. Using RNA-seq-based transcriptome analyses coupled with ATAC-seq from the same samples, we annotated drought-inducible signatures in developing leaf and crown root tissues in response to a controlled-environment drought stress. Using a co-expression gene regulatory framework across tissues, we inferred core TF-DNA interactions and combinatorial regulation of downstream target genes in response to stress. Our results provide a resource for elucidating molecular responses to drought in sorghum and a tool kit of drought-inducible regulatory sequences for engineering improved varieties.

Genome-wide dynamics of chromatin accessibility in response to drought
To define genetic elements that underlie drought response in sorghum, including those implicated in drought resilience, we profiled the gene regulatory space in shoot and root tissues of young sorghum plants subjected to a controlled drought treatment, and well-watered (WW) controls. Sorghum plants of the reference genotype BTx623 were grown in a controlled-environment chamber and watered to 100 percent Soil Moisture Capacity (SMC) until 17 Days After Sowing (DAS). At this time, plants in the water stressed (WS) group were gradually brought down to 25 percent SMC by 21 DAS, replacing water lost by transpiration, and held for 48 hours before destructive sampling ( Figure 1A). At 23 DAS, plants in both WW and WS groups were sampled as follows: the emerging leaf was dissected out and sectioned across a developmental gradient (Li et al., 2010), the inner developing leaves within the emerging leaf whorl were sampled, and the crown roots sampled after root washing ( Figure 1B). All samples were collected for transcriptome profiling using RNA-seq.
For the inner developing leaf and crown root samples, we also profiled chromatin accessibility using ATAC-seq from the same exact sample used for RNA-seq. Each biological replicate represented a single plant, and we analyzed three biological replicates in each group. Each individual leaf or root sample was lightly ground in liquid N and then divided in half. Nuclei were isolated from one half and used immediately for ATAC-seq library preparation (Supplemental Figure 1). The other half was ground further for subsequent RNA extraction and RNA-seq library construction. ATAC-seq libraries were sequenced and mapped to the sorghum reference genome v3 (phytozome.jgi.doe.gov; Supplemental Table   1). Overall, we achieved higher coverage for the leaf datasets. Visual scans of the ATAC-seq data on a JBrowse viewer and correlation analysis revealed high concordance across biological replicates genomewide ( Figure 2A; Supplemental Figure 1). Regions of significant read pile-ups were called as 'peaks' using the algorithm HOMER (Heinz et al., 2010), and are proxies for chromatin accessible regions (Supplemental Data Set 1). Most of the accessible genome was shared across tissues and treatment groups ( Figure 2B).
We defined a set of 'unique peak' regions of accessibility across both tissues and treatments that showed consistency across biological replicates (see Methods; Supplemental Table S2, Supplemental Data Set 2). Coverage of ATAC-seq reads in relation to protein-coding gene models showed strong enrichment of signal in the proximal promoter regions, and to a lesser extent at the 3' ends ( Figure 2C).
Distribution of unique peaks across genic and intergenic features showed that indeed, the majority of these accessible regions were located within the proximal promoters and 5' UTRs of genes ( Figure 2D; Supplemental Table S3). In addition, a substantial portion (29%) of accessible peaks were categorized as intergenic (excluding space within 2 kb up-and 1 kb down-stream of annotated gene models).
To define regions of the genome that showed differential accessibility in response to drought, we adapted a pipeline from (Sullivan et al., 2019). High-confidence ATAC peaks from both treatment groups were merged into a set of 'union peaks', and this was done separately for leaf and root (see methods). A statistical test based on the differences in number of ATAC transposition sites for each union peak between WW and WS plants, determined a set of Differentially Accessible Regions (DARs). Across the sorghum genome, we identified 6,496 and 2,275 drought-responsive DARs (median size ~443 bp) in developing inner leaf and crown root, respectively, and identified the closest gene model(s) within 10 kb to each DAR (Supplemental Data Set 3; Supplemental Table S3). This allowed us to link intergenic DARs to proximal genes and to investigate gene-associated DARs across tissues. For example, a droughtinduced DAR 3 kb upstream of the sorghum ortholog of viviparous 1 (vp1) from maize (Sobic.009G221400), was found in leaf but not root ( Figure 3E). In the genomics region around a GA-2 oxidase gene, there are both leaf-specific accessible chromatin signatures and a root-specific DAR ( Figure   3F).

Drought-induced gene regulation links hormone, stress, and physiology pathways in sorghum
We also profiled transcriptional changes in response to drought using RNA-seq in the exact same samples processed for ATAC-seq. In the inner leaf, 4,307 genes were differentially expressed (DE) in response to drought (FDR ≤ 0.05; ≥ 2 fold change), and 2,303 in crown roots ( Figure 3A; Supplemental Figure S2; Supplemental Data Set 4). Overall, gene expression differences in response to drought were larger in the developing leaf compared to the crown root ( Figure 3A). In addition, genes that were up-regulated in the leaf tended to show larger differences in expression, which coincided with enhanced chromatin accessibility at their promoters and was not observed for down-regulated genes (Supplemental Figure   S3A). Among the DE genes, 1,568 (36%) in inner-leaf and 712 (31%) in crown root were associated with DARs, and 248 of these were shared between both tissues ( Figure 3B;Supplemental Figure S3B; Supplemental Data Set 3). These genes are hereinafter referred to as differentially regulated. We classified differentially regulated genes as having correlated or anti-correlated expression values relative to the direction of accessibility change in the associated DAR. For example, is increased (+) or decreased (-) gene expression associated with more (+) or less (-) DAR accessibility in response to drought? We observed a strong positive correlation among gene expression and accessibility, with a higher number of genes positively correlated with DARs than anti-correlated ( Figure 3C). This potentially suggests more prominent examples of TF activator activity compared to repressor activity.
Gene Ontologies (GO) were used to test for overrepresentation of functional classes among differentially regulated genes. In both developing leaf and crown root, there was significant enrichment of genes associated with various, but related, stress responses (e.g., water deprivation, cold, salinity), sugar metabolism, and several hormone response pathways, including ABA ( Figure 3D; Supplemental Data Set 5). Certain functional categories showed tissue-specific enrichment and there were also differences in functional enrichment among genes showing up-and down-regulation in leaf and root tissues ( Figure   3D). For example, DE genes specifically up-regulated in the crown root were enriched for functional categories related to oxidative stress and protein folding, while down-regulated genes in the developing leaf were enriched for functional categories associated with developmental progression; e.g., 'regulation of meristem growth' (GO:0010075) and 'stomatal lineage progression' (GO:0010440). The latter being consistent with repression of developmental processes in response to stress, and perhaps reduction of stomatal patterning in early leaf development. Most genes related to hormone synthesis and signaling were up-regulated and associated with increased chromatin accessibility (+/+) in leaf and root, however biosynthesis and response of brassinosteroids (BRs) showed dampened regulation (-/-) in the leaf.
Response to cytokinin was enriched only among differentially regulated genes in the root. Interestingly, 'nitrate transport' (GO:0010167) was overrepresented among genes that were up-regulated in leaf, but down-regulated in root ( Figure 3D).
In addition to transcriptional changes in the developing inner leaf, we also profiled gene expression along the developmental gradient of the emerging leaf in response to drought. RNA-seq was performed on four distinct sections from base to tip, which capture distinct zones of developmental and metabolic transitions in grass leaves ( Figure 1A; Supplemental Data Set 6). The base leaf section, closest in progression to the developing inner leaf, is where developmental decisions are still occurring, while C4 photosynthesis and associated metabolism progresses towards the leaf tip (Li et al., 2010). As was shown in maize, Setaria, and rice, suites of TFs were differentially regulated along the developmental timeline of the leaf (Wang et al., 2014) as well as hormone responses (Supplemental Data Set 6). We also observed dynamic expression profiles in response to drought for many differentially regulated genes, including those with known functions in ABA synthesis and signaling as well as related to carbon-nitrogen metabolism and allocation and osmoprotection ( Figure 3E).

Drought-responsive DARs are enriched for binding sites of stress-associated TFs
To infer putative TF binding sites within DARs proximal to drought-responsive genes, we tested for enrichment of CREs. DARs were grouped based on their response to drought (more (+) or less (-) accessible), and the correlation of this response with that of the proximal genes (expression increase (+) or decrease (-)). An enrichment test for de novo motifs was performed on these subsets of DARs and Position Weight Matrices (PWMs) for enriched sequences were compared to known PWMs in the JASPAR plant database. Best matches to experimentally validated PWMs from plants were reported (Supplemental Data Set 7), which include binding sites for several TF families previously implicated in abiotic stress response; e.g., ABRE, DREB, and NAC.
We compared de novo motifs that were discovered between leaf and root, and among up-and down-regulated gene sets using STAMP (Mahony and Benos, 2007). This revealed tissue-specific preferences for certain motifs as well as motifs that were preferentially found in DARs associated with drought induced or repressed genes ( Figure 4A; Supplemental Data Set 7). Some elements were found more broadly; for example, the GATA-like motif (CGRTCS) was significantly enriched across all DAR subsets examined ( Figure 4A). The rice gene OsGATA8 has been shown to positively regulate genes related to photosynthetic efficiency and root biomass under osmotic stress, and negatively regulate genes Since TFs act in a combinatorial manner to modulate the stress response, we can leverage their common expression profiles and CREs that co-occur within DARs to infer TF-DNA interactions. We used the enriched set of PWMs to scan DARs associated with DE genes to define combinations of CREs that may provide insight into the TFs that modulate them in response to drought. For example, a gene encoding a LEA protein (Sobic.003G271800) was up-regulated in response to drought and its accessible promoter harbored several CREs including CRF4 and HY5/ABRE that were enriched in leaf (+/+) DARs ( Figure 4C). Alternatively, a gene encoding a small auxin up-regulated RNA (SAUR) was downregulated in response to drought and its promoter included several CREs found associated with (-/-) DARs, such as bHLH ( Figure 4C). ABRE-like sequences (GMCACGY; E-value = 2.3e-41) were among the most enriched in (+/+) DARs in both inner leaf and crown root. Expression of ABA-induced genes is typically driven by the presence of multiple ABA-responsive elements (ABRE) motifs or a combination of ABRE and a coupling element (CE) in the promoter (Yoshida et al., 2010;Shen et al., 1996;Hobo et al., 1999). A well-characterized aspect of this regulation in promoting desiccation tolerance is coordinated regulation of ABA responsive genes by the bZIP TF ABI5 (ABRE) and B3 TF ABI3 (orthologous to maize vp1). ABI3-like TFs bind the Ry element (CATGCA). Within leaf DARs (+/+ category), 42.5 % of the B3 domains annotated were in close proximity (within 200 bp) to ABRE motifs suggesting co-binding of these TFs during gene activation in response to drought stress (Supplemental Figure S4).

Gene network analysis resolves drought-associated regulatory modules in sorghum
We expect that genes with highly similar expression profiles across tissues and treatments could potentially work together in a given pathway, and/or be regulated by a common set of TFs. To help resolve TF-DNA interactions in response to drought, we used a weighted gene co-expression network analysis (WGCNA) coupled with a random forest classifier to construct a gene regulatory network based on the collective RNA-seq data across WW and WS samples. We used the WGCNA algorithm (Langfelder and Horvath, 2008) to generate a co-expression network that included 22,847 nodes (genes) in 23 distinct co-expression modules (Supplemental Figure S5). Module eigengenes (ME; the expression pattern that best fits an individual module across all samples) were evaluated for their significant associations with specific tissue types and the drought treatment ( Figure 5A). Certain MEs showed tissuespecific expression patterns; e.g., MElightcyan showed a strong association with specificity to the inner developing leaf modules and MEroyalblue with crown roots ( Figure 5A; Supplemental Figure 6).
Several modules showed strong associations with the drought response. Among these, MEcyan, MElightyellow, MEyellow and MEgrey60, showed positive correlations with drought response (R 2 > 0.50) and MEdarkred showed a strong negative correlation (R 2 = -0.67; Figure 5A). The ME profiles among positively regulated modules (MEcyan, MEyellow and MElightyellow) were closely related but showed distinct features ( Figure 5B; Supplemental Figure 6). A relatively high proportion of genes within cyan and lightyellow modules, 41 and 34 percent, respectively, were differentially regulated in response to drought in either inner leaf or crown root. Among genes within these modules, we observed overrepresentation of many functional categories related to various types of dehydration stress, ABA signaling, and osmotic adjustment (Supplemental Data Set 9).
We also integrated the co-expression network with information derived from regulatory interactions among TFs and their putative targets using the GENIE3 algorithm (Huynh-Thu et al., 2010).
We annotated 1,277 TFs in the network based on PlantTFDB (Jin et al., 2017) and used their network trajectories to identify connections with potential target genes. There were 179 TFs among the five drought-associated modules and we focused on TF-target interactions within these modules. This enabled us to identify suites of genes that were co-expressed in response to drought, and to predict potential TF-DNA interactions based on motif scans of DARs within promoters of co-expressed genes. We used the PWMs that were identified as enriched in DARs ( Figure 4A) and scanned all DARs associated with DE genes co-expressed within these five modules. This allowed us to determine co-occurrences of CREs across these drought-associated regulatory regions and strengthen predictions on TF-DNA interactions based on the GRN.
Of the TFs associated with the five drought modules, 84 were DE in response to drought and 37 were predicted to be network hubs across these modules. Four of these TFs are putative AREB/ABF transcription factors based on homology. While the regulatory networks associated with action of ABAinduced ABF TFs have been well-studied in Arabidopsis (Song et al., 2016), relatively little is known in grasses. We identified four enriched motifs from the DREME analysis that contained the "ACGT" core ABF binding motif sequences, which were further used for presence/absence screening in DARs of putative target genes. We examined target genes of Sobic.010G081800 (SbAREB1/ABF2) and Sobic.007G155900 (SbABF3) that were DE, DAR proximal and contained enriched ABF motifs. This resulted in 51 SbABF2 and 38 SbABF3. Of these, 23 targets were shared between the two ABF TFs. Several predicted targets are known genes in drought response such as PP2C, ABI-binding, SNRK3.14, SNF1, heat shock proteins, snd ABI1-like. Nine and 5 targets of SbABF2 and SbABF3, respectively, were also TFs. All but three of the combined ABF targets were up-regulated.

Discussion
Global demand for cereals is expected to reach 3 billion tons in 2050, with increased demand largely coming from developing countries in Asia and Africa (Alexandratos and Bruinsma, 2012). In addition, yields have begun to plateau for major cereals crops (Ray et al., 2012;Grassini et al., 2013). To mitigate this, step changes are needed in crop improvement, and climate-resilient crop species are an untapped resource for novel genetic and genomic information (Reynolds et al., 2016;Varshney et al., 2018a). There have been some successes using biotechnology approaches to overexpress drought-responsive genes in crops to enhance drought tolerance (Hu and Xiong, 2014;Shi et al., 2015;Lamaoui et al., 2018). Often, however, these result in growth defects or yield losses, particularly in favorable environments.
Accelerating crop yield to meet these growing global demands is going to require an understanding of the mechanisms governing existing traits that can be leveraged for enhancing productivity (Bailey-Serres et al., 2019). Our analysis of regulatory components across the sorghum genome and in response to a controlled drought stress, provides a resource for interrogating the mechanisms of gene regulation in this stress resilient crop.
These analyses in sorghum highlighted the induction of several conserved drought-responsive pathways, including components of ABA synthesis, signaling and response. Many of the associated genes in these pathways were differentially regulated in response to drought, and we identified cis-elements that potentially drive their expression through TF-DNA interactions. In general, we observed more upregulation in drought-responsive genes, which was consistent with enhanced chromatin accessibility. This suggests transcriptional activation genome-wide, which is the general mode of action in ABA-dependent signaling cascades. The ABA-dependent drought responses are propagated largely through the action of ABF/ABRE TFs, which are induced via phosphorylation by ABA-responsive SnRK2s. This phosphorylation allows for the rapid action of ABFs to transduce the drought response. The ABAmediated ABF regulome has been largely mapped out in Arabidopsis (Song et al., 2016), and it is expected to be conserved to some degree across species. However, our specific knowledge of this is lacking in crop species; any diversification or specialization in function can be control points for regulating drought response. Our network analyses revealed several ABF TFs positioned as hubs in the drought-associated sub-networks. The two ABFs were predicted to share several direct target genes, including heat shock factors (HSFs), G-box binding factor (GBF), and an AP2/ERF TF, but also distinct sets of targets. Interestingly, SbAREB1/ABF2 was predicted to directly target genes related to circadian clocks and flowering, including ELF3 and GI. Understanding the spatiotemporal action of the different drought responsive ABF TFs in sorghum can provide clues on how to manipulate their expression and target genes.
In addition to ABA-related processes, there were signatures of regulation by BRs and ethylene, which have also been implicated in various aspects of abiotic stress response. Notably, genes associated with BR synthesis and response pathways were down-regulated in response to drought and also Our study was performed in a controlled growth environment, and we could isolate the drought response more accurately by controlling for other variables. Of course, in a drought-stressed field environment, the molecular and physiological responses become more complex through interactions with other stresses that are typically coupled with drought, e.g. heat, salt or low nutrient stresses. Molecular signaling components of these stress pathways can also overlap, and perhaps rewiring drought-specific responses to some degree. Our functional enrichment analyses showed that DE genes in response to drought were associated with several types of stress and stress hormones, suggesting reuse of many genes in various stress contexts. Several of these are TFs that interface stress response pathways such as cold, heat, and salt (Nakashima et al., 2014). The knowledge gained here provides a set of drought-responsive promoters and predicted TF-DNA interactions that will help fine-tune a plant's response to its environment. This is a first look at the drought responsive genome in stress tolerant sorghum. We observed both known and novel aspects of drought response and inferred regulatory modules that function at the core of this response in young sorghum plants. Sorghum boasts a wide range of adaptation to various local climates, which likely underlies its stress tolerance and provides a repository for identifying stress loci. Extending these analyses to other genotypes will further enhance our understanding of how natural variation regulates response to the environment at the molecular level. Leveraging the extensive genetic diversity in sorghum can help pinpoint functional variants in gene regulation (e.g., regulatory elements that improve stress resilience with minimal disruption to the complex gene networks governing plant growth and development). Understanding the regulatory components that control stress response in resilient crops will help us fine tune-tune crops to improve success in various contexts.

Experimental design and tissue collection.
BTx623 sorghum seeds were germinated on peat moss in petri dishes for 3 days and then transplanted into a Metromix360/Turface MVP blend soil in 9x2x2 tree pots. Plants were grown in a controlled, highlight chamber at the Donald Danforth Plant Science Center Plant Growth Facility in 500µMol light, 14 hour days, 28°C days, 24°C nights. Pots were weighed each morning and drought stressed plants were watered based on what they lost the previous day to transpiration. Soil moisture content (SMC) and field capacity was measured by drying soil in a 37°C drying chamber for 4 days, weighing out material and adding water to retention capacity and re-weighing. Total water weight was calculated, and used as reference for appropriate watering targets. From 12-14 DAS, 10 mL of the respective solution was applied and pots were maintained at 80% SMC. From 15-23 DAS, plants were watered with reverse osmosis (RO) water and maintained at either 80% SMC (WW) or 25% (WS). Plants were sampled at 24 DAS in a random order, 1 hour after lights on and ~22 hrs since the previous watering. The emerging leaf was carefully dissected down to the ligular region and the developing leaves, the leaves within the emerging leaf, were removed and segmented into six 1 cm sections (1,2,3 cm above the ligule; 2 cm above and 2 cm below the midpoint or where the leaf had emerged from the whorl; and 4 cm from the tip of the leaf) were dissected from the plants and flash frozen in liquid N. The base of the stem was pulled from soil to remove crown roots. Roots were quickly washed, dried, and flash frozen in liquid N.

RNA-seq library preparation.
We generated a total of 17 full length RNA libraries for inner-leaf and root tissues (3 replicates per treatment; 2 replicates for root N-stress). Total RNA was extracted from the ~100 mg of ground tissue (inner-leaf and root) using an in-house Trizol extraction protocol. DNA contamination was removed using the turbo DNA-free kit (invitrogen) following the manufacturers settings and the RNA quality was checked using the Agilent Bioanalyzer RNA chip. RNA libraries were prepared using ~950 ng the DNAase treated total RNA using the NEBNext Ultra Directional RNA Library Prep Kit (Illumina) size selected for 200bp insert size. The adapters supplied in the NEBNext kit were used and 12 PCR cycles were used for the cDNA amplification. The final libraries were quality checked using the Agilent bioanalyzer using a DNA 1000 chip, 9 libraries were pooled to a single lane and 100bp SE reads were generated using the Illumina HiSeq 4000 platform (University of Illinois at Urbana-Champaign W.M. Keck Center). We obtained an average of 85 million reads per library.
For the leaf gradient samples, we generated 48 RNA-seq libraries using the Lexogen 3' mRNA-seq Library Prep Kit following the manufacturer's protocol and amplified using 12 cycles. Libraries were sequenced toan average 4M reads per library.

Analysis of RNA-seq data.
Single-end (SE) 100 nt reads (and paired end 150 nt) were adapter-and quality-trimmed using Trim RNAseq from leaf gradient samples were performed using Quantseq 3' FWD (Lexogen) kit. As read2 of the read pairs primarily contains PolyA sequences, only read1 was used for further processing.
Reads were adapter-and quality-trimmed using Trim Galore with parameters:--stringency 3 --clip_R1 13 --length 20 -q 20 followed by a second run with --stringency 3 -a A{10} --length 20 -q 20 in order to remove any PolyA sequences. Quantseq 3' end sequencing method typically represents reads near the 3' end of transcripts. We observed several instances of poorly annotated genes with completely missing or a short 3' UTR which led to reads mapping out of the genes. To minimize such issues, we curated the gene models by extending the 3' end by 250 nt or 750 nt for the genes with or without annotated 3' UTR respectively. Transcript quantification was performed using the quasi-mapping method, Salmon (parameters: --validateMappings --incompatPrio 0.0 --noLengthCorrection --noEffectiveLengthCorrection --noFragLengthDist --libType SF).

Nuclei isolation and ATAC-seq library construction
Flash-frozen inner-leaf and crown root tissue were ground in liquid nitrogen and ~0.2-0.3 g of tissue was aliquoted into a 15 mL falcon tube. Ground tissue was resuspended in 4 mL of 1x nuclei isolation buffer (16 mM HEPES; pH8, 200 mM sucrose, 0.8 mM MgCl2, 4 mM KCl, 32 % Glycerol, 0.25% Triton X-100, 1x complete protease inhibitor, 0.1% 2-ME, 0.1 mM PMSF) very gently at 4°C for 20 minutes and then filtered through 2 sheets of mira cloth. The resulting elutent (~3 mL) was equally split into two 2 mL eppendorf tubes and centrifuged at 1,000 x g for 15 minutes. After discarding the supernatant, the nuclei pellets in the two eppendorf tubes were resuspended in 400 µL of 1x tagmentation buffer, combined into a single tube and centrifuged at 1,000 x g for 5 minutes as a wash step (total 2 x washes). The nuclei were resuspended in 100 µL of 1x tagmentation buffer and observed under a microscope (2% acetocarmine stain) to check nuclear integrity. Nuclei were counted using a hemocytometer, and approximately 50,000 nuclei were used for tagmentation.
Libraries were amplified with 12 cycles. Libraries were diluted to 50 µL and cleaned up with a two-sided Ampure XP bead size selection. A 0.5:1 bead:sample ratio followed by a 1.2:1 bead:sample ratio was used to select ~200-1000 bp libraries. Sequencing was performed using Illumina HiSeq 4000 paired-end (PE) 50 platform.
The distribution of the high confidence ATAC-seq peaks and DARs across gene features was performed using custom scripts using R package Genomic Ranges. The midpoint of the ATAC-seq peaks were used for assignment to each of the gene features. Primary gene transcripts of the S. bicolor v3.1.1 annotations were used. The first gene within a 10kb distance (on either side of DAR) is considered to be cis-associated with the DAR.

Motif analysis
De novo motif analysis was performed using the Discriminative Regular Expression Motif Elicitation (DREME) tool in the MEME suite (Bailey, 2011) with default settings. Random genomic fragments of equal size to DARs from all ATAC HS regions combined were used as the background control. Resulting de novo motifs were compared to the JASPAR plant database (2018) to identify the closest matches using TOMTOM (Gupta et al., 2007). To compare de novo motifs across tissues and treatment groups, we used the STAMP (Mahony and Benos, 2007) web interface (http://www.benoslab.pitt.edu/stamp/). Genomic locations of de novo motifs were identified using Find Individual Motif Occurrences (Bailey et al., 2009) with settings " --max-strand --parse-genomic-coord --max-stored-scores 1000000 --thresh 0.0005". The background file for FIMO was generated using all the accessible peaks.

Gene Co-expression Network Analysis
A gene co-expression network was constructed using the WGCNA v1.69 package in R (Langfelder and Horvath, 2008). Expression data (TPM) from inner leaf, leaf gradient and root (44 samples) was used as input. Genes with low expression (< 5 TPM in at least 3 samples) were filtered out. Expression values from 22,847 genes were log2 transformed and a signed network was created using following parameters: corType="pearson", maxBlockSize=25000, networkType="signed", power=14, minModuleSize=30, deepSplit=2, mergeCutHeight=0.25). Based on the scale free topology model fit, a softthresholding power of 14 was selected. The network identified 23 modules of genes with distinct eigengenes each represented by a different color. The relationship between modules (module eigengene) and traits was determined and reflected as correlation coefficient and p-value. Treatments (Control and drought) as well as tissue types (inner leaf, Base+2, M+2, P+4, Tip and root) were considered as traits. Module-trait correlation of R 2 > 0.5 were considered as highly correlated and the corresponding modules were further analyzed. Hub genes for each module were identified by selecting genes with module membership > 0.8.

Target Gene Prediction
Putative target genes of transcription factors in the drought modules were identified by R implementation of GENIE3 (Huynh-Thu et al., 2010) using the expression values with following parameters: treeMethod= "RF", K= "sqrt", nTrees= 1000. Regulatory links with weight >0.005 were used for further analysis. Target genes were filtered by those that were DAR proximal and differentially expressed followed by the presence of corresponding TF binding motifs (obtained from the DREME and scanned by FIMO described above) in the associated DARs. Target network models for SbAREB1/ABF2 and SbABF2 were created using select filtered targets in the igraph R package (Csardi et al., 2006).

Data Accessibility
Raw data have been deposited in NCBI SRA as Bioproject ID PRJNA655502 and will be available upon publication.         Table S1: Mapping statistics for the ATAC-data from inner-leaf and root Supplemental    The xerico (xero) 1 gene is highlighted with a drought-specific signature in its promoter.

Author Contributions
(B) Overlap of genome accessible space across tissues and treatments. ~83% of the accessible genome was shared between drought and control treatments (87% in leaf and 78% in root) and ~78% was shared between tissues (75% in control and 81% in drought).
(C) Read coverage of the ATAC-seq across the consensus gene model in inner leaf and root tissue. A window of 2kb upstream and downstream of the consensus gene model was used.
(D) Distribution of the control inner-leaf and root ATAC peaks across the genomic features.
(E) A leaf-specific DAR is shown 3 kb upstream of the vp1 ortholog.
(F) Root-and leaf-specific chromatin accessibility shown around a gene encoding GA-2 oxidase.  (A) A subset of de novo motifs that were enriched in DARs correlating with expression of proximal genes (+/+ or -/-) were compared based on their relative enrichment in each category or between leaf and root tissues. PWMs shown reflect top match in JASPAR Plant DB. The enrichment scores for each PWM were compared pairwise to motifs found in all other categories to determine a similarity value (e-value).
(B) Expression differences of TFs associated with certain enriched motifs were plotted based on their log fold change differences between WW and WS across the inner leaf, developing leaf gradient, and crown root samples. Highlighted are orthologs of classical maize genes and those involved in drought stress. The sorghum orthologs of AREB1/ABF2 (left) and ABF3 (right) were identified as hub TFs in the cyan and yellow modules, respectively. Based on our GRN predictions, direct target genes of each within the five drought-associated modules were selected if they had a DAR within their proximal promoter (within 2 Kb) and curated ABF binding motifs within these. Nodes (genes) are represented as circles with edges connecting to their predicted targets. Common targets between ABFs are colored in gold and unique targets in light grey. The outline color of each circle indicates the module identity from the WGCNA.