A transcription factor quintet orchestrating bundle sheath expression in rice

C4 photosynthesis has evolved in over sixty plant lineages and improves photosynthetic efficiency by ∼50%. One unifying character of C4 plants is photosynthetic activation of a compartment such as the bundle sheath, but gene regulatory networks controlling this cell type are poorly understood. In Arabidopsis a bipartite MYC-MYB transcription factor module restricts gene expression to these cells but in grasses the regulatory logic allowing bundle sheath gene expression has not been defined. Using the global staple and C3 crop rice we identified the SULFITE REDUCTASE promoter as sufficient for strong bundle sheath expression. This promoter encodes an intricate cis-regulatory logic with multiple activators and repressors acting combinatorially. Within this landscape we identified a distal enhancer activated by a quintet of transcription factors from the WRKY, G2-like, MYB-related, IDD and bZIP families. This module is necessary and sufficient to pattern gene expression to the rice bundle sheath. Oligomerisation of the enhancer and fusion to core promoters containing Y-patches allowed activity to be increased 220-fold. This enhancer generates bundle sheath-specific expression in Arabidopsis indicating deep conservation in function between monocotyledons and dicotyledons. In summary, we identify an ancient, short, and tuneable enhancer patterning expression to the bundle sheath that we anticipate will be useful for engineering this cell type in various crop species.


Introduction
In plants and animals significant progress has been made in understanding transcription factor networks responsible for the specification of particular cell types. In animals, for example, homeobox transcription factors define the body plan of an embryo (Lewis 1978; Krumlauf 1994), and cardiac cell fate is specified by a collective of five transcription factors comprising Pnr and Doc that act as anchors for dTCF, pMad and Tin (Junion et al. 2012). In plants the INDETERMINATE DOMAIN (IDD) transcription factors work together with SCARECROW and SHORTROOT to specify endodermal formation in the root (Moreno-Risueno et al. 2015; Drapek et al. 2017), PHLOEM EARLY (PEAR) and VASCULAR-RELATED NAC DOMAIN (VND) transcription factors permit production of phloem and xylem vessels respectively (Kubo et al. 2005; Miyashima et al. 2019), and basic helix-loop-helix (bHLH) transcription factors determine differentiation of guard cells (MacAlister et al. 2006; Ohashi-Ito and Bergmann 2006; Pillitteri et al. 2006; Kanaoka et al. 2008). Moreover, transcription factor networks that integrate processes as diverse as responses to external factors such as pathogens and abiotic stresses (Nakashima et al. 2009; Tsuda and Somssich 2015), or internal events associated with the circadian clock (McClung 2006; Nagel and Kay 2012) and hormone signalling (Depuydt and Hardtke 2011; Verma et al. 2016) have also been identified. Transcription factor activity is decoded by short cis-acting DNA sequences known as enhancers. The binding of multiple transcription factors to enhancers thus controls transcription and the spatiotemporal patterning of gene expression. For example, the Block C enhancer interacts with the core promoter to activate expression of FLOWERING LOCUS T (FT) in long days (Adrian et al. 2010; Liu et al. 2014), and a distant upstream enhancer controls expression of the TEOSINTE BRANCHED1 locus in maize responsible for morphological differences compared with the wild ancestor teosinte (Stam et al. 2002; Clark et al. 2006). In contrast to the above examples, transcription factors and cognate cis-elements responsible for the operation of cell types in grasses once specified have not been defined (Weber et al. 2016; Schmitz et al. 2022).
Given the increased specialisation of organs evident since the colonisation of land, this lack of understanding of gene regulatory networks controlling cell specific gene expression is striking. For example, in the liverwort Marchantia polymorpha the photosynthetic thallus contains seven cell types (Wang et al. 2023), while leaves of Oryza sativa (rice) and Arabidopsis thaliana possess at least fifteen and seventeen populations of cells respectively, as defined by single-cell sequencing (Wang et al. 2021). In leaves of these angiosperms, particular cell types are specialised for photosynthesis, and so whilst photosynthesis gene expression is induced by light in all major cell types of the rice leaf, the response is greater in spongy and palisade mesophyll cells compared with guard, mestome and bundle sheath cells (Swift et al. 2023). In the case of the bundle sheath, these cells carry out photosynthesis, but are specialised to allow water transport from veins to mesophyll, sulphur assimilation and nitrate reduction (Leegood 2008; Aubry et al. 2014b; Hua et al. 2021). And, strikingly in multiple lineages, the bundle sheath has been dramatically repurposed during evolution to become fully photosynthetic and allow the complex C4 pathway to operate (Sage 2004).
Compared with the ancestral C3 state, plants that use C4 photosynthesis operate with higher light, water and nitrogen use efficiencies (Makino et al. 2003; Sage 2004; Mitchell and Sheehy 2006). It is estimated that introducing the C4 pathway into C3 rice would allow a 50% increase in yield (Mitchell and Sheehy 2006; Hibberd et al. 2008), but it requires multiple photosynthesis genes to be expressed in the bundle sheath, including enzymes that decarboxylate C4 acids to release CO2 around RuBisCO, organic acid transporters, components of the Calvin-Benson-Bassham cycle, RuBisCO activase, and enzymes of starch biosynthesis (Kajala et al. 2011; Aubry et al. 2014a; Ermakova et al. 2021). In summary, although the bundle sheath is found in all angiosperms and associated with multiple processes fundamental to leaf function, the molecular mechanisms responsible for directing expression to this cell type, including in global staple crops, remain undefined. We therefore studied the bundle sheath to better understand the complexity of gene regulatory networks that operate to maintain function of a cell type once it has been specified. Rice was chosen as it is a global crop, and identifying how it patterns gene expression to the bundle sheath could facilitate engineering of this cell type.
We hypothesised that analysis of endogenous patterns of gene expression in the rice bundle sheath would allow us to identify a strong and early-acting promoter for this cell type. Once such a promoter was identified we also hypothesised that it could be used to initiate an understanding of the cis-regulatory logic that allows gene expression to be patterned to this cell type in grasses. We tested twenty-five promoters from rice genes that transcriptome sequencing indicated were highly expressed in these cells. Of these, four specified preferential expression in the bundle sheath, and one derived from the SULFITE REDUCTASE (SiR) gene generated strong bundle sheath expression from plastochron 3 leaves onwards. Truncation analysis showed that the bundle sheath expression pattern from the SiR promoter is mediated by a short distal enhancer and a pyrimidine patch in the core promoter. This bundle sheath module is cryptic until other enhancers acting to both constitutively activate and repress expression in mesophyll cells are removed. The enhancer is composed of a quintet of cis-elements recognised by their cognate transcription factors from the WRKY, G2-like, MYB-related, IDD and bZIP families. These transcription factors act synergistically and are sufficient to drive expression from the strong bundle sheath SiR promoter.

The SiR promoter directs expression to the rice bundle sheath
To identify sequences allowing robust expression in rice bundle sheath cells we used data derived from laser capture microdissection of bundle sheath strands and mesophyll cells from mature leaves. Promoter sequences from seven of the most strongly expressed genes in bundle sheath strands (Supplemental Figure 1A) were cloned, fused to the β-glucuronidase (GUS) reporter and transformed into rice. Although five of these promoters (MYELOBLASTOSIS, MYB; HOMOLOG OF E. COLI BOLA, bolA; GLUTAMINE SYNTHETASE 1, GS1; STRESS RESPONSIVE PROTEIN, SRP; ACYL COA BINDING PROTEIN, ARP) led to GUS accumulation, it was restricted to veins (Supplemental Figure 1B, 1C). For the SULFATE TRANSPORTER 3;1 and 3;3 (SULT3;1 and SULT3;3) promoters, no staining was observed (Supplemental Figure 1B, 1C). The approach of cloning promoters from bundle sheath strands therefore appeared to be more efficient at identifying sequences capable of driving expression in veins. We therefore optimised a procedure allowing bundle sheath cells to be separated from veins (Hua and Hibberd 2019) and produced high quality transcriptomes from mesophyll, bundle sheath and vascular bundles (Hua et al. 2021). From these data eighteen genes whose transcripts were more abundant in bundle sheath cells compared with both veins and mesophyll cells were identified (Supplemental Figure 2A). When the promoter from each gene was fused to GUS and transformed into rice, those from ATP-SULFURYLASE 1B, ATPS1b; SULFITE REDUCTASE, SiR; HIGH ARSENIC CONTENT1.1, HAC1.1; and FERREDOXIN, Fd were sufficient to generate expression in the bundle sheath (Supplemental Figure 2B). However, ATPS1b and Fd also displayed weak activity in the mesophyll, and the HAC1.1 promoter also led to GUS accumulation in epidermal and vascular cells. Thus, only the SiR promoter drove strong expression in the bundle sheath with no GUS detected in other cells (Supplemental Figure 2B, 2C). In summary, most candidate promoters failed to generate expression that was specific to bundle sheath cells, but the region upstream of the rice SiR gene was able to do so. We therefore selected the SiR promoter for further characterisation.

The SiR promoter drives strong and early expression in bundle sheath cells
Sequence upstream of the SiR gene comprising nucleotides -2571 to +42 relative to the predicted translational start site was sufficient to generate expression in the rice bundle sheath. To allow faster analysis of sequences responsible for this output we domesticated the sequence by removing four BsaI and BpiI sites such that it was compatible with the modular Golden Gate cloning system. When this modified sequence was placed upstream of the GUS reporter it also generated bundle sheath preferential accumulation (Figure 1A). Fusion to a nuclear-targeted mTurquoise2 fluorescent protein confirmed that the SiR sequence was sufficient to direct expression to bundle sheath cells, and also revealed expression in the longer nuclei of veinal cells (Figure 1B). Expression from the domesticated and non-domesticated sequences was not different (Figure 1C). Compared with the 0.58 nmol 4-MU/min/mg protein previously reported for the Zoysia japonica PHOSPHOENOLPYRUVATE CARBOXYKINASE (PCK) promoter (Emmerling 2018), activity from the SiR promoter was at least 36% higher. Designer Transcription Activator-Like Effectors (dTALEs) and cognate Synthetic TALE-Activated Promoters (STAPs) amplify expression and allow multiple transgenes to be driven from a single promoter (Brückner et al. 2015; Danila et al. 2022).
We therefore tested whether bundle sheath expression mediated by the SiR promoter is maintained and strengthened by the dTALE-STAP system. Stable transformants showed bundle sheath specific expression (Supplemental Figure 3A, 3B), and GUS activity was ~18-fold higher than that from the endogenous SiR promoter (Supplemental Figure 3C). We conclude that the SiR promoter is compatible with the dTALE-STAP system and its activity can be strengthened. We also investigated when promoter activity was first detected during leaf development and discovered that GUS as well as fluorescence from mTurquoise2 were visible in 5-20 mm long fourth leaves at plastochron 3 (Supplemental Figure 4). This was not the case for the ZjPCK promoter even when a dTALE was used to amplify expression (Supplemental Figure 4). We conclude that the SiR promoter initiates expression in the bundle sheath before the ZjPCK promoter, and that it is also able to sustain higher levels of expression in this cell type.
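The domestication step described above (removing BsaI and BpiI sites so the sequence is compatible with Golden Gate cloning) starts with locating those sites on both strands. The following is a minimal sketch of such a scan, not the procedure used in this study. The recognition sequences GGTCTC (BsaI) and GAAGAC (BpiI) are the standard ones for these enzymes; the function names and the toy sequence are hypothetical and are not derived from the SiR promoter.

```python
# Recognition sequences of the two Type IIS enzymes named in the text.
SITES = {"BsaI": "GGTCTC", "BpiI": "GAAGAC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_type_iis_sites(seq: str) -> list[tuple[str, int, str]]:
    """Return (enzyme, 0-based start, strand) for every site on either strand."""
    seq = seq.upper()
    hits = []
    for enzyme, site in SITES.items():
        # A site on the minus strand appears as its reverse complement
        # when reading the plus strand.
        for strand, pattern in (("+", site), ("-", reverse_complement(site))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((enzyme, start, strand))
                start = seq.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical toy sequence for illustration only.
toy = "AAGGTCTCTTTGTCTTCAAGAGACCTT"
for enzyme, start, strand in find_type_iis_sites(toy):
    print(enzyme, start, strand)
```

Each reported site would then need a base change that leaves any overlapping predicted motif intact, which is a design decision this sketch does not attempt to make.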

A distal enhancer and Y-patch necessary for expression in the bundle sheath
The SiR promoter contains a highly complex cis landscape (Figure 1D) comprising at least 638 predicted motifs from 56 transcription factor families (Supplemental Table 1). We therefore designed a 5' truncation series to investigate regions necessary for expression in the bundle sheath (Figure 1E). Deleting nucleotides -2571 to -2180 and -1490 to -980 led to a statistically significant reduction and then increase in MUG activity respectively, but neither truncation abolished preferential accumulation of GUS in the bundle sheath (Figure 1E-F). However, when nucleotides -980 to -394 upstream of the predicted translational start site were removed GUS was no longer detectable in bundle sheath cells (Figure 1E-F). Consistent with this, MUG assays showed a statistically significant reduction in activity when these nucleotides were absent (Figure 1G). Thus, nucleotides spanning -980 to -394 of the SiR promoter are necessary for bundle sheath specific expression.
To test whether this region is sufficient for bundle sheath specific expression we linked it to the minimal CaMV35S core promoter. Although weak GUS signal was detected in a few veinal cells, this was not the case for the bundle sheath (Figure 1E-G). We conclude that sequence in two regions of the promoter (from -394 to +42 and from -980 to -394) interact to specify expression to the bundle sheath. To better understand this interaction we next generated unbiased 5' and 3' deletions. This second deletion series further reinforced the notion that the SiR promoter contains a complex cis-regulatory landscape. For example, when nucleotides -980 to -829 were removed very weak GUS staining was observed and the MUG assay confirmed that activity was significantly reduced to 11% that of the full-length promoter (Figure 2, Supplemental Figure 5). We conclude that nucleotides -980 to -829 from the SiR promoter are necessary for tuning expression in the leaf.
When nucleotides -829 to -700 were removed GUS appeared in mesophyll cells (Supplemental Figure 5). Truncating nucleotides -613 to -529 abolished GUS accumulation (Supplemental Figure 5). The 3' deletion that removed nucleotides -251 to +42 also stopped accumulation of GUS in both bundle sheath and mesophyll cells (Figure 2A-C, Supplemental Figure 5). Notably, when the distal region required for bundle sheath expression (-980 to -829) was combined with nucleotides -251 to +42 these two regions were sufficient for patterning to this cell type (Figure 2).
Having identified a region in the SiR promoter that was necessary and sufficient for patterning to the bundle sheath, we next used phylogenetic shadowing and yeast one hybrid analysis to better understand the cis-elements and trans-factors responsible. Analysis of cis-elements in the SiR promoter that are highly conserved in grasses identified a short region located from nucleotides -588 to -539 that contained an ETHYLENE INSENSITIVE3-LIKE 3 (EIL3) transcription factor binding site (Supplemental Figure 6A&B). Whilst deletion of this motif had no detectable effect on patterning to the bundle sheath (Supplemental Figure 6C) the level of expression was reduced (Supplemental Figure 6D). We infer that the EIL3 motif positively regulates activity of the SiR promoter but is not responsible for cell specificity. These data are consistent with the promoter truncation analysis that showed nucleotides -613 to -529 containing this motif were not required for bundle sheath specific expression, but instead function as a constitutive activator (Supplemental Figure 5). When yeast one hybrid was used to search for transcription factors capable of binding the SiR promoter, sixteen were identified (Supplemental Figure 7A, 7B). For each, cognate binding sites were present. These included TCP21 and OsOBF1, which can bind to TCP motifs and Ocs/bZIP elements respectively. Consistent with the outcome of deleting the EIL3 motif, three EIL transcription factors interacted with nucleotides -899 to -500 (Supplemental Figure 7B, 7C).
Examination of transcript abundance in mature leaves showed that most of these transcription factors were expressed in both bundle sheath and mesophyll cells (Supplemental Figure 7D) implying that combinatorial interactions with cell specific factors are likely required for bundle sheath specific expression from the SiR promoter.
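The second deletion series is easier to follow when the deleted intervals and their reported outcomes are kept side by side. The sketch below tabulates the four deletions and outcomes stated in the text (coordinates relative to the predicted translational start site) and checks which deletion contains a query region. It is a bookkeeping aid under the assumption that the reported coordinates can be compared directly as interval boundaries; the class and function names are hypothetical.

```python
from typing import NamedTuple


class Deletion(NamedTuple):
    start: int   # 5' boundary, relative to the predicted translational start site
    end: int     # 3' boundary
    outcome: str # outcome as reported in the text


# Deletions and outcomes as stated in the text (Figure 2, Supplemental Figure 5).
DELETIONS = [
    Deletion(-980, -829, "very weak GUS; MUG activity reduced to 11% of full length"),
    Deletion(-829, -700, "GUS appears in mesophyll cells"),
    Deletion(-613, -529, "GUS accumulation abolished"),
    Deletion(-251, 42, "no GUS in bundle sheath or mesophyll cells"),
]


def deletions_containing(start: int, end: int) -> list[Deletion]:
    """Return deletions whose interval fully contains the query region."""
    return [d for d in DELETIONS if d.start <= start and end <= d.end]


# The conserved region carrying the EIL3 site spans -588 to -539.
for deletion in deletions_containing(-588, -539):
    print(deletion.start, deletion.end, deletion.outcome)
```

Used this way, the table makes explicit that the conserved EIL3-containing region lies entirely within the -613 to -529 deletion, in line with the interpretation given above.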

The enhancer contains four subregions that simultaneously activate in bundle sheath and repress in mesophyll cells
The truncation analysis above identified two short regions comprising nucleotides -980 to -829 and -251 to +42 that were necessary and sufficient for expression in the rice bundle sheath (Figure 2, Supplemental Figure 8). Sequence spanning nucleotides -251 to +42 includes the annotated 5' untranslated region and likely also contains core promoter elements (Supplemental Figure 9A). Re-analysis of publicly available data identified two major transcription start sites at positions -91 (TSS1) and -41 (TSS2) (Supplemental Figure 9A). Although no canonical TATA-box was evident in this region, a TATA-box variant was detected at position -130 (5'-ATTAAA-3') (Civáň and Švec 2009) that could be responsible for transcription from TSS1. Moreover, upstream of TSS2 is a putative pyrimidine patch (Y-patch) that represents an alternate but common TC-rich core promoter motif in plant genomes (Civáň and Švec 2009) (Supplemental Figure 9A). Scanning sequence from -251 to +42 for core promoter elements also identified MTE (Motif Ten Element), BREu (TFIIB Recognition Element upstream) and DCE-S-I (Downstream Core Element S-I) motifs associated with eukaryotic core promoters (Supplemental Figure 9B). We therefore assume the region upstream of TSS1 and TSS2 contains the core promoter elements. When consecutive deletions to this sequence were made, statistically significant reductions in MUG activity were evident but there was no impact on accumulation of GUS in the bundle sheath. Interestingly, when the Y-patch was retained but the TATA-box like motif removed, low levels of GUS specific to the bundle sheath were apparent (Supplemental Figure 9C, 9D, 9E, 9F), and deletion of the Y-patch completely abolished GUS staining (Supplemental Figure 9C, 9D, 9E, 9F). Consistent with the Y-patch being important for bundle sheath expression, when core promoters from other genes (PIP1;1, NRT1.1A) containing a Y-patch were linked to the distal enhancer from SiR, bundle sheath expression was detected (Figure 3A, 3B), but this was not the case for genes with only a TATA-box (Figure 3A, 3B). GUS activity was higher from the PIP1;1 core promoter that contains more Y-patches. Overall, we conclude that the TATA-box like motif is not required for expression in the bundle sheath, but the Y-patch is necessary for this patterning and in combination with a distal enhancer comprising nucleotides -980 to -829 it is sufficient for expression in this cell type.
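Because a Y-patch is described above as a TC-rich core promoter motif, candidate patches can be located by searching for uninterrupted runs of pyrimidines. The following is a minimal sketch under stated assumptions: the minimum run length of 8 is an arbitrary illustrative threshold, not a value taken from this study or from Civáň and Švec (2009), and the function name and toy sequence are hypothetical.

```python
import re


def find_y_patches(seq: str, min_len: int = 8) -> list[tuple[int, int, str]]:
    """Return (start, end, run) for each run of C/T at least min_len long.

    Coordinates are 0-based and end-exclusive. min_len is an assumed
    threshold chosen for illustration only.
    """
    pattern = re.compile(rf"[CT]{{{min_len},}}")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence, not a real core promoter.
toy = "GATTACACTCTTCCTCTAGGATCTTCG"
for start, end, run in find_y_patches(toy):
    print(start, end, run, len(run))
```

Counting the runs and their lengths for several core promoters would give a rough way to compare them, in the spirit of the observation that the PIP1;1 core promoter contains more Y-patches.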
We assessed the distal enhancer for transcription factor binding sites. The FIMO algorithm identified motifs associated with the WRKY, G2-like, MYB-related, MADS, DOF, IDD, ARR and SNAC (Stress-responsive NAC) families. PlantPAN (Chow et al. 2018), which includes historically validated cis-elements, found an additional Dc3 Promoter Binding Factor (DPBF) binding site for group A bZIP transcription factors (Figure 3D). Seven consecutive deletions spanning this enhancer region and hereafter termed subregions a-g were generated (Figure 3D). Although veinal expression persisted when subregions a, b and d were absent, deletion of subregions a, b, d and f resulted in loss of GUS from bundle sheath cells (Figure 3E-3G). MUG analysis showed that deletion of all four regions significantly reduced promoter activity (Figure 3G). In contrast, deletions of nucleotides -938 to -923 (subregion c), -904 to -873 (subregion e), and -853 to -829 (subregion g) had no impact on the patterning (Supplemental Figure 10). The subregions necessary for expression in the bundle sheath contained unique binding sites for WRKY, G2-like, MYB-related, IDD, NAC and bZIP (DPBF) transcription factors. To examine the significance of these regions in the context of the full-length SiR promoter, consecutive deletions from subregion a to f were generated (Supplemental Figure 11A). Deletion of subregion a, d or f led to GUS accumulating primarily in mesophyll cells, whereas removal of subregion b, c or e caused GUS staining in both mesophyll cells and bundle sheath cells (Supplemental Figure 11B). No significant changes in GUS activity were observed in these deletion lines (Supplemental Figure 11C). We conclude that the distal enhancer generates expression in the bundle sheath due to four distinct subregions, and that nucleotides between -980 and -853 also function as repressors of mesophyll expression by interacting with nucleotides -829 to -251.

WRKY, G2-like, MYB-related, IDD and bZIP transcription factors activate the distal enhancer
To gain deeper insight into how the distal enhancer operates we employed multiple approaches including transactivation assays, co-expression analysis and site directed mutagenesis. The distal enhancer contained WRKY, G2-like, MYBR, IDD, SNAC and bZIP (DPBF) motifs (Figure 4A, Supplemental Figure 12). We therefore cloned rice transcription factors from each family and used them as effectors in transient assays (Supplemental Figure 13). WRKY121, GLK2, MYBS1, IDD2/3/4/6/10, and bZIP3/4/9/10/11 transcription factors led to the strongest activation of expression from the bundle sheath enhancer (Figure 4B, Supplemental Figure 14A-14D), whereas the stress-responsive NAC transcription factors targeting a SNAC motif that overlaps a bZIP (DPBF) motif activated less strongly than bZIP factors (Supplemental Figure 14E). We therefore conclude that the SNAC motif is not important for activity of the bundle sheath enhancer.
Effector assays using pairwise combinations of transcription factors showed synergistic activation from the distal enhancer when GLK2 and IDD3,4,6,10 were co-expressed (Figure 4C).
Co-expression analysis using a cell-specific leaf developmental gradient dataset revealed that GLK2, MYBS1 and IDD4,6,10 transcription factors that bind the G2-like, MYB-related and IDD motifs respectively were more abundant in mesophyll cells (Figure 4D). However, the bZIP9, IDD2 and WRKY121 transcription factors strongly correlated with SiR transcript abundance and were preferentially expressed in bundle sheath cells (Figure 4D). To test whether bZIP9, IDD2 and WRKY121 are sufficient to pattern SiR expression to specific cells, we mis-expressed the single transcription factor bZIP9, both bZIP9 and IDD2, and all three (bZIP9, IDD2 and WRKY121) in the mesophyll (Supplemental Figure 15A, 15C, 15E). Mis-expression of bZIP9 alone induced GUS expression from the bundle sheath enhancer in some mesophyll cells (Figure 4E, Supplemental Figure 15B), and mis-expression of both bZIP9 and IDD2 induced greater expression in mesophyll cells (Figure 4E, Supplemental Figure 15D). Strikingly, the expression of bZIP9, IDD2 and WRKY121 together in mesophyll cells fully activated expression in this cell type (Figure 4E, Supplemental Figure 15F). We conclude that one or two transcription factors are weakly sufficient, but all three together effectively interact with the distal enhancer in bundle sheath cells to drive SiR expression.
We next mutated WRKY, G2-like, MYB-related, IDD, and bZIP motifs. With the exception of the WRKY site that had no statistically robust effect, mutations in each of these motifs diminished or abolished enhancer activity in the bundle sheath (Figure 5A-5C).
In order to test whether the WRKY, G2-like, MYB-related, IDD, and bZIP (DPBF) sites are sufficient to pattern expression to rice bundle sheath cells we concatenated them and fused them to the core promoter of SiR (Figure 5E). GUS staining was evident in the bundle sheath (Figure 5F). Fusion to the PIP1;1 core promoter maintained bundle sheath expression and resulted in an ~5-fold increase in activity (Figure 5E&F). Oligomerisation of the enhancer by repeating it three or five times increased bundle sheath specific expression 23 or 58-fold respectively when fused to the SiR core promoter (Figure 5E Such cis-regulatory modules include enhancers and silencers that act as hubs receiving input from multiple transcription factors and so allow gene expression to respond spatially and temporally to both internal and external stimuli (Li et al. 2007; Buecker and Wysocka 2012). After testing 25 promoters, we discovered that the majority were not capable of driving expression in the rice bundle sheath, and this included ten that generated no detectable activity of GUS in leaves. In all cases we had cloned sequence between -3191 and -960 nucleotides upstream of the predicted translational start site and so these data demonstrate that the core promoter and any enhancers in these regions are not sufficient to direct expression to rice bundle sheath cells. Combined with the paucity of previously reported promoters active in this cell type (Nomura et al. 2005a; Lee et al. 2021) these data argue either for long range upstream enhancers (Studer et al. 2011; Liu et al. 2015; Li et al. 2019; Yan et al. 2019; Zhao et al. 2022) or other regulatory mechanisms being important to specify expression in the bundle sheath. Possibilities include transcription factor binding sites in introns that impact on transcription start site and strongly enhance gene expression (Rose et al. 2008; Gallegos and Rose 2019; Rose 2019), or in exons where, because such sequences specify amino acid sequence as well as binding of trans-factors, they have been termed duons (Stergachis et al. 2013). Functional analysis showed that duons can pattern expression to the bundle sheath of the C4 plant Gynandropsis gynandra (Reyna-Llorens et al. 2018), and it is notable that a genome-wide analysis of transcription factor binding sites in grasses revealed genes preferentially expressed in bundle sheath cells tended to contain transcription factor binding sites in their coding sequence (Burgess et al. 2019). It therefore appears possible that gene expression in the bundle sheath is commonly encoded by non-canonical architecture perhaps based on duons rather than more traditional enhancer elements upstream of the core promoter.
Despite the above, we discovered four promoters capable of driving expression in the rice bundle sheath, and each was associated with a gene important in sulphur metabolism. For example, ATPS, SiR and Fd all participate in the first two steps of sulphate reductive assimilation, while HAC1;1 encodes an arsenate reductase important in the detoxification of arsenate using glutathione that is a product of sulphur assimilation. Collectively, these data further support the notion that the rice bundle sheath cell is specialised in sulphur assimilation (Hua et al. 2021).

Two distinct genetic networks governing expression in bundle sheath cells
The only other promoter for which both cis-elements and trans-factors that are necessary and sufficient to pattern bundle sheath expression have been reported is from the dicotyledonous model A. thaliana. Here, a bipartite MYC-MYB module upstream of the MYB76 gene is responsible for this output (Dickinson et al. 2020). MYB76 forms part of a network governing glucosinolate biosynthesis in A. thaliana, and so it is notable that the gene regulatory network we report in rice is also associated with sulphur metabolism. However, rather than the bipartite transcription factor network that regulates bundle sheath expression in A. thaliana, in rice we report a quintet of transcription factors controlling SiR (Figure 6C&6D). The enhancer controlling bundle sheath SiR expression in rice comprises four distinct regions recognised by transcription factors belonging to the WRKY, G2-like, MYB-related, IDD and bZIP families (Figure 6D). As loss of the G2-like, MYB-related, IDD and bZIP motifs all reduced expression in the bundle sheath, this implies they act cooperatively, a notion further supported by the fact that GLK2 and IDD3,4,6,10 synergistically activated promoter output in a transient assay. It is of course possible that other motifs in the enhancer such as MADS, DOFs and ARRs act as modulators to tune the level of bundle sheath expression. In fact, single nuclei sequencing of rice and sorghum during photomorphogenesis identified DOFs as important for the evolution of C4 gene bundle sheath expression (Swift, Luginbuhl et al. 2023). For the PIP1;1 and NRT1.1A genes, whose transcripts preferentially accumulate in the bundle sheath, the core promoters were not able to generate bundle sheath expression, but they contain a Y-patch and the WRKY, G2-like, MYB-related, IDD and bZIP enhancer is present in intronic sequence (Supplemental Figure 17). It is therefore possible that this regulatory system controls their expression. Moreover, two promoters from rice (Fd and HAC1;1) and two from other species (ZjPCK and FtGLDP) that are sufficient to drive expression to the bundle sheath contain Y-patches and the cognate cis-elements for WRKY, G2-like, MYB-related, IDD and bZIP transcription factors (Supplemental Figure 17).
The distal enhancer in the SiR promoter operates in conjunction with the core promoter that contains two transcription start sites, one with an upstream TATA-box and the other a TC-rich element known as a pyrimidine (Y) patch (Supplemental Figure 9A). The TATA-box is found in metazoans and plants and allows recognition by the pre-initiation complex (Smale and Kadonaga 2003), but in plants computational analysis showed that many promoters lack a TATA-box and instead contain a Y-patch (Yamamoto et al. 2007a, 2007b; Bernard et al. 2010). These genes tend to be relatively steadily expressed and associated with protein metabolism (Bernard et al. 2010), and presence of a Y-patch can increase core promoter strength (Jores et al. 2021). For SiR, whilst the TATA-box is not required, the Y-patch is needed for expression in the bundle sheath. Notably, core promoters with a higher number of Y-patches or longer Y-patches tended to drive stronger expression, showing that in plants cell specific gene expression can be tuned by selecting different core promoters.
The regulatory network comprising the Y-patch and distal enhancer enabling bundle sheath expression of SiR is embedded within a complex cis-regulatory landscape with distinct regions encoding activating and repressing activities (Figure 6A&6B). For example, the distal enhancer (nucleotides -980 to -829) activating expression in the bundle sheath overlaps with sequence (nucleotides -900 to -700) that suppresses expression in the mesophyll. Notably, the distal enhancer is both essential for mesophyll repression and also sufficient to drive bundle sheath specific expression (Figure 6A). In addition to controlling cell specificity, this complexity likely also facilitates the tuning of expression to environmental conditions. For instance, the EIL motif (position Consistent with previous in silico analysis (Kurt et al. 2022) the presence of multiple AP2/ERF and EIL transcription factor binding sites suggests that SiR is likely subject to control from ethylene signalling (Binder 2020) and also of transcription factors that respond to ABA and jasmonic acid biosynthesis and signalling (Yaish et al. 2010; Chen et al. 2014; Jisha et al. 2015; Zhang et al. 2016, 2018). Together this implies that multiple phytohormone signalling pathways converge on the SiR promoter. These data are similar to those reported for the SHORTROOT promoter in A. thaliana roots where a complex network of activating and repressing trans-factors also tunes expression (Sparks et al. 2016). It is also notable that the architecture we report for the bundle sheath enhancer of SiR appears of similar complexity to the collective of five transcription factors used to specify cardiac mesoderm in Drosophila melanogaster and vertebrates (Junion et al. 2012). For the five transcription factors that bind the cardiac mesoderm enhancer, the order and positioning of motifs (motif grammar) is flexible. However, this is not always the case, with for example output from the human interferon-beta (IFN-β) enhancer demanding a conserved grammar (Thanos and Maniatis 1995; Panne 2008). Further work will be needed to determine if the bundle sheath enhancer reported here for rice is more similar to one of these models, or indeed, as reported for the Drosophila eve stripe 2 enhancer, operates as a billboard in different tissues to determine patterning of expression (Kulkarni and Arnosti 2003).

Using the SiR promoter to engineer the rice bundle sheath
In addition to bundle sheath cells being important for sulphur assimilation (Leegood 2008; Aubry et al. 2014b; Hua et al. 2021) they have also been implicated in nitrate assimilation, the control of leaf hydraulic conductance and solute transport (Hua et al. 2021) and the systemic response to high light (Xiong et al. 2021). Moreover, in one of the most striking examples of a cell type being repurposed for a new function, bundle sheath cells have repeatedly been rewired to allow the evolution of C4 photosynthesis (Sage 2004). To engineer these diverse processes, specific and tuneable promoters for this cell type are required. However, identification of sequences capable of driving specific expression in bundle sheath strands has previously been limited to A. thaliana and C4 species. For example, the SCARECROW (Cui et al. 2014), SCL23 (Cui et al. 2014), SULT2;2 (Kirschner et al. 2018) and MYB76 promoters (Dickinson et al. 2020) are derived from A. thaliana, whilst the Glycine Decarboxylase P-protein (GLDP) promoter is from the C4 dicotyledon Flaveria trinervia (Engelmann et al. 2008; Wiludda et al. 2012). In rice, only the C4 Zoysia japonica PCK and the C4 Flaveria trinervia GLDP promoters are known to pattern expression to the bundle sheath (Nomura et al. 2005c; Lee et al. 2021). Both are capable of conditioning expression in this cell type, but they are weak and turn on late during leaf development, and the molecular basis underpinning their ability to restrict expression to the bundle sheath has not been defined. It has therefore not been possible to rationally design or tune expression in this important cell type in rice. The architecture of the SiR promoter we report here now provides an opportunity to engineer the bundle sheath.
In summary, from analysis of the ~2600 nucleotide SiR promoter we identify an enhancer comprising 81 nucleotides that with the Y-patch is sufficient to drive expression in bundle sheath cells. Moreover, we show that output can be tuned via two approaches. First, oligomerising the distal enhancer can drastically increase expression. Second, combining it with different core promoters had a similar effect, with output correlating with the length of the Y-patch present. Our identification of a minimal promoter that drives expression in bundle sheath cells of rice now provides a tool to allow this important cell type to be manipulated. Cell specific manipulation of gene expression has many perceived advantages. For example, when constitutive promoters have been used to drive gene expression, gene silencing and reduced plant fitness due to metabolic penalties have been reported (Glick 1995; Que et al. 1997). In contrast, tissue specific promoters allow targeted gene expression either spatially or at particular developmental stages and so allow increased precision in trait engineering (Kummari et al. 2020). The SiR promoter and the bundle sheath cis-regulatory module that we identify thus provide insights into mechanisms governing cell specific expression in rice, and may also contribute to our ability to engineer and improve cereal crops.

Plant material and growth conditions
Kitaake (O. sativa ssp. japonica) was transformed using Agrobacterium tumefaciens as described previously (Hiei et al. 2008) with the following modifications. Mature seeds were sterilized with 2.5% (v/v) sodium hypochlorite for 15 min, and calli were induced on NB medium with 2 mg/L 2,4-D at 30 °C in darkness for 3-4 weeks. Actively growing calli were then co-incubated with A. tumefaciens strain LBA4404 in darkness at 25 °C for 3 days. They were then selected on NB medium supplied with 35 mg/L hygromycin B for 4 weeks, and those that proliferated were placed on NB medium with 10 mg/L hygromycin B for 4 weeks at 28 °C under continuous light. Plants resistant to hygromycin were planted in a 1:1 mixture of topsoil and sand and placed in a greenhouse at the Botanic Gardens, University of Cambridge under natural light conditions but supplemented with a minimum light intensity of 390 µmol m⁻² s⁻¹, a humidity of 60%, temperatures of 28 °C and 23 °C during the day and night respectively, and a photoperiod of 12 h light, 12 h dark. Subsequent generations were grown in a growth cabinet in 12 h light/12 h dark, at 28 °C, a relative humidity of 65%, and a photon flux density of 400 µmol m⁻² s⁻¹.

Cloning and construct preparation, and motif analysis
The 2613-bp promoter DNA fragment of SULFITE REDUCTASE (SiR, MSU7 ID: LOC_Os05g42350, RAP-DB ID: Os05g0503300) was originally amplified from Kitaake genomic DNA, with forward primer (5'-3') "CACCATGCTTGACCATGTGGACTC" and reverse primer (5'-3') "ACGGAACCCGTGGAACTC". Gel-purified PCR product was cloned into a Gateway pENTR™ vector to generate pENTR-SiRpro using the pENTR™/D-TOPO™ Cloning Kit (Invitrogen), then the promoter was recombined into the pGWB3 expression vector and fused with the GUS gene using an LR reaction. The resultant vector was transformed into A. tumefaciens strain LBA4404 and then introduced into Kitaake. To make SiRpro compatible with the Golden Gate system, four BsaI or BpiI restriction enzyme recognition sites at -214, -298, -1468, and -2309 were mutated from T to A. The mutated promoter was cloned into the pAGM9121 vector using Golden Gate level 0 cloning reactions and then into a level 0 PU module EC14328, which was used to drive the kzGUS (intronless GUS) and H2B-mTurquoise2 reporter genes via Golden Gate reaction, with Tnos as a terminator. A five prime deletion series was generated using EC14328 as the template and prepared as level 0 PU modules, and a three prime deletion series was prepared as level 0 P modules. The minimal CaMV35S promoter was used as the U module, and they were linked with kzGUS and terminated with Tnos.
To test SiRpro in the dTALE/STAP system, the 42-bp coding region was excluded and the resultant 2571-bp fragment was placed into a level 0 PU module EC14330 and used to drive dTALE1. Two reporters were used. For the GUS reporter, kzGUS was linked with STAP62 and terminated with Tnos. In the fluorescent reporter construct, a chloroplast targeting peptide fused to mTurquoise2 was linked with STAP4 and terminated with Tact2. In both constructs, pOsAct1 driving HYG (hygromycin resistance gene) was terminated with Tnos and used as the selection marker during rice transformation.
The Find Individual Motif Occurrences (FIMO) tool (Grant et al. 2011) from the Multiple Em for Motif Elucidation (MEME) suite v.5.4.0 (Bailey et al. 2015) was used to search for individual motifs within the promoter sequences using default parameters with "--thresh" of "1e-3". Position weight matrices for 656 non-redundant plant motifs and 13 RNA polymerase II (POLII) core promoter motifs were obtained from JASPAR (https://jaspar.elixir.no/downloads/) (Fornes et al. 2020). To cluster the transcription factor binding motifs, the RSAT matrix-clustering tool (Castro-Mondragon et al. 2017) was run on all 656 non-redundant plant motifs using the default parameters, which yielded 51 motif clusters. These clusters were further divided based on transcription factor families (Supplemental table 2).
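
As an illustration of the kind of motif search described above, the following minimal Python sketch scores every window of a sequence against a position weight matrix using log-odds relative to a uniform background. It is not FIMO itself: FIMO additionally converts scores into p-values, which is what the "--thresh" cutoff applies to. The toy matrix, the toy sequence and the function names are hypothetical and are not taken from JASPAR or from the SiR promoter.

```python
import math

BASES = "ACGT"

def log_odds_matrix(freqs, background=0.25, pseudocount=0.01):
    """Convert per-position base frequencies into log2-odds scores."""
    matrix = []
    for column in freqs:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def scan(sequence, matrix):
    """Return (start, window, score) for every forward-strand window."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits

# Hypothetical 4-bp motif that strongly prefers the sequence TGAC.
toy_freqs = [
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
]
toy_matrix = log_odds_matrix(toy_freqs)
toy_sequence = "AATGACGG"  # hypothetical sequence, not the SiR promoter

# Report the best-scoring window.
best_hit = max(scan(toy_sequence, toy_matrix), key=lambda hit: hit[2])
print(best_hit[0], best_hit[1])
```

In a real analysis the reverse strand would also be scanned and scores would be converted to p-values before thresholding; both steps are omitted here for brevity.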

Analysis of GUS and fluorescent reporters
In all cases, to account for position effects associated with transformation via A. tumefaciens, multiple T0 lines were assessed for each construct. GUS staining was performed as described previously (Jefferson et al. 1987) with the following minor modifications. Leaf tissue was fixed in 90% (v/v) acetone overnight at 4 °C after washing with 100 mM phosphate buffer (pH 7.0). Leaf samples were transferred into 1 mg/ml 5-bromo-4-chloro-3-indolyl glucuronide (X-Gluc) GUS staining solution, vacuum infiltrated five times for 2 min each, and then incubated at 37 °C for between 1 and 168 hours. Chlorophyll was cleared further using 90% (v/v) ethanol overnight at room temperature. Cross sections were prepared manually using a razor blade and images were taken using an Olympus BX41 light microscope. Quantification of GUS activity was performed using a fluorometric MUG assay (Jefferson et al. 1987). Approximately 200 mg of mature leaf tissue from transgenic plants was frozen in liquid nitrogen and ground into fine powder with a Tissuelyser (Qiagen). Soluble protein was extracted in 1 mL of 50 mM phosphate buffer (pH 7.0) supplemented with 0.1%

In order to visualize mTurquoise2, mature leaves were dissected into 2-cm sections, and leaf epidermal cells were removed by scraping the leaf surface with a razor blade before mounting in deionized water. 5-mm and middle sections of 2-cm young tissue of the fourth leaves were dissected and mounted in deionized water directly. Imaging was then performed using a Leica TCS SP8 confocal laser-scanning microscope with a 20x air objective. mTurquoise2 fluorescence was excited at 442 nm with emission at 471-481 nm, and chlorophyll autofluorescence was excited at 488 nm with emission at 672-692 nm.

Yeast one hybrid, protoplast isolation and transactivation assay
The yeast one-hybrid assay was performed by Hybrigenics (https://www.hybrigenicsservices.com/). Fragments were synthesized and used as bait. Rice leaf and root cDNA libraries Rice leaf protoplast isolation and PEG-mediated transformation were performed as described previously (Page et al. 2019). Golden Gate level 1 modules for transformation were isolated using the ZymoPURE™ II Plasmid Midiprep Kit, and ZmUBIpro::GUS-Tnos was used as the transformation control.
Transcription factor coding sequences were amplified from rice leaf cDNA, with BsaI and BpiI sites mutated, and cloned into Golden Gate SC level 0 modules. They were assembled into a level 1 module with a ZmUBIpro promoter and Tnos terminator module. Nucleotides -980 to -829 with the endogenous core promoter (nucleotides -250 to +42) were fused with the LUC reporter to generate a readout of transcriptional activity. In each transformation, 2 µg of transformation control plasmid, 5 µg of reporter plasmid, and 5 µg of effector plasmid per transcription factor were combined and mixed with 170 µl protoplasts. After overnight incubation on the benchtop, protein was extracted using passive lysis buffer. GUS activity was determined with 20 µl of protein sample using the MUG fluorescent assay as described above, and LUC activity was measured with 20 µl of protein sample and 100 µl of LUC assay reagent (Promega) using a CLARIOstar plate reader.

Declaration of interests
The authors declare no competing interests.
-5G), and this effect was amplified 90 and 224-fold when fused with the PIP1;1 core (Figure 5E-5G). Synthetic promoters, created by oligomerising this enhancer and combining it with core promoters that contain Y-patches, enabled fine-tuning of bundle sheath-specific expression in rice. When an oligomerised version of the enhancer was linked to the SiR core promoter and placed in A. thaliana, it generated strong expression in bundle sheath cells (Figure 5H, Supplemental Figure 16). Collectively our data indicate that transcription factors belonging to the WRKY, G2-like, MYB-related, IDD and bZIP (DPBF) families act cooperatively to decode distinct cis-elements in a distal enhancer of the SiR promoter, and that this transcription factor collective represents an ancient and highly conserved mechanism allowing bundle sheath determined by interactions between elements in the core promoter allowing basal levels of transcription (Juven-Gershon and Kadonaga 2009; Haberle and Stark 2018) with more distal cis-regulatory modules (Spitz and Furlong 2012; Shlyueva et al. 2014; Ray-Jones and Spivakov 2021).
[v/v] Triton X-100 and cOmplete™ Protease Inhibitor Cocktail (half tablet per 50 mL). Protein concentration was then determined using a Qubit protein assay kit (Invitrogen). The MUG fluorescent assay was performed in duplicate with 20 µl protein extract in MUG assay buffer (50 mM phosphate buffer (pH 7.0), 10 mM EDTA-Na2, 0.1% [v/v] Triton X-100, 0.1% [w/v] N-lauroylsarcosine sodium, 10 mM DTT, 2 mM 4-methylumbelliferyl-β-D-glucuronide (MUG)) in a 200 µl total volume. The reaction was conducted at 37 °C in a GREINER 96 F-BOTTOM microtiter plate using a CLARIOstar plate reader. 4-Methylumbelliferone (4-MU) fluorescence was recorded every 2 minutes for 20 cycles with excitation at 360 nm and emission detected at 450 nm. 4-MU concentration was determined based on a standard curve of ten 4-MU standards placed in the same plate. GUS enzymatic rates were calculated by averaging the slope of 4-MU production from each of the duplicate reactions.
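
The rate calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis script used for the study: it assumes a linear standard curve, fits it by ordinary least squares, converts each kinetic read to a 4-MU concentration, and takes the slope of concentration against time for each duplicate before averaging. All numerical values, units and function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def gus_rate(times_min, fluorescence, std_conc, std_fluor):
    """Slope of 4-MU concentration over time for one reaction."""
    # Standard curve (assumed linear): fluorescence = slope * concentration + intercept
    std_slope, std_intercept = linear_fit(std_conc, std_fluor)
    concentrations = [(f - std_intercept) / std_slope for f in fluorescence]
    rate, _ = linear_fit(times_min, concentrations)
    return rate

# Hypothetical plate data: ten 4-MU standards and two duplicate reactions
# read every 2 minutes for 20 cycles.
std_conc = [float(i) for i in range(10)]            # arbitrary concentration units
std_fluor = [100.0 + 1000.0 * c for c in std_conc]  # arbitrary fluorescence units
times_min = [2.0 * cycle for cycle in range(20)]
replicate_1 = [100.0 + 1000.0 * 0.05 * t for t in times_min]
replicate_2 = [100.0 + 1000.0 * 0.07 * t for t in times_min]

rates = [gus_rate(times_min, rep, std_conc, std_fluor)
         for rep in (replicate_1, replicate_2)]
mean_rate = sum(rates) / len(rates)
print(round(mean_rate, 4))
```

Fitting the slope over all 20 reads, rather than taking an end-point difference, makes the estimate less sensitive to noise in any single read.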

Figure legends

Figure 2. A distal enhancer and the core promoter are necessary and sufficient for bundle sheath expression. (A) Schematics showing deletions of nucleotides -980 to -849 and -119 to +42. (B) Representative images of leaf cross sections of transgenic lines after GUS staining. Zoomed-in images of lateral veins shown in right panels, the staining duration is displayed in the bottom-left corner, bundle sheath cells highlighted with dashed red line, scale bars = 50 µm. (C) Promoter activity determined by the fluorometric 4-methylumbelliferyl-β-D-glucuronide (MUG) assay, data subjected to pairwise Wilcoxon test. Lines with differences in activity that were statistically significant (adjusted P<0.05) labelled with different letters. Median catalytic rate of GUS indicated with red line, n indicates total number of transgenic lines assessed.

Figure 3. The Y-patch and four distinct regions in the distal enhancer are required for bundle sheath specific expression. (A-C) Nucleotides -980 to -829 from the SiR promoter pattern expression to the bundle sheath when linked with the PIP1;1 and NRT1.1A core promoters containing Y-patches. (A) Prediction of Y-patch and TATA-box sequences in core promoters of PIP1;1, NRT1.1A, PIP1;3 and ATPSb. (B) Representative cross sections of transgenic rice leaves after GUS staining, zoomed-in image of lateral veins shown in the right panel, bundle sheath cells highlighted with white dashed lines, the staining duration is displayed in the bottom-left corner, scale bars = 50 µm. (C) Promoter activity determined by the fluorometric 4-methylumbelliferyl-β-D-glucuronide (MUG) assay. (D) Schematics showing transcription factor binding sites between nucleotides -980 and -829. (E) Schematics showing consecutive deletions between nucleotides -980 and -829 fused to the GUS reporter. (F) Representative images of cross sections from transgenic lines after GUS staining, zoomed-in images of lateral veins shown in right panels, the staining duration is displayed in the bottom-left corner, bundle sheath cells highlighted with red dashed lines, scale bars = 50 µm. (G) Promoter activity determined by the fluorometric 4-methylumbelliferyl-β-D-glucuronide (MUG) assay. In C&G, data were subjected to pairwise Wilcoxon test with Benjamini-Hochberg correction. Lines with differences in activity that were statistically significant (adjusted P<0.05) labelled with different letters. Median catalytic rate of GUS indicated with red line, n indicates total number of transgenic lines assessed.
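
The Benjamini-Hochberg correction named in this legend can be illustrated with a short Python sketch. It shows only the p-value adjustment step, not the pairwise Wilcoxon tests that would produce the raw p-values, and it is a generic implementation of the standard procedure rather than the code used for these figures. The example p-values and the function name are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values from four pairwise comparisons.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
print(adjusted_p, significant)
```

Comparisons whose adjusted P falls below 0.05 would be assigned different letters in the plots.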

Figure 4. WRKY, G2-like, MYB-related, IDD and bZIP transcription factors interact with and activate the distal enhancer. (A) Schematics showing transcription factor binding sites between nucleotides -980 and -829 which are likely required for bundle sheath specific expression. (B) Effector assays showing that each transcription factor activates expression from the distal enhancer. (C) Effector assays showing synergistic activation from the distal enhancer when GLK2 and IDD3,4,6,10 were co-expressed. Data subjected to pairwise Wilcoxon test with Benjamini-Hochberg correction in B&C. Lines with differences in activity that were statistically significant (adjusted P<0.05) labelled with different letters. (D) Transcript abundance of transcription factors in bundle sheath strands (BSS) and mesophyll (M) cells during maturation. Leaf developmental stages S2 to S7 represent the base of the 4th leaf at the 6th, 8th, 9th, 10th, 13th and 17th day after sowing. (E) Representative images of transgenic lines misexpressing WRKY121, IDD2 and bZIP9 in mesophyll cells, staining duration is displayed in the bottom-left corner, zoom-in of mesophyll shown in right panel, red arrows indicate GUS expressing mesophyll cells.

Figure 5. Oligomerisation of bundle sheath enhancer increases bundle sheath expression. Schematics showing site-directed mutagenesis of WRKY, G2-like, MYBR, IDD and bZIP motifs, mutated nucleotides highlighted in red (A), and constructs to test impact of oligomerisation of enhancer (E). (B&F) Representative images of cross sections from transgenic lines after GUS staining, zoomed-in images of lateral veins shown in right panel, the staining duration is displayed

Figure 6. Model of mechanism underpinning bundle sheath expression from SiR promoter. (A) Schematic with location of Bundle Sheath (BS) enhancer, constitutive activator and mesophyll repressor. (B) Bundle sheath expression is a result of the enhancer, constitutive activators and mesophyll repressor acting in concert. Schematic indicating how the enhancer operates within a broader cis-regulatory landscape. (C&D) Model depicting transcription factors and cognate cis-elements responsible for bundle sheath expression.
Transcriptional activity from the promoter was calculated as LUC luminescence/rate of MUG accumulation.

L.H. and J.M.H. conceived the work. J.M.H. guided execution of experiments and oversaw the project. L.H., N.W., S.S., R.D., K.B., and A.R.B. did the experiments and analysed the data. L.H. and J.M.H. wrote the manuscript with input from all authors.