Synergistic regulation of bZIP53 and dimerizing partners results in abnormal seed phenotype in Arabidopsis: Use of a designed dominant negative protein A-ZIP53

In Arabidopsis, the basic leucine zipper (bZIP) family of transcription factors (TFs) comprises key proteins that regulate the expression of seed maturation (MAT) genes. bZIPs are functionally redundant, and their DNA-binding activity depends on the dimerization partner. Loss-of-function mutations are inadequate to understand and regulate the redundant behavior of TFs; one such example is bZIP53, a key regulator of seed maturation. Here, to examine the consequences of hindering the function of bZIP53 and its known and unknown heterodimerizing partners, transgenic Arabidopsis constitutively expressing a novel dominant negative (DN) protein, A-ZIP53, was raised. Transgenic plants showed delayed growth and a retarded seed phenotype. The in vivo inhibition of DNA binding of bZIP53, bZIP10, and bZIP25 to the G-box demonstrated the efficacy of the A-ZIP53 protein. In the first generation, the majority of plants failed to survive beyond four weeks, suggesting a pleiotropic nature of bZIP53. Plants expressing A-ZIP53 have small flowers, shorter siliques, and a small-seeded phenotype. RNA-seq analysis of the transgenic lines revealed reduced expression of target genes of bZIP53 and its heterodimerizing partners. Furthermore, immunoprecipitation followed by mass spectrometry (IP-MS/MS) of transgenic plants helped to identify additional heterodimerizing partners of A-ZIP53. The interactions were subsequently confirmed with transient transfection experiments. Unlike other gene knockout technologies, a DN protein can inhibit the function of members of the same group of bZIP TFs.


Introduction
Transgenic lines were subjected to gene expression analysis. qRT-PCR was used to analyze the expression of seven target genes (2S2, CRU, LEA76, ProDH, ASN1, CRA1, and HSD1) and a non-target gene (SHB-1) (Alonso et al. 2009; Cheng et al. 2014; Weirauch et al. 2014). Leaves of the transgenic lines from the T-1 generation and immature siliques and seeds from the T-2 generation were taken for the gene expression analysis (Figure 4A and 4B). The expression of bZIP53 increased several fold in transgenic lines compared to the wild type (Figure 4). This increase could compensate for the bZIP53 that is unavailable in transgenic plants because of heterodimerization with A-ZIP53. The expression of target genes of bZIP53, bZIP10, and bZIP25, including CRU, ASN1, CRA1, and HSD1, was downregulated in both generations (Figure 4A and 4B). The expression of seed storage albumin (2S2) and late embryogenesis abundant 76 (LEA76) was not observed in the T-1 generation (Figure 4A), whereas both genes were downregulated in the T-2 generation (Figure 4B). The expression of ProDH, a direct target of bZIP53 (Weltmeier et al. 2006), was higher in the T-1 generation but lower in the T-2 generation (Figure 4A and 4B). Thus, it could be inferred that the several-fold higher expression of bZIP53 might overcome the inhibitory effect of A-ZIP53 and lead to the higher expression of ProDH. Previously, we demonstrated the specificity of A-ZIP53 and showed that it does not heterodimerize with bZIP39 and bZIP72 in vitro (Jain et al. 2017). The higher expression of SHB-1, a direct target of bZIP39, confirmed the specificity of A-ZIP53. This shows the specificity and efficacy of the DN protein in regulating the redundant behavior of target bZIPs.

Varied reproductive phase parameters of transgenics
To investigate the effect of A-ZIP53 on the reproductive phase of plants, A-ZIP53-expressing plants were studied together with the mutants of bZIP53, bZIP10, and bZIP25 and wild-type Arabidopsis (Supplementary Figure S2). bZIP10 and bZIP25 are reported to be involved in the early stages of seed development (Lara et al. 2003), while bZIP53 is reported to act in the later stages of seed maturation (Alonso et al. 2009). A transgenic line with lower expression of A-ZIP53 was used for further analysis. Initially, this transgenic line showed delayed growth like its predecessor and had a smaller rosette diameter than the mutants of bZIP53, bZIP10, and bZIP25 and the wild type (Supplementary Figure S2F), but later no significant differences were observed in plant height, growth, and leaves compared to mutants and wild-type plants. Immature siliques and seeds of the transgenic line were subjected to qRT-PCR, which showed lower expression of genes involved in seed development and maturation (Supplementary Figure S2G). The expression of bZIP53 was many fold higher than in the wild type. Significantly higher expression of HSD1 and bZIP39 was also observed in the A-ZIP53-expressing transgenic lines.
Transgenic plants were analyzed for phenotypic variation, including developed flowers, siliques, and mature seeds, compared to mutants of bZIP53, bZIP10, and bZIP25 and wild-type Arabidopsis. Transgenic plants had reduced flower size compared to the wild type and the bZIP mutants, while flower development was like that of the wild type (Figure 5A). Significant differences were observed between flowers of the transgenic plants and those of the bzip10 and bzip25 mutants, but no significant difference was seen compared to the bzip53 mutant (Figure 5). The diameter of the transgenic flower was significantly smaller than that of the wild type and the bzip10 and bzip25 mutants (Figure 5B). Furthermore, the transgenic plants had smaller siliques than the wild type and mutants (Figure 6A and 6B), and the number of siliques per 0.5 g of weight was higher than in the wild type (Figure 6C). Seeds of the transgenic plants were small and shriveled (Figure 7A and Figure
The assembled data were functionally categorized using the agriGO gene ontology (GO) tool (Figure 7). A total of 20,764 unique transcripts, with an FDR of 0.05, were examined with the GO tool. The knockdown of bZIP53 and its heterodimerizing partners had a profound effect on the expression of corresponding genes. Downregulated transcripts were categorized into GO terms that participate in biological, cellular, and molecular functions (Supplementary Table S4). Most of the downregulated GO terms were related to genes involved in gamete formation, seed development, seed maturation, seed storage protein synthesis, reproduction, and other biological processes. Of the genes identified, 48.8% were involved in biological, 20.8% in cellular, and 32.2% in molecular functions (Figure 9). Genes related to developmental process, hormone metabolism, and DNA-binding transcription factor activities were downregulated (Figure 9), which suggests a role for A-ZIP53 in the development-related pathway.
The lower expression of genes related to the gamete and seed development pathway, including the late embryogenesis abundant (LEA) protein family, ECA1 gametogenesis-related family protein, maternally expressed family protein, seed storage 2S albumin superfamily protein, and others, substantiates the effect of A-ZIP53 on the target genes. Highlighting the redundant behaviour of TFs, genes involved in stress, including salt stress, were also found to be differentially expressed in A-ZIP53-expressing transgenic plants.

Decoding of target genes and regulatory network
Data generated from this study have helped to identify the putative target genes of bZIP53 and its dimeric partners involved in regulating seed storage protein, gamete development, transcription factors, and both biotic and abiotic stresses.
The DN protein A-ZIP53 has served to capture different proteins, which might act synergistically in different biological processes, such as the correlation between hormone signaling and TF binding. One such example is the bHLH-MYB complex in jasmonic acid-mediated stamen development and seed production (Qi et al. 2015). Transcriptome data revealed the higher expression of bHLH and MYB genes, which might be to balance the hindrance in function of other bZIPs are also reported to be involved in seed development and maturation, including bZIP39 (Bensmihen et al. 2005; Cheng et al. 2014; Dekkers et al. 2016). bZIP39 is also involved in floral transition (Wang et al. 2013), which signifies functional redundancy like that of bZIP53 (Alonso et al. 2009; Dietrich et al. 2011; Hartmann et al. 2015). The functional redundancy of bZIPs depends on the selection of different dimerizing partners.
Our findings showed that A-ZIP53 can form heterotypic interactions with bZIP53, bZIP10, and bZIP25 in vitro and in vivo (Jain et al. 2018; Jain et al. 2017). To identify other heterodimerizing partners of A-ZIP53, whole protein extracts from immature siliques, immature seeds, and leaves were subjected to immunoprecipitation followed by mass spectrometry (IP-nano LC-MS/MS; Materials and Methods). bZIP proteins that were identified in more than one sample with at least one proteotypic peptide were considered high-confidence candidates. Eight bZIP TFs (bZIP14, bZIP17, bZIP19, bZIP23, bZIP29, bZIP33, bZIP34, and bZIP69) were found in the study, which could be the interacting partners of bZIP53 or A-ZIP53 (Table I).
The similarity between the bZIPs (bZIP29, bZIP33, and bZIP67) precipitated from both samples confirms the efficacy of A-ZIP53. In addition, to confirm the dimeric specificity of A-ZIP53 with target bZIPs, the total protein extract of wild-type Arabidopsis was incubated with pure A-ZIP53 protein, followed by IP-MS. The annotated peptides were related to bZIP33, bZIP29, that belongs to class S1 bZIP TF (Jakoby et al. 2002). Earlier reports confirmed the putative heterotypic interaction between bZIPs of group S1 and group C, namely bZIP9, bZIP10, bZIP25, bZIP63, and others, using yeast two-hybrid and in vitro DNA binding assays; these bZIPs have a prominent role in growth and development (Alonso et al. 2009; Dietrich et al. 2011; Dröge-Laser et al. 2018; Ehlert et al. 2006; Hartmann et al. 2015; Kang et al. 2010; Weltmeier et al. 2006). The network of class C/S1 bZIPs is less disordered and has heptads with a high helical tendency that favors their dimerization (Deppmann et al. 2006; Jakoby et al. 2002). Further, heterodimers of class C/S1 bZIPs have fewer stabilizing forces, resulting in weaker stability (Llorca et al. 2014). This prompted us to employ the DN protein A-ZIP53, which efficiently and stably forms a heterotypic interaction with the class C/S1 bZIP TFs. The designed acidic extension prolongs the dimerization interface into the DNA-binding region and provides two orders of magnitude higher stability to the A-ZIP53|bZIP heterodimer complex (Jain et al. 2017). An excess of A-ZIP53 extends its specific dimerization tendency to target bZIPs, which are less abundant. This makes A-ZIP53 an effective competitor in a stoichiometric environment.
It was shown previously that O2, bZIP10, and bZIP25, which belong to the group C bZIPs, interact with ABI3 during seed maturation and that bZIP53 enhanced the activation of the heterodimer complex in transient transfection assays (Alonso et al. 2009; Ehlert et al. 
2006; Schmidt et al. 1990; transient transfections, which showed the efficacy of DNs to overcome biological redundancy (Satoh et al. 2004; Weltmeier et al. 2006) and stresses (Dietrich et al. 2011; Hartmann et al. 2015).
The redundant behavior of bZIP TFs is due to their interacting partners, and A-ZIP53 has an edge in deciphering them. It is reported that bZIPs have strong transactivation properties, and the heterotypic interaction with a promiscuous DN like A-ZIP53 may deter their potential binding to cognate DNA binding sites. These DNA binding sites can play an active role for the cooperative new-horizons about the molecular mechanism governed by bZIP TFs.
Immunoprecipitation followed by mass spectrometry has confirmed the heterotypic interaction of A-ZIP53 with target bZIPs and its dimeric interacting partners (Table I)
The prepared MMG solution can be stored at room temperature (Yoo et al. 2007).
To confirm the heterodimeric interaction between A-ZIP53 and target proteins, constructs of the effector, reporter, and normalization vectors were transformed and assayed for GUS-NAN activity as described earlier (Alonso et al. 2009; Jain et al. 2017).

RNA extraction and Illumina sequencing
Total RNA was extracted from wild-type and A-ZIP53-expressing Arabidopsis using the ZR Plant RNA MiniPrep kit (Zymo Research) as per the manufacturer's instructions. The quality and quantity of RNA were checked on a 1% denaturing RNA agarose gel and a NanoDrop/Qubit fluorometer, respectively. The RNA-seq paired-end sequencing libraries were prepared from the QC-passed RNA samples using the Illumina TruSeq stranded mRNA sample preparation kit. Briefly, mRNA was enriched from the total RNA using poly-T attached magnetic beads, followed by enzymatic fragmentation and first-strand cDNA conversion using SuperScript II and Act-D mix to facilitate RNA-dependent synthesis. The first-strand cDNA was then converted to second strand using second strand mix. The ds cDNA was then purified using AMPure XP beads, followed by A-tailing, adapter ligation, and enrichment by a limited number of PCR cycles.

Cluster generation and Sequencing
After obtaining the Qubit concentration for the libraries and the mean peak size from the Agilent TapeStation profile, the PE Illumina libraries were loaded onto a NextSeq 500 for cluster generation and sequencing. Paired-end sequencing allows the template fragments to be sequenced in both the forward and reverse directions on the NextSeq 500. The adaptors were designed to allow selective cleavage of the forward strand after re-synthesis of the reverse strand during sequencing. The copied reverse strand is then used to sequence from the opposite end of the fragment.

RNA seq analysis
Adaptor trimming and quality trimming of the samples (wild type and three biological replicates of the A-ZIP53-expressing transgenic) were performed using Trimmomatic-0.35. The sequenced raw data were processed to obtain high-quality clean reads by removing adaptor sequences, ambiguous reads (reads with more than 5% unknown nucleotides, "N"), and low-quality sequences (reads in which more than 10% of bases have a quality value (QV) below 20 on the Phred scale). A minimum length of 50 nucleotides (after trimming) was applied. The resulting high-quality paired-end reads were used for reference-based read mapping. They were mapped onto the reference genome of Arabidopsis thaliana using TopHat v2.1.1 with the default parameters.
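The read-level criteria stated above can be illustrated with a minimal Python sketch. It is not Trimmomatic's algorithm or command line; it only restates the three thresholds from the text (N content, fraction of bases below Q20, minimum length) as a filter on an already trimmed read. The function name and parameter names are hypothetical, and quality values are assumed to be already decoded Phred integers.

```python
def passes_quality_filter(seq, quals, max_n_frac=0.05, min_q=20,
                          max_lowq_frac=0.10, min_len=50):
    """Return True if a trimmed read meets the thresholds given in the text.

    seq   -- nucleotide string of the trimmed read
    quals -- list of Phred quality integers, one per base (assumed decoded)
    """
    # Reads shorter than the minimum length (or malformed) are discarded.
    if len(seq) < min_len or len(seq) != len(quals):
        return False
    # Fraction of ambiguous bases ("N") in the read.
    n_frac = seq.upper().count("N") / len(seq)
    # Fraction of bases whose quality value is below the Phred cutoff.
    lowq_frac = sum(q < min_q for q in quals) / len(quals)
    return n_frac <= max_n_frac and lowq_frac <= max_lowq_frac


if __name__ == "__main__":
    good = passes_quality_filter("ACGT" * 15, [30] * 60)
    short = passes_quality_filter("ACGT" * 10, [30] * 40)
    print(good, short)
```

In practice the trimming tool applies these steps itself; the sketch is only meant to make the stated cutoffs explicit.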

Gene ontology (GO) and differential gene expression (DGE) analysis
The DGE analysis was carried out using Cuffdiff v1.3.0. Fold change (FC) values greater than zero were considered upregulated, whereas values less than zero were considered downregulated. A P-value threshold of 0.05 was used to filter statistically significant results. For GO analysis, Singular Enrichment Analysis (SEA) of agriGO (http://bioinfo.cau.edu.cn/agriGO/analysis.php) was used. Hypergeometric tests with Hochberg FDRs (false discovery rates) were performed using the default parameters, with an adjusted P-value <0.05 for obtaining significant GO terms.

Quantitative real-time PCR (qRT-PCR)
The differential gene expression was validated by qRT-PCR. Total RNA (2 μg) was isolated (Spectrum Plant Total RNA kit) from the leaves (T1 generation) and immature siliques (T2 and T3 generations) of wild-type and A-ZIP53-expressing Arabidopsis. Contaminating genomic DNA was removed with the Turbo DNA-free kit (Invitrogen, ThermoFisher, USA). cDNA was then synthesized (Invitrogen Superscript® III Reverse Transcriptase) as per the manufacturer's protocol. Gene-specific primers were used for qRT-PCR. The template concentration was 10-15 ng, while the concentration of forward and reverse primers was 10 ng in SYBR Select Master Mix (ABI). The PCR was performed using the ABI 7700 sequence detector (Applied Biosystems, USA) as per the manufacturer's instructions. Two biological and three technical replicates were used for each experiment. Statistical analysis was done using Origin 6.1.
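The DGE calling rule described above (sign of the fold change plus a P-value threshold of 0.05) can be sketched in a few lines of Python. This is an illustration only and not the Cuffdiff implementation. It assumes the fold-change values are on a log2 scale, since the text uses zero as the cutoff, and it treats the threshold as strict (P < 0.05). The function and argument names are hypothetical.

```python
def classify_gene(log2_fc, p_value, alpha=0.05):
    """Label a gene from its log2 fold change and P-value.

    Assumptions (not stated explicitly in the text): fold change is
    log2-transformed, and significance requires p_value < alpha.
    """
    if p_value >= alpha:
        return "not significant"
    if log2_fc > 0:
        return "upregulated"
    if log2_fc < 0:
        return "downregulated"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    examples = {"geneA": (1.8, 0.001), "geneB": (-2.3, 0.02), "geneC": (0.4, 0.30)}
    for name, (fc, p) in examples.items():
        print(name, classify_gene(fc, p))
```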

Co-Immunoprecipitation of proteins in Transgenic Arabidopsis
A pull-down assay was used to extract the target proteins from the mixture (cell lysate). For Co-IP of A-ZIP53 from the protein extract of transgenic plants, antibodies targeting the T7 tag were mixed with protein lysate in the protein extraction buffer. The mixture was kept at 4 °C for 2 hours. The protein extract with T7 tag antibodies was incubated with protein A-containing magnetic beads (Dynabeads, Thermo Fisher) for 2 hours at 4 °C. Beads with immobilized antibodies were collected on a magnetic separation rack (NEB). Beads were washed with protein extraction buffer for 1 minute, the washed beads were collected with magnets, and the washing was repeated.
Technologies LLC., USA) before peptide sequencing. The eluents used were eluent A, degassed ionized water with 0.1% (v/v) formic acid, and eluent B, 100% acetonitrile (containing 0.1% (v/v) formic acid), with a linear gradient of 5-95% for 120 min at a flow rate of 300 nl/min. After separation, peptides were subjected to tandem mass (MS/MS) analysis. The Nano-LC was directly coupled to a TripleTOF 5600 mass spectrometer, which was operated in information-dependent acquisition (IDA) mode. For IDA, one full scan (m/z 350-1250) was followed by 8 MS/MS scans, and the electrospray voltage was set to 2500 V.

Data interpretation
Raw spectra for peptide identification were interpreted using ProteinPilot 4.0 (AB Sciex). The peptide spectra were searched against the Arabidopsis thaliana entries of the UniProt database under the following parameters: the peptide tolerance was set to 1 Da and the MS/MS tolerance to 0.8 Da. Trypsin was selected as the protease.

Table I
29, 14, 33, 17, 33, 69, 10, 25, and 53
67, 33, 17, 29
33, 34, 19, 17, 29, PAN, and unfertilized Sac4

Supplementary Tables
Table 1 DNA binding sites of target bZIPs on their corresponding genes.
Table 2 Primer sequences for the qRT-PCR.
Table 3 Primer sequence for the cloning of A-ZIP53 into the pRI101AN
Table 4 Comparisons of length and width of seeds
Table 5 Dry weight of mature seed (25 seeds)
Table 6 Differences in the flower size of wildtype, mutant (bZIP53, bzip25, and bzip10) and transgenic
Table 7 Differences in the length and width of siliques of wildtype, mutants (bZIP53, bzip25, and bzip10), and transgenic.
Table 8 Genes downregulated in A-ZIP53 expressing transgenic plants

The delineation of the N-terminal basic DNA-binding region followed by the dimerizing leucine zipper region. Amino acid sequences represented by the single-letter code are aligned with respect to an invariant asparagine (N) and arginine (R) (shown in bold) in the basic region. Only ten amino acids upstream of the asparagine are shown. The tenth amino acid (typically a leucine; L o ) from the invariant arginine in the basic DNA-binding region marks the start of the dimerizing leucine zipper. The leucine zipper sequence is grouped into heptads (a, b, c, d, e, f, g) n=8 . The limit of a coiled coil at the C-terminus is defined by the presence of a proline or two consecutive glycines, both likely helix-breaking residues, and the absence of charged amino acids in the g and e' positions in a heptad. In the homodimer coiled coil, interhelical interactions between amino acids in the g position and those in the following e' position are shown as square brackets. 
Solid square brackets depict attractive interactions between amino acids with opposite charges in g and e' positions (E↔K, K↔E, D↔R) whereas interhelical repulsive interactions between g and e' position amino acids are shown by discontinuous square brackets (K↔R). In the putative heterodimer coiled coil (bZIP53+bZIP10, bZIP53+bZIP25 and bZIP25+bZIP10) attractive interactions between amino acids at g and e' position are shown by solid diagonal lines (E↔K, R↔E, R↔D, K↔D) and repulsive interactions are depicted by discontinuous lines (K↔K,K↔R). B) The number of attractive and repulsive g↔e' salt bridges formed in three homodimers and three putative heterodimers.
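The counting of attractive and repulsive g↔e' interactions described in this legend can be sketched as follows. This is a minimal Python illustration, not the authors' procedure: it assumes that a pair is attractive when one residue is acidic (D, E) and the other basic (K, R), and repulsive when both carry the same charge. The legend lists only basic-basic repulsions explicitly, so treating acidic-acidic pairs as repulsive is an assumption. The function names and the example pairs are hypothetical.

```python
ACIDIC = {"D", "E"}   # negatively charged residues
BASIC = {"K", "R"}    # positively charged residues


def classify_pair(g, e_prime):
    """Classify one g<->e' residue pair by charge."""
    g, e_prime = g.upper(), e_prime.upper()
    if (g in ACIDIC and e_prime in BASIC) or (g in BASIC and e_prime in ACIDIC):
        return "attractive"
    # Same-charge pairs; acidic-acidic repulsion is an assumption here.
    if (g in ACIDIC and e_prime in ACIDIC) or (g in BASIC and e_prime in BASIC):
        return "repulsive"
    return "neutral"


def count_salt_bridges(pairs):
    """Return (attractive, repulsive) counts for a list of (g, e') pairs."""
    labels = [classify_pair(g, e) for g, e in pairs]
    return labels.count("attractive"), labels.count("repulsive")


if __name__ == "__main__":
    # Hypothetical g/e' pairs, not taken from the bZIP53 sequence.
    demo = [("E", "K"), ("K", "R"), ("R", "D"), ("L", "K")]
    print(count_salt_bridges(demo))
```

For a heterodimer, the same function would be applied to the g residues of one helix paired with the e' residues of the other, and then in the reverse direction.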