Jurassic NLR: conserved and dynamic evolutionary features of the atypically ancient immune receptor ZAR1

In plants, NLR immune receptors generally exhibit hallmarks of rapid evolution even at the intraspecific level. We used iterative sequence similarity searches coupled with phylogenetic analyses to reconstruct the evolutionary history of ZAR1, an atypically conserved NLR that traces its origin to early flowering plant lineages ∼220 to 150 million years ago (Jurassic period). We discovered 120 ZAR1 orthologs in 88 species, including the monocot Colacasia esculenta, the magnoliid Cinnamomum micranthum and the majority of eudicots, notably the early diverging eudicot species Aquilegia coerulea. Ortholog sequence analyses revealed highly conserved features of ZAR1, including regions for pathogen effector recognition, intramolecular interactions and cell death activation. We functionally reconstructed the cell death activity of ZAR1 and its partner receptor-like cytoplasmic kinase (RLCK) from distantly related plant species, experimentally validating the hypothesis that ZAR1 has evolved to be a partner with RLCKs early in its evolution. In addition, ZAR1 acquired novel features, such as a C-terminal integration of a thioredoxin-like domain. ZAR1 duplicated into two paralog families, which underwent distinct evolutionary paths. We conclude that ZAR1 stands out among angiosperm NLRs for having experienced relatively limited gene duplication and expansion throughout its deep evolutionary history. Nonetheless, ZAR1 did also give rise to non-canonical NLR proteins with integrated domains and degenerated molecular features.


INTRODUCTION
Plants immune receptors, often encoded by disease resistance (R) genes, detect invading pathogens and activate innate immune responses that can limit infection (Jones and Dangl, 2006). A major class of immune receptors is formed by intracellular proteins of the nucleotide-binding leucine-rich repeat (NLR) family (Dodds and Rathjen, 2010;Jones et al., 2016;Kourelis and van der Hoorn, 2018). NLRs detect host-translocated pathogen effectors either by directly binding them or indirectly via host proteins known as guardees or decoys.
NLRs are arguably the most diverse protein family in flowering plants (angiosperms) with many species having large (>100) and diverse repertoires of NLRs in their genomes (Shao et al., 2016;Baggs et al., 2017;Kourelis et al., 2021). They typically exhibit hallmarks of rapid evolution even at the intraspecific level ( Van de Weyer et al., 2019;Lee and Chae, 2020;Prigozhin and Krasileva, 2020). Towards the end of the 20 th century, Michelmore and Meyers (1998) proposed that NLRs evolve primarily through the birth-and-death process. In this model, new NLRs emerge by recurrent cycles of gene duplication and loss-some genes are maintained in the genome acquiring new pathogen detection specificities, whereas others are deleted or become non-functional through the accumulation of deleterious mutations. Such dynamic patterns of evolution enable the NLR immune system to keep up with fast-evolving effector repertoires of pathogenic microbes. However, as already noted over 20 years ago by Michelmore and Meyers (1998), a subset of NLR proteins are slow evolving and have remained fairly conserved throughout evolutionary time (Wu et al., 2017;Stam et al., 2019). These "high-fidelity" NLRs (per Lee and Chae, 2020) offer unique opportunities for comparative analyses, providing a molecular evolution framework to reconstruct key transitions and reveal functionally critical biochemical features (Delaux et al., 2019). Nonetheless, comprehensive evolutionary reconstructions of conserved NLR proteins remain limited despite the availability of a large number of plant genomes across the breadth of plant phylogeny. One of the reasons is that the great majority of NLRs lack clear-cut orthologs across divergent plant taxa. Here, we address this gap in knowledge by investigating the macroevolution of ZAR1 (HOPZ-ACTIVATED RESISTANCE1), an atypically ancient NLR, and asking fundamental questions about the conservation and diversification of this immune receptor throughout its deep evolutionary history.
NLRs occur across all kingdoms of life and generally function in non-self perception and innate immunity Uehling et al., 2017). In the broadest biochemical definition, plant NLRs share a multidomain architecture typically consisting of a NB-ARC (nucleotide-binding domain shared with APAF-1, various R-proteins and CED-4) followed by a leucine-rich repeat (LRR) domain. Angiosperm NLRs form several major monophyletic groups with distinct N-terminal domain fusions (Shao et al., 2016;Kourelis et al., 2021).
These include the subclades TIR-NLR with the Toll/interleukin-1 receptor (TIR) domain, CC-NLR with the Rx-type coiled-coil (CC) domain, CC R -NLR with the RPW8-type CC (CC R ) domain (Tamborski and Krasileva, 2020) and the more recently defined CC G10 -NLR with a distinct type of CC (CC G10 ) . Up to 10% of NLRs carry unconventional "integrated" domains in addition to the canonical tripartite domain architecture. Integrated domains are thought to generally function as decoys to bait pathogen effectors and enable pathogen detection (Cesari et al., 2014;Sarris et al., 2016;Wu et al., 2015;Kourelis and van der Hoorn, 2018). They include dozens of different modules indicating that novel domain acquisitions have repeatedly taken place throughout the evolution of plant NLRs (Sarris et al., 2016;Kroj et al., 2016). To date, over 400 NLRs from 31 genera in 11 orders of flowering plants have been experimentally validated as reported in the RefPlantNLR reference dataset (Kourelis et al., 2021). Several of these NLRs are coded by R genes that function against economically important pathogens and contribute to sustainable agriculture (Dangl et al., 2013).
In recent years, the research community has gained a better understanding of the structure/function relationships of plant NLRs and the immune receptor circuitry they form (Wu et al., 2018;Adachi et al., 2019a;Burdett et al., 2019;Jubic et al., 2019;Bayless and Nishimura, 2020;Feehan et al., 2020;Mermigka et al., 2020;Wang and Chai, 2020;Xiong et al., 2020;Zhou and Zhang, 2020). Some NLRs, such as ZAR1, form a single functional unit that carries both pathogen sensing and immune signalling activities in a single protein (termed 'singleton NLR' per Adachi et al., 2019a). Other NLRs function together in pairs or more complex networks, where connected NLRs have functionally specialized into sensor NLRs dedicated to pathogen detection or helper NLRs that are required for sensor NLRs to initiate immune signalling (Feehan et al., 2020). Paired and networked NLRs are thought to have evolved from multifunctional ancestral receptors through asymmetrical evolution (Adachi et al., 2019a(Adachi et al., , 2019b. As a result of their direct coevolution with pathogens, NLR sensors tend to diversify faster than helpers and can be dramatically expanded in some plant taxa (Wu et al., 2017;Stam et al., 2019). For instance, sensor NLRs often exhibit non-canonical biochemical features, such as degenerated functional motifs and unconventional domain integrations (Adachi et al., 2019b;Seong et al., 2020).
The elucidation of plant NLR structures by cryo-electron microscopy has significantly advanced our understanding of the biochemical events associated with the activation of these immune receptors (Wang et al., 2019a;Ma et al., 2020;Martin et al., 2020).
The CC-NLR ZAR1, the TIR-NLRs RPP1 and Roq1 oligomerize upon activation into a multimeric complex known as the resistosome. In the case of ZAR1, recognition of bacterial effectors occurs through its partner receptor-like cytoplasmic kinases (RLCKs), which occur in a genomic cluster of multiple RLCK-type pseudokinases that vary depending on the pathogen effector and host plant (Lewis et al., 2013;Wang et al., 2015;Seto et al., 2017;Schultink et al. 2019;Laflamme et al., 2020). Activation of ZAR1 induces conformational changes in the nucleotide binding domain resulting in ADP release, dATP/ATP binding and pentamerization of the ZAR1-RLCK complex into the resistosome. The ZAR1 resistosome exposes a funnel-shaped structure formed by the N-terminal α1 helices, which translocates into the plasma membrane, and the resistosome itself acts as a Ca 2+ channel (Wang et al., 2019b;Bi et al., 2021). The ZAR1 N-terminal α1 helix matches the MADA consensus sequence motif that is functionally conserved in ~20% of CC-NLRs including NLRs from dicot and monocot plant species (Adachi et al., 2019b). This suggests that the biochemical 'death switch' mechanism of the ZAR1 resistosome may apply to a significant fraction of CC-NLRs. Interestingly, unlike singleton and helper CC-NLRs, sensor CC-NLRs often carry degenerated MADA α1 helix motifs and/or N-terminal domain integrations, which would preclude their capacity to trigger cell death according to the ZAR1 model (Adachi et al., 2019b;Seong et al., 2020).
Comparative sequence analyses based on a robust evolutionary framework can yield insights into molecular mechanisms and help generate experimentally testable hypotheses. ZAR1 was previously reported to be conserved across multiple dicot plant species but whether it occurs in other angiosperms hasn't been systematically studied (Baudin et al. 2017;Schultink et al. 2019;Harant et al. 2021). Here, we used a phylogenomic approach to investigate the molecular evolution of ZAR1 across flowering plants (angiosperms). We discovered 120 ZAR1 orthologs in 88 species, including monocot, magnoliid and eudicot species indicating that ZAR1 is an atypically conserved NLR that traces its origin to early angiosperm lineages ~220 to 150 million years ago (Jurassic period). We took advantage of this large collection of orthologs to identify highly conserved features of ZAR1, revealing regions for effector recognition, intramolecular interactions and cell death activation. We showed that the cell death activity of ZAR1 from distantly related plant species can be dependent of its partner RLCKs, therefore experimentally validating the hypothesis that ZAR1 has evolved to be a partner with RLCKs early in its evolution. Throughout its evolution, ZAR1 also acquired novel features, including the C-terminal integration of a thioredoxin-like domain and duplication into two paralog families ZAR1-SUB and ZAR1-CIN. Members of the ZAR1-SUB paralog family have highly diversified in eudicots and often lack conserved ZAR1 features. We conclude that ZAR1 has experienced relatively limited gene duplication and expansion throughout its deep evolutionary history, but still did give rise to non-canonical NLR proteins with integrated domains and degenerated molecular features.

ZAR1 is the most widely conserved CC-NLR across angiosperms
To determine the distribution of ZAR1 across plant species, we applied a computational pipeline based on iterated BLAST searches of plant genome and protein databases ( Figure   1A). These comprehensive searches were seeded with previously identified ZAR1 sequences from Arabidopsis, N. benthamiana, tomato, sugar beet and cassava (Baudin et al. 2017;Schultink et al. 2019;Harant et al. 2021). We also performed iterated phylogenetic analyses using the NB-ARC domain of the harvested ZAR1-like sequences, and obtained a well-supported clade that includes the previously reported ZAR1 sequences, as well as new clade members from more distantly related plant species, notably Colacasia esculenta (taro, Alismatales), Cinnamomum micranthum (Syn. C. kanehirae, stout camphor, Magnoliidae) and Aquilegia coerulea (columbine, Ranunculales) (Supplemental Data Set 1). In total, we identified 120 ZAR1 from 88 angiosperm species that tightly clustered in the ZAR1 phylogenetic clade ( Figure 1B, Supplemental Data Set 1). Among the 120 genes, 108 code for canonical CC-NLR proteins with 52.0 to 97.0% similarity to Arabidopsis ZAR1, whereas another 9 carry the three major domains of CC-NLR proteins but have a C-terminal integrated domain (ZAR1-ID, see below). The remaining 3 genes code for two truncated NLRs and a potentially mis-annotated coding sequence due to a gap in the genome sequence. In summary, we propose that the identified clade consists of ZAR1 orthologs from a diversity of angiosperm species. Our analyses of ZAR1-like sequences also revealed two well-supported sister clades of the ZAR1 ortholog clade ( Figure 1B). We named these subclades ZAR1-SUB and ZAR1-CIN and we describe them in more details below.
We have recently proposed that ZAR1 is the most conserved CC-NLR between rosid and asterid plants (Harant et al. 2021). To further evaluate ZAR1 conservation relative to other CC-NLRs across angiosperms, we used a phylogenetic tree of 1475 NLRs from the monocot taro, the magnoliid stout camphor and 6 eudicot species (columbine, Arabidopsis, cassava, sugar beet, tomato, N. benthamiana) to calculate the phylogenetic (patristic) distance between each of the 49 Arabidopsis CC-NLRs and their closest neighbor from each of the other plant species. We found that ZAR1 stands out for having the shortest phylogenetic distance to its orthologs relative to other CC-NLRs in this diverse angiosperm species set (Supplemental Figure 1). A similar analysis where we plotted the phylogenetic distance between each of the 159 N. benthamiana CC-NLRs to their closest neighbor from the other species also revealed ZAR1 as displaying the shortest patristic distance across all examined species (Supplemental Figure 2). These analyses revealed that ZAR1 is possibly the most widely conserved CC-NLR in flowering plants (angiosperms).

Phylogenetic distribution of ZAR1 in angiosperms
Although ZAR1 is distributed across a wide range of angiosperms, we noted particular patterns in its phylogenetic distribution. Supplemental Data Set 1 describes the gene identifiers and other features of ZAR1 orthologs sorted based on the phylogenetic clades reported by Smith and Brown (2018). 68 of the 88 plant species have a single-copy of ZAR1 whereas 20 species have two or more copies (Supplemental Data Set 2). ZAR1 is primarily a eudicot gene but we identified three ZAR1 orthologs outside the eudicots, two in the monocot taro and another one in the magnoliid stout camphor. We failed to detect ZAR1 orthologs in 39 species among the 127 species we examined (Supplemental Data Set 1). Except for taro, ZAR1 is missing in monocot species (17 examined), including in the well-studied Hordeum vulgare (barley), Oryza sativa (rice), Triticum aestivum (wheat) and Zea mays (maize). ZAR1 is also missing in all examined species of the eudicot Fabales, Cucurbitales, Apiales and Asterales. However, we found a ZAR1 ortholog in the early diverging eudicot columbine and ZAR1 is widespread in other eudicots, including in 63 rosid, 4 Caryophyllales and 18 asterid species.

ZAR1 is an ancient Jurassic gene that predates the split between monocots, magnoliids and eudicots
The overall conservation of the 120 ZAR1 orthologs enabled us to perform phylogenetic analyses using the full-length protein sequence and not just the NB-ARC domain as generally done with NLRs ( Figure 2, Supplemental Figure 3). These analyses yielded a robust ZAR1 phylogenetic tree with well-supported branches that generally mirrored established phylogenetic relationships between the examined plant species (Smith and Brown, 2018;Chaw et al., 2019). For example, the ZAR1 tree matched a previously published species tree of angiosperms based on 211 single-copy core ortholog genes (Chaw et al., 2019). We conclude that the origin of the ZAR1 gene predates the split between monocots, magnoliids and eudicots and its evolution traced species divergence ever since.
We postulate that ZAR1 probably emerged in the Jurassic era ~220 to 150 million years ago (Mya) based on the species divergence time estimate of Chaw et al. (2019) and consistent with the latest fossil evidence for the emergence of flowering plants (Fu et al., 2018).

ZAR1 is a genetic singleton in a locus that exhibits gene co-linearity across eudicot species
NLR genes are often clustered in loci that are thought to accelerate sequence diversification and evolution (Michelmore and Meyers, 1998;Lee and Chae, 2020). We examined the genetic context of ZAR1 genes using available genome assemblies of taro, stout camphor, columbine, Arabidopsis, cassava, sugar beet, tomato and N. benthamiana. The ZAR1 locus is generally devoid of other NLR genes as the closest NLR is found in the Arabidopsis genome 183 kb away from ZAR1 (Supplemental Data Set 3). We conclude that ZAR1 has probably remained a genetic singleton NLR gene throughout its evolutionary history in angiosperms.
Next, we examined the ZAR1 locus for gene co-linearity across the examined species. We noted a limited degree of gene co-linearity between Arabidopsis vs. cassava, cassava vs. tomato, and tomato vs. N. benthamiana (Supplemental Figure 4). Flanking conserved genes include the ATPase and protein kinase genes that are present at the ZAR1 locus in both rosid and asterid eudicots. In contrast, we didn't observe conserved gene blocks at the ZAR1 locus of taro, stout camphor and columbine, indicating that this locus is divergent in these species.
Overall, although limited, the observed gene co-linearity in eudicots is consistent with the conclusion that ZAR1 is a genetic singleton with an ancient origin.

ZAR1 orthologs carry sequence motifs known to be required for Arabidopsis ZAR1 resistosome function
The overall sequence conservation and deep evolutionary origin of ZAR1 orthologs combined with the detailed knowledge of ZAR1 structure and function provide a unique opportunity to explore the evolutionary dynamics of this ancient immune receptor in a manner that cannot be applied to more rapidly evolving NLRs. We used MEME (Multiple EM for Motif Elicitation) (Bailey and Elkan, 1994) to search for conserved sequence patterns among the 117 ZAR1 orthologs (ZAR1 and ZAR1-ID) that encode full-length CC-NLR proteins. This analysis revealed several conserved sequence motifs that span across the ZAR1 orthologs (range of protein lengths: 753-1132 amino acids) ( Figure 3A, Supplemental table 1). In Figure 3A, we described the major five sequence motifs or interfaces known to be required for Arabidopsis ZAR1 function that are conserved across ZAR1 orthologs.
Effector recognition by ZAR1 occurs indirectly via binding to RLCKs through the LRR domain.
Key residues in the Arabidopsis ZAR1-RLCK interfaces are highly conserved among ZAR1 orthologs and were identified by MEME as conserved sequence patterns ( Figure 3A). Valine (V) 544, histidine (H) 597, tryptophan (W) 825 and phenylalanine (F) 839 in the Arabidopsis ZAR1 LRR domain were validated by mutagenesis as important residues for RLCK binding whereas isoleucine (I) 600 was not essential (Wang et al. 2019a;Hu et al. 2020). In the 117 ZAR1 orthologs, V544, H597, W825 and F839 are conserved in 97-100% of the proteins compared to only 63% for I600.
After effector recognition, Arabidopsis ZAR1 undergoes conformational changes from monomeric inactive form to oligomeric active state. This is mediated by ADP release from the NB-ARC domain and subsequent ATP binding, which triggers further structural remodelling in ZAR1 leading to the formation of the activated pentameric resistosome (Wang et al. 2019b). NB-ARC sequences that coordinate binding and hydrolysis of dATP, namely P-loop and MHD motifs, are highly conserved across ZAR1 orthologs ( Figure 3A). Histidine (H) 488 and lysine (K) 195, located in the ADP/ATP binding pocket (Wang et al. 2019a;Wang et al. 2019b), are invariant in all 117 orthologs. In addition, three NB-ARC residues, W150, S152 and V154, known to form the NBD-NBD oligomerisation interface for resistosome formation (Wang et al. 2019b;Hu et al. 2020), are present in 82-97% of the ZAR1 orthologs and were also part of a MEME motif ( Figure 3A).
The N-terminal CC domain of Arabidopsis ZAR1 mediates cell death signalling thorough the form a funnel like structure (Baudin et al., 2017;Wang et al. 2019b;Adachi et al., 2019b). We detected an N-terminal MEME motif that matches the α1 helix/MADA motif ( Figure 3A). We also used the HMMER software (Eddy, 1998) to query the ZAR1 orthologs with a previously reported MADA motif-Hidden Markov Model (HMM) (Adachi et al., 2019b).
This HMMER search detected a MADA-like sequence at the N-terminus of all 117 ZAR1 orthologs (Supplemental Data Set 1).
Taken together, based on the conserved motifs depicted in Figure 3A, we propose that angiosperm ZAR1 orthologs share the main functional features of Arabidopsis ZAR1: 1) effector recognition via RLCK binding, 2) remodelling of intramolecular interactions via ADP/ATP switch, 3) oligomerisation via the NBD-NBD interface and 4) α1 helix/MADA motif-mediated activation of hypersensitive cell death.

ZAR1 resistosome displays conserved surfaces on RLCK binding sites and the inner glutamate ring
To identify additional conserved and variable features in ZAR1 orthologs, we used ConSurf (Ashkenazy et al., 2016) to calculate a conservation score for each amino acid and generate a diversity barcode for ZAR1 orthologs ( Figure 3B). The overall pattern is that the 117 ZAR1 orthologs are fairly conserved. Nonetheless, the CC domain (except for the N-terminal MADA motif and a few conserved stretches), the junction between the NB-ARC and LRR domains and the very C-terminus were distinctly more variable than the rest of the protein ( Figure 3B).
We also used the cryo-EM structures of Arabidopsis ZAR1 to determine how the ConSurf score map onto the 3D structures ( Figure 3C, D and Figure 4). First, we found five major variable surfaces (VS1 to VS5) on the inactive ZAR1 monomer structure ( Figure 3C, D), as depicted in the ZAR1 diversity barcode ( Figure 3B). VS1 comprises α2/α4 helices and a loop between α3 and α4 helices of the CC domain. VS2 and VS3 corresponds to α1/α2 helices of NBD and a loop between α2 and α3 helices of HD1, respectively. VS4 comprises a loop between WHD and LRR and first three helices of the LRR domain. VS5 is mainly derived from the last three helices of the LRR domain and the loops between these helices ( Figure   3B, D).
Next, we examined highly conserved surfaces on inactive and active ZAR1 structures ( Figure 4A, B). Consistent with the MEME analyses, we confirmed that highly conserved surfaces match to the RLCK binding interfaces ( Figure 4A, B). We also confirmed that the N-terminal α1 helix/MADA motif is conserved on the resistosome surfaces, although the first four N-terminal amino acids are missing from the N terminus of the active ZAR1 cryo-EM structures ( Figure 4B, C). We also noted sequence conservation at the glutamate rings (comprised of E11, E18, E130 and E134) inside the Arabidopsis ZAR1 resistosome (Supplemental Figure 5). Glutamic acid (E) 11 is conserved in 94% of ZAR1 orthologs, whereas only 3-18% retain E18, E130 and E134 in the same positions as Arabidopsis ZAR1.
Interestingly, mutation of E11 to alanine (A) impaired Arabidopsis ZAR1-mediated cell death, but the E18A, E130A and E134A mutants were capable of inducing cell death (Bai et al., 2021). Furthermore, the E11A mutation impaired Ca 2+ channel activity of the ZAR1 resistosome in vitro and in vivo (Bai et al., 2021). Therefore, our motif and structure analyses suggest that RLCK-mediated effector recognition and E11-dependent Ca 2+ influx are key functional features conserved across the great majority of ZAR1 orthologs.

ZAR1 interaction sites are conserved in ZED1-related kinase (ZRK) family proteins across distantly related plant species
We endeavored to experimentally test the hypothesis that ZAR1 ortholog proteins across angiosperm species require RLCKs to activate their molecular switch. First, we searched for RLCK XII-2 subfamily genes in the distantly related plant species, taro, stout camphor and columbine. The BLAST searches of protein databases were seeded with previously identified RLCK ZED1-related kinase (ZRK) sequences from Arabidopsis and N. benthamiana (Lewis et al., 2013;Schultink et al. 2019). We also performed iterated phylogenetic analyses using the kinase domain of the harvested ZRK-like sequences and obtained a well-supported clade that includes previously reported ZRK from Arabidopsis (ZRK1~7, 10~15) and N. benthamiana (JIM2) as well as new clade members from taro, stout camphor and columbine ( Figure 5A). In total, we identified 21 ZRK genes in these species, which include 1 ZRK gene (CeZRK1) from taro, 15 ZRK genes (CmZRK1~15) from stout camphor and 5 ZRK genes (AcZRK1~5) from columbine ( Figure 5A, Supplemental Data Set 4).
Remarkably, similar to Arabidopsis ZRKs (Lewis et al., 2013), a number of the identified ZRKs are located in genomic clusters. 13 ZRK genes in stout camphor and 4 ZRK genes in columbine form gene clusters on scaffold QPKB01000003.1 and contig KZ305039.1, respectively ( Figure 5B). All of the identified ZRK genes are located on different scaffold or contig with ZAR1 gene in taro, stout camphor and columbine, whereas Arabidopsis ZAR1 and the 9 ZRK genes occur on the same chromosome (Supplemental Data Set 4).
The 21 ZRK genes code for proteins of 277-452 amino acids, similar to Arabidopsis and N.
ZRK family proteins from taro, stout camphor and columbine show 20.8 to 42.2% similarity to Arabidopsis ZRK1 (Supplemental Data Set 6). Although the sequence similarity is low across the ZRK proteins in angiosperms, ZAR1 interaction sites are highly conserved in the ZRKs (Supplemental Figure 6) (Wang et al., 2019a;Hu et al., 2020). Notably, functionally validated residues for ZAR1-RLCK interactions [glycine 27 (G27) and leucine 31 (L31) in Arabidopsis Moreover, 90% of the 21 ZRKs have a hydrophobic V or I residue at the same position to V35 in Arabidopsis ZRK1 (corresponding to I24 in Arabidopsis ZRK5). This sequence conservation supports our hypothesis that ZRK family proteins function together with ZAR1 across distantly related plant species.
Stout camphor and columbine ZED1-related kinase (ZRK) proteins positively regulate ZAR1 autoactive cell death N. benthamiana ZAR1 (NbZAR1) requires its partner RLCK JIM2 to trigger autoimmune cell death in the N. benthamiana expression system (Harant et al. 2021). Therefore, we hypothesized that the ZRK family genes are required for ZAR1 to trigger cell death response in angiosperms. To test this hypothesis, we cloned wild-type ZAR1 and ZRK genes from taro, stout camphor and columbine, and also generated autoactive ZAR1 mutants by following the mutation strategy we previously used for the NbZAR1 MHD motif mutant (NbZAR1 D481V ) (Harant et al. 2021). We found that MHD mutants of taro ZAR1 (CeZAR1 D487V ) and columbine ZAR1 (AcZAR1 D489V ), but not stout camphor ZAR1 (CmZAR1 D488V ), induce autoacitive cell death in N. benthamiana leaves ( Figure 6). Whereas CeZRK1 expression did not alter cell death activity of CeZAR1 D487V , AcZRK1, AcZRK3, AcZRK4 and AcZRK5, but not AcZRK2, enhanced the cell death response by AcZAR1 D489V (Figure 6A   although alternative evolutionary scenarios such as a common origin followed by subsequent deletion of the ID or lineage sorting remain possible (Supplemental Figure 9).
We annotated all the C-terminal extensions as thioredoxin-like using InterProScan (Trx, IPR036249; IPR013766; cd02989). The integrated Trx domain sequences share sequence similarity to each other ( Figure 7B). They are also similar to Arabidopsis AT3G50960 (phosphoducin-like PLP3a; 34.8-90% similarity to integrated Trx domains), which is located immediately downstream of ZAR1 in a tail-to-tail configuration in the Arabidopsis genome (Supplemental Figure 10). We also noted additional genetic linkage between ZAR1 and Trx genes in other rosid species, namely field mustard, orange, cacao, grapevine and apple, and in the asterid species coffee (Supplemental Data Set 7). We conclude that ZAR1 is often genetically linked to a PLP3a-like Trx domain gene and that the integrated domain in ZAR1- Supplemental Data Set 8). Three out of 129 genes are from the early diverging eudicot clade Ranunculales species, namely columbine, Macleaya cordata (plume poppy) and Papaver somniferum (opium poppy). The remaining ZAR1-SUB are spread across rosid and asterid species. We found that 11 species have ZAR1-SUB genes but lack a ZAR1 ortholog (Supplemental Data Set 2). These 11 species include two of the early diverging eudicots plume poppy and opium poppy, and the Brassicales Carica papaya (papaya). Interestingly, papaya is the only Brassicales species carrying a ZAR1-SUB gene, whereas the 16 other Brassicales species have ZAR1 but lack ZAR1-SUB genes (Supplemental Data Set 2). In total, we didn't detect ZAR1-SUB genes in 44 species that have ZAR1 orthologs, and these 44 species include the monocot taro, the magnoliid stout camphor and 42 eudicots, such as Arabidopsis, sugar beet and N. benthamiana (Supplemental Data Set 2).
In summary, given the taxonomic distribution of the ZAR1-SUB clade genes, we propose that ZAR1-SUB has emerged from a single duplication event of ZAR1 prior to the split between

ZAR1-SUB paralogs have significantly diverged from ZAR1
We investigated the sequence patterns of ZAR1-SUB proteins and compared them to the sequence features of canonical ZAR1 proteins that we identified earlier ( Figure 3A). MEME analyses revealed several conserved sequence motifs (Supplemental Table 2). Especially, the MEME motifs in the ZAR1-SUB NB-ARC domain were similar to ZAR1 ortholog motifs (Supplemental Table 3). These include P-loop and MHD motifs, which are broadly conserved in NB-ARC of 97% and 100% of the ZAR1-SUB NLRs, respectively ( Figure 9A). MEME also revealed sequence motifs in the ZAR1-SUB LRR domain that partially overlaps in position with the conserved ZAR1-RLCK interfaces ( Figure 9A, Supplemental Figure 12). However, the ZAR1-SUB MEME motifs in the LRR domain were variable at the ZAR1-RLCK interface positions compared to ZAR1, and the motif sequences were markedly different between ZAR1-SUB and ZAR1 proteins ( Figures 3A, 9A).
Remarkably, unlike ZAR1 orthologs, MEME did not predict conserved sequence pattern from a region corresponding to the MADA motif, indicating that these sequences have diverged across ZAR1-SUB proteins ( Figure 9A). We confirmed the low frequency of MADA motifs in ZAR1-SUB proteins using HMMER searches with only ~30% (38 out of 129) of the tested proteins having a MADA-like sequence (Supplemental Data Set 8, Figure 8). Moreover, conserved sequence patterns were not predicted for the NBD-NBD interface and the conserved underside surface of the ZAR1 resistosome ( Figure 9A, Supplemental Figure 12). This indicates that the NB-ARC domain of ZAR1-SUB proteins is highly diversified in contrast to the relatively conserved equivalent region of ZAR1 proteins. We generated a diversity barcode for ZAR1-SUB proteins using the ConSurf as we did earlier with ZAR1 orthologs ( Figure 9B). This revealed that there are several conserved sequence blocks in each of the CC, NB-ARC and LRR domains, such as the regions corresponding to P-loop, MHD motif and the equivalent of the ZAR1-RLCK interfaces. Nonetheless, ZAR1-SUB proteins are overall more diverse than ZAR1 orthologs especially in the CC domain, including the N-terminal MADA motif, and the NBD/HD1 regions of the NB-ARC domain where the NBD-NBD interface is located.
Next, we mapped the ConSurf conservation scores onto a homology model of a representative ZAR1-SUB protein (XP_004243429.1 from tomato) built based on the Arabidopsis ZAR1 cryo-EM structures (Supplemental Figure 13). As highlighted in Supplemental Figure 13B and C, conserved residues, such as MHD motif region in the WHD, are located inside of the monomer and resistosome structures. Interestingly, although the prior MEME prediction analyses revealed conserved motifs in positions matching the ZAR1-RLCK interfaces in the LRR domain, the ZAR1-SUB structure homology models displayed variable surfaces in this region (Supplemental Figure 13A). This indicates that the variable residues within these sequence motifs are predicted to be on the outer surfaces of the LRR domain and may reflect interaction with different ligands.
Taken together, these results suggest that unlike ZAR1 orthologs, the ZAR1-SUB paralogs have divergent molecular patterns for regions known to be involved in effector recognition, resistosome formation and activation of hypersensitive cell death.

Cinnamomum micranthum (stout camphor) genome
The ZAR1-CIN clade, identified by phylogenetic analyses as a sister clade to ZAR1 and ZAR1-SUB, consists of 11 genes from the magnoliid species stout camphor ( Figure 1B,   Figure 8, Supplemental Data Set 9). 8 of the 11 ZAR1-CIN genes code for canonical CC-NLR proteins with 63.8 to 98.9% sequence similarities to each other, whereas the remaining 3 genes code for truncated NLR proteins. Interestingly, all ZAR1-CIN genes occur in a ~500 kb cluster on scaffold QPKB01000005.1 of the stout camphor genome assembly (GenBank assembly accession GCA_003546025.1) (Supplemental Figure 14A, B). This scaffold also contains the stout camphor ZAR1 ortholog (CmZAR1, RWR84015), which is located 48 Mb from the ZAR1-CIN cluster (Supplemental Figure 14 A, B). Based on the observed phylogeny and gene clustering, we suggest that the ZAR1-CIN cluster emerged from segmental duplication and expansion of the ancestral ZAR1 gene after stout camphor split from the other examined ZAR1 containing species.
We examined the expression of the eleven CmZAR1 and ZAR1-CIN genes in seven tissues of C. micranthum based on the data of Chaw et al. (Chaw et al., 2019). The CmZAR1 gene is relatively highly expressed in seven different tissues of the stout camphor tree (Supplemental Figure 14C). In contrast, only five of the eleven ZAR1-CIN genes displayed detectable expression levels. Of these, two ZAR1-CIN genes (RWR85656 and RWR85657) had different expression patterns across the tissues. Whereas RWR85657 had the highest expression level in flowers, RWR85656 displayed the highest expression levels in stem and old leaf tissues (Supplemental Figure 14C). The implications of these observations remain unclear but may reflect different degrees of tissue specialization of the ZAR1-CIN genes.

Tandemly duplicated ZAR1-CIN display variable ligand binding interfaces on the LRR domain
We performed MEME and ConSurf analyses of the 8 intact ZAR1-CIN proteins as described above for ZAR1 and ZAR1-SUB. The ConSurf barcode revealed that although ZAR1-CIN proteins are overall conserved, their WHD region and LRR domain include some clearly variable blocks ( Figure 9B). MEME analyses of ZAR1-CIN sequences revealed that like However, the sequence consensus at the NBD-NBD and ZAR1-RLCK interfaces indicated these motifs are more variable among ZAR1-CIN proteins relative to ZAR1 orthologs, and the motif sequences were markedly different from the matching region in ZAR1 (Figures 3A,   9C).
We also mapped the ConSurf conservation scores onto a homology model of a representative ZAR1-CIN protein (RWR85656.1) built based on the Arabidopsis ZAR1 cryo-EM structures (Supplemental Figure 13). This model revealed several conserved surfaces, such as on the α1 helix in the CC domain and the WHD of the NB-ARC domain (Supplemental Figure 13B, C). In contrast, the ZAR1-CIN structure homology models displayed highly varied surfaces especially in the LRR region matching the RLCK binding interfaces of ZAR1 (Supplemental Figure 13A). This sequence diversification on the LRR surface suggests that the ZAR1-CIN paralogs may have different host partner proteins and/or effector recognition specificities compared to ZAR1.

DISCUSSION
This study of ZAR1 macroevolution originated from phylogenomic analyses we initiated during the UK COVID-19 lockdown of March 2020. We performed iterated comparative sequence similarity searches of plant genomes using the CC-NLR immune receptor ZAR1 as a query, and subsequent phylogenetic evaluation of the recovered ZAR1-like sequences.
This revealed that ZAR1 is an ancient gene with 120 orthologs recovered from 88 species including monocot, magnoliid and eudicot plants. ZAR1 is an atypically conserved NLR in these species with the gene phylogeny tracing species phylogeny, and consistent with the view that ZAR1 originated early in angiosperms during the Jurassic geologic period ~220 to 150 Mya ( Figure 10). The ortholog series enabled us to determine that resistosome sequences that are known to be functionally important and have remained highly conserved throughout the long evolutionary history of ZAR1. In addition, we experimentally validated the model that ZAR1 has been a partner with RLCKs for over 150 Mya through functional reconstruction of ZAR1-RLCK pairs from distantly related plant species (Figure 6). The main unexpected feature among ZAR1 orthologs is the acquisition of a C-terminal thioredoxin-like domain in cassava and cotton species (Figure 7). Our phylogenetic analyses also indicated that ZAR1 duplicated twice throughout its evolution (Figure 10). In the eudicots, ZAR1 spawned a large paralog family, ZAR1-SUB, which greatly diversified and often lost the typical sequence features of ZAR1. A second paralog, ZAR1-CIN, is restricted to a tandemly repeated 11-gene cluster in stout camphor. Overall, our findings map patterns of functional conservation, expansion and diversification onto the evolutionary history of ZAR1 and its paralogs.
ZAR1 most likely emerged prior to the split between monocots, Magnoliids and eudicots, which corresponds to ~220 to 150 Mya based on the dating analyses of Chaw et al. (2019).
The origin of the angiosperms remains hotly debated with uncertainties surrounding some of the fossil record coupled with molecular clock analyses that would benefit from additional genome sequences of undersampled taxa (Coiro et al., 2019). Recently, Fu et al. (2018) provided credence to an earlier emergence of angiosperms with the discovery of the fossil flower Nanjinganthus dendrostyla, which places the emergence of flowering plants at the Early Jurassic. It is tempting to speculate that ZAR1 emerged among these early flowering plants during the period when dinosaurs dominated planet earth.

Figure 10. Evolution of ZAR1 and the paralogs in angiosperms.
We propose that the ancestral ZAR1 gene has emerged ~220 to 150 million years ago (Mya) before monocot and eudicot lineages split. ZAR1 gene is widely conserved CC-NLR in angiosperms, but it is likely that ZAR1 has lost in a monocot lineage, Commelinales. A sister clade paralog ZAR1-SUB has emerged early in the eudicot lineages and may have lost in Caryophyllales. Another sister clade paralog ZAR1-CIN has duplicated from ZAR1 gene and expanded in the Magnoliidae C. micranthum. Trx domain integration to C terminus of ZAR1 has independently occurred in few rosid lineages.
NLRs are notorious for their rapid and dynamic evolutionary patterns even at the intraspecific level. In sharp contrast, ZAR1 is an atypical core NLR gene conserved in a wide range of angiosperm species (Figure 2). Nevertheless, Arabidopsis ZAR1 can recognize diverse bacterial pathogen effectors, including five different effector families distributed among nearly half of a collection of ~500 Pseudomonas syringae strains (Laflamme et al., 2020) and an effector AvrAC from Xanthomonas campestris (Wang et al., 2015). How did ZAR1 remain conserved throughout its evolutionary history while managing to detect a diversity of effectors? The answer to the riddle lies in the fact that ZAR1 effector recognition occurs via its partner RLCKs. ZRKs of the RLCK XII-2 subfamily rest in complex with inactive ZAR1 proteins and bait effectors by binding them directly or by recruiting other effector-binding RLCKs, such as the family VII PBS1-like protein 2 (PBL2) (Lewis et al., 2013;Wang et al., 2015). These ZAR1-associated RLCKs are highly diversified not only in Arabidopsis (Lewis et al., 2013), but also in stout camphor and columbine, where RLCK XII-2 members occurring in expanded ZRK gene clusters ( Figure 5). In the Arabidopsis ZRK cluster, RKS1/ZRK1 is required for recognition of X. campestris effector AvrAC (Wang et al., 2015) and ZRK3 and ZRK5/ZED1 are required for recognition of P. syringae effectors HopF2a and HopZ1a, respectively (Lewis et al., 2013;Seto et al., 2017). Therefore, as in the model discussed by Schultink et al. (2019), RLCKs appear to have evolved as pathogen 'sensors' whereas ZAR1 acts as a conserved signal executor to activate immune response.
The MEME and ConSurf analyses are consistent with the model of ZAR1/RLCK evolution described above. ZAR1 is not just exceptionally conserved across angiosperms but it has also preserved sequence patterns that are key to resistosome-mediated immunity (Figures 3   and 4). Within the LRR domain, ZAR1 orthologs display highly conserved surfaces for RLCK binding (Figure 4). We conclude that ZAR1 has been guarding host kinases throughout its evolution ever since the Jurassic period. These findings strikingly contrast with observations recently made by Prigozhin and Krasileva (2020) on highly variable Arabidopsis NLRs (hvNLRs), which tend to have diverse LRR sequences. For instance, the CC-NLR RPP13 displays variable LRR surfaces across 62 Arabidopsis accessions, presumably because these regions are effector recognition interfaces that are caught in arms race coevolution with the oomycete pathogen Hyaloperonospora arabidopsidis (Prigozhin and Krasileva, 2020).
The emerging view is that the mode of pathogen detection (direct vs indirect recognition) drives an NLR evolutionary trajectory by accelerating sequence diversification at the effector binding site or by maintaining the binding interface with the partner guardee/decoy proteins (Prigozhin and Krasileva, 2020).
Our functional validation of ZAR1 and ZRKs from distantly related plant species supported the model that ZRKs function together with ZAR1 to trigger immune response in planta ( Figure 6). 11 of the 19 tested ZRKs were either required or enhanced ZAR1 autoactivity in N.
Notably, CmZRK4, CmZRK5 and CmZRK12 have N-terminal truncation or mutations at the ZAR1 interaction sites identified from the Arabidopsis ZAR1-ZRK studies (Supplemental Data Set 6). Therefore, some of the ZRK members may have lost their association with ZAR1 through deletion or mutations. Taro and columbine ZAR1 could trigger autoactive cell death without their partner RLCKs in N. benthamiana, whereas stout camphor and N. benthamiana ZAR1 proteins require ZRKs to trigger the cell death response (Figure 6; Supplemental Figure 7) (Harant et al., 2021). In the case of taro and columbine, ZRKs may trigger conformational changes of ZAR1 after recognition of cognate pathogen effectors. In this scenario, autoactive ZAR1 could form a resistosome without ZRK proteins, thereby triggering the observed cell death response. In the future, further comparative biochemical studies would further inform our understanding of how ZAR1-ZRK interactions contribute to resistosome formation across angiosperms. ZAR1 orthologs display a patchy distribution across angiosperms (Supplemental Data Set 1).
Given the low number of non-eudicot species with ZAR1 it is challenging to develop a conclusive evolutionary model. Nonetheless, the most parsimonious explanation is that ZAR1 was lost in the monocot Commelinales lineage (Figure 10, Supplemental Data Set 1).
ZAR1 is also missing in some eudicot lineages, notably Fabales, Cucurbitales, Apiales and Asterales (Supplemental Data Set 1). Cucurbitaceae (Cucurbitales) species are known to have reduced repertoires of NLR genes possibly due to low levels of gene duplications and frequent deletions (Lin et al., 2013). ZAR1 may have been lost in this and other plant lineages as part of an overall shrinkage of their NLRomes or as consequence of selection against autoimmune phenotypes triggered by NLR mis-regulation (Karasov et al., 2017;Adachi et al., 2019a). In the future, it would be interesting to investigate the repertoires of RLCK subfamilies VII and XII in species that lack ZAR1 orthologs.
We unexpectedly discovered that some ZAR1 orthologs from cassava and cotton species carry a C-terminal thioredoxin-like domain (ZAR1-ID in Figure 7). What is the function of these integrated domains? The occurrence of unconventional domains in NLRs is relatively frequent and ranges from 5 to 10% of all NLRs. In several cases, integrated domains have emerged from pathogen effector targets and became decoys that mediate detection of the effectors (Kourelis and van der Hoorn, 2018). Whether or not the integrated Trx domain of ZAR1-ID functions to bait effectors will need to be investigated. Since ZAR1-ID proteins still carry intact RLCK binding interfaces (Supplemental Data Set 10), they may have evolved dual or multiple recognition specificities via RLCKs and the Trx domain. In addition, all ZAR1-ID proteins have an intact N-terminal MADA motif (Supplemental Figure 9), suggesting that they probably can execute the hypersensitive cell death through their N-terminal CC domains even though they carry a C-terminal domain extension (Adachi et al., 2019b).
However, we noted multiple splice variants of the ZAR1-ID gene of cassava, some of which lack the Trx integration (Supplemental Figure 8). It is possible that both ZAR1 and ZAR1-ID isoforms are produced, potentially functioning together as a pair of sensor and helper NLRs.
Our sequence analyses of ZAR1-ID indicate that the integrated Trx domain originates from the PLP3 phosphoducin gene, which is immediately downstream of ZAR1 in the Arabidopsis genome and adjacent to ZAR1 in several other eudicot species (Supplemental Figure 10).
Whether or not PLP3 plays a role in ZAR1 function and the degree to which close genetic linkage facilitated domain fusion between these two genes are provocative questions for future studies.
ZAR1 spawned two classes of paralogs through two independent duplication events. The ZAR1-SUB paralog clade emerged early in the eudicot lineage-most likely tens of millions of years after the emergence of ZAR1-and has diversified into at least 129 genes in 55 species (Figure 10). ZAR1-SUB proteins are distinctly more diverse in sequence than ZAR1 orthologs and generally lack key sequence features of ZAR1, like the MADA motif and the NBD-NBD oligomerisation interface ( Figure 9) (Adachi et al., 2019b;Wang et al. 2019b;Hu et al. 2020). This pattern is consistent with 'use-it-or-lose-it' evolutionary model, in which NLRs that specialize for pathogen detection lose some of the molecular features of their multifunctional ancestors (Adachi et al., 2019b). Therefore, we predict that many ZAR1-SUB proteins evolved into specialized sensor NLRs that require NLR helper mates for executing the hypersensitive response. It is possible that ZAR1-SUB helper mate is ZAR1 itself, and that these NLRs evolved into a phylogenetically linked network of sensors and helpers similar to the NRC network of asterid plants (Wu et al., 2017). However, 11 species have a ZAR1-SUB gene but lack a canonical ZAR1 (Supplemental Data Set 2), indicating that these ZAR1-SUB NLRs may have evolved to depend on other classes of NLR helpers.
How would ZAR1-SUB sense pathogens? Given that the LRR domains of most ZAR1-SUB proteins markedly diverged from the RLCK binding interfaces of ZAR1, it is unlikely that ZAR1-SUB proteins bind RLCKs in a ZAR1-type manner (Supplemental Figure 13). This leads us to draw the hypothesis that ZAR1-SUB proteins have diversified to recognize other ligands than RLCKs. In the future, functional investigations of ZAR1-SUB proteins could provide insights into how multifunctional NLRs, such as ZAR1, evolve into functionally specialized NLRs.
The ZAR1-CIN clade consists of 11 clustered paralogs that are unique to the magnoliid species stout camphor as revealed from the genome sequence of the Taiwanese small-flowered camphor tree (also known as Cinnamomum kanehirae, Chinese name niu zhang 牛樟) (Chaw et al., 2019). This cluster probably expanded from ZAR1, which is ~48 Mbp on the same genome sequence scaffold (Supplemental Figure 14). The relatively rapid expansion pattern of ZAR1-CIN into a tandemly duplicated gene cluster is more in line with the classical model of NLR evolution compared to ZAR1 maintenance as a genetic singleton over tens of millions of years (Michelmore and Meyers, 1998). ZAR1-CIN proteins may have neofunctionalized after duplication, acquiring new recognition specificities as consequence of coevolution with host partner proteins and/or pathogen effectors. Consistent with this view, ZAR1-CIN exhibit different patterns of gene expression across tissues (Supplemental Figure   14). Moreover, ZAR1-CIN proteins display distinct surfaces at the ZAR1-RLCK binding interfaces and may bind to other ligands than RLCKs as we hypothesized above for ZAR1-SUB (Supplemental Figure 13). ZAR1-CIN could be viewed as intraspecific highly variable NLRs (hvNLR) per the nomenclature of Prigozhin and Krasileva (2020).  Figure 15). Therefore, ZAR1-CIN may form resistosome-type complexes that are independent of ZAR1. One intriguing hypothesis is that ZAR1-CIN may associate with each other to form heterocomplexes of varying complexity and functionality operating as an NLR receptor network. In any case, the clear-cut evolutionary trajectory from ZAR1 to the ZAR1-CIN paralog cluster provides a robust evolutionary framework to study functional transitions and diversifications in this CC-NLR lineage.
In summary, our phylogenomics analyses raise several intriguing questions about ZAR1 evolution. The primary conclusion we draw is that ZAR1 is an ancient CC-NLR that has been a partner with RLCKs ever since the Jurassic Period. We propose that throughout at least 150 million years, ZAR1 has maintained its molecular features for sensing pathogens via RLCKs and activating hypersensitive cell death. Further comparative analyses, combining molecular evolution and structural biology, of plant resistosomes and between resistosomes and the apoptosomes and inflammasome of animal NLR systems (Wang and Chai, 2020) will yield novel experimentally testable hypotheses for NLR research.

ZAR1 and ZRK sequence retrieval
We performed BLAST (Altschul et al., 1990) using previously identified ZAR1 and ZRK sequences as queries (Lewis et al., 2013;Baudin et al., 2017;Schultink et al., 2019;Harant et al., 2021)  we used cut-offs, percent identity ≥ 30% and query coverage ≥ 80%. The BLAST pipeline was circulated by using the obtained sequences as new queries to search ZAR1 and ZRK like genes over the angiosperm species. We also performed the BLAST pipeline against a plant NLR dataset annotated by NLR-parser (Steuernagel et al., 2015) from 38 plant reference genome databases (Supplemental Data Set 11).

Phylogenetic analyses
For the phylogenetic analysis, we aligned NLR and ZRK amino acid sequences (Supplemental Data Sets 5, 12 to 16) using MAFFT v.7 (Katoh and Standley, 2013) and manually deleted the gaps in the alignments in MEGA7 (Kumar et al., 2016

Patristic distance analyses
To calculate the phylogenetic (patristic) distance, we used Python script based on DendroPy (Sukumaran and Mark, 2010

Gene co-linearity analyses
To investigate genetic co-linearity at ZAR1 loci, we extracted the 3 genes upstream and downstream of ZAR1 using GFF files derived from reference genome databases (Supplemental Data Set 11). To identify conserved gene blocks, we used gene annotation from NCBI Protein database and confirmed protein domain information based on InterProScan .

Sequence conservation analyses
Full-length NLR sequences of each subfamily ZAR1, ZAR1-SUB or ZAR1-CIN were subjected to motif searches using the MEME (Multiple EM for Motif Elicitation) (Bailey and Elkan, 1994)

Protein structure analyses
The atomic coordinate of ZAR1 (protein data bank accession codes; 6J5T) was downloaded from protein data bank for illustration in ccp4mg. We used the cryo-EM structures of ZAR1 as templates to generate homology models of ZAR1-SUB and ZAR1-CIN. Amino acid sequences of a tomato ZAR1-SUB (XP_004243429.1) and a stout camphor ZAR1-CIN (RWR85656.1) were submitted to Protein Homology Recognition Engine V2.0 (Phyre2) for modelling (Kelley et a., 2015). The coordinates of ZAR1 structure (6J5T) were retrieved from the Protein Data Bank and assigned as modelling template by using Phyre2 Expert Mode.
The resulting models of ZAR1-SUB and ZAR1-CIN, and the ZAR1 structures (6J5T) were illustrated with the ConSurf conservation scores in PyMol.

Plant growth condition
Wild-type N. benthamiana plants were grown in a controlled growth chamber with temperature 22-25 C, humidity 45-65% and 16/8 hr light/dark cycle.

Transient gene-expression and cell death assay
Transient expression of NbZAR1 mutants in N. benthamiana were performed by agroinfiltration according to methods described by Bos et al. (2006). Briefly, four-weeks old N. benthamiana plants were infiltrated with Agrobacterium tumefaciens strains carrying the binary expression plasmids. A. tumefaciens suspensions were prepared in infiltration buffer (10 mM MES, 10 mM MgCl 2 , and 150 μM acetosyringone, pH5.6) and were adjusted to appropriate OD 600 (Supplemental Table 13). Macroscopic cell death phenotypes were scored according to the scale of Segretin et al. (2014) modified to range from 0 (no visible necrosis) to 7 (fully confluent necrosis).

Accession numbers
DNA sequence data used in this study can be found from reference genome or Supplemental Figure 15. Sequence alignment of full-length ZAR1 and ZAR1-CIN proteins.
Supplemental Table 1. List of MEME motifs predicted from ZAR1 in angiosperms.
Supplemental Table 2. List of MEME motifs predicted from ZAR1-SUB.
Supplemental Table 3. Comparison of MEME motifs between ZAR1-SUB and ZAR1.
Supplemental Table 4. List of MEME motifs predicted from ZAR1-CIN.
Supplemental Table 5. Comparison of MEME motifs between ZAR1-CIN and ZAR1.