Coordinated regulation of gene expression in Plasmodium female gametocytes by two transcription factors

Gametocytes play key roles in the Plasmodium lifecycle. They are essential for sexual reproduction as precursors of the gametes. They also play an essential role in parasite transmission to mosquitoes. Elucidation of the gene regulation at this stage is essential for understanding these two processes at the molecular level and for developing new strategies to break the parasite lifecycle. We identified a novel Plasmodium transcription factor (TF), designated as a partner of AP2-FG or PFG. In this article, we report that this TF regulates the gene expression in female gametocytes in concert with another female-specific TF AP2-FG. Upon the disruption of PFG, majority of female-specific genes were significantly downregulated, and female gametocyte lost the ability to produce ookinetes. ChIP-seq analysis showed that it was located in the same position as AP2-FG, indicating that these two TFs form a complex. ChIP-seq analysis of PFG in AP2-FG-disrupted parasites and ChIP-seq analysis of AP2-FG in PFG-disrupted parasites demonstrated that PFG mediates the binding of AP2-FG to a ten-base motif and that AP2-FG binds another motif, GCTCA, in the absence of PFG. In promoter assays, this five-base motif was identified as another female-specific cis-acting element. Genes under the control of the two forms of AP2-FG, with or without PFG, partly overlapped; however, each form had target preferences. These results suggested that combinations of these two forms generate various expression patterns among the extensive genes expressed in female gametocytes.


Introduction
Plasmodium gametocytes are erythrocytic sexual stages essential for parasite transmission to mosquito vectors.Within the blood meal of a mosquito, female and male gametocytes egress host erythrocytes and transform into gametes for fertilization.Zygotes undergo meiotic nuclear division and develop into motile ookinetes to escape from the blood meal to the mosquito midgut epithelium.
The elucidation of gene regulation during this lifecycle stage is important for developing antimalarial strategies that block transmission and prevent the spread of the disease.
Gametocytes develop from a subset of asexual erythrocytic parasites that are committed to sexual reproduction.The transition from the asexual erythrocytic stage to gametocytes occurs in late trophozoites; a subpopulation of trophozoites differentiates into gametocytes for sexual reproduction, and the rest develop into schizonts to continue asexual proliferation.Early gametocytes display no sex-specific features, but after a period of time, which is different among Plasmodium species, they start differentiating into each sex and finally become mature male and female gametocytes that are ready to transform into gametes.In P. berghei, early gametocytes manifest morphologies distinguishable from asexual stages at 18 hpi (hours post-infection of the new red blood cell) and manifest sex-specific features at 20 hpi.They then complete a sex-specific maturation process, during which a sex-specific gene expression repertoire is established, at approximately 24-26 hpi (1).Sex-specific proteomic and transcriptomic analyses in P. berghei have revealed a number of genes specifically expressed in each sex (2)(3)(4).In females, genes necessary for fertilization and protein synthesis are expressed, and the number of transcripts is stored in an untranslated state in the cytoplasm to prepare for zygote development (5) (6).In males, genes for flagella-based motility and genome replication are expressed, as well as genes for fertilization.
Recently, it has become clear that transcription factors (TFs) belonging to the AP2 family play a central role in sexual development.AP2-G is essential (7).Disruption of this gene causes the parasite to lose its ability to generate gametocytes.In P. berghei, its expression starts from the late trophozoite at 14-16 hpi and peaks at 18 hpi when sexual stage-specific features first appear (8).
P. berghei AP2-G induces hundreds of genes that contain several TFs important for the development of gametocytes, suggesting that induction of these TFs by AP2-G generates a driving force for gametocytogenesis (8).
AP2-FG is a member of the Plasmodium AP2 family of TFs and is a target genes of AP2-G (8) (9).AP2-FG is a female-specific TF and plays an essential role in the development of female gametocytes.In P. berghei, AP2-FG is expressed from early gametocytes to mature females, and by disruption of AP2-FG, the development of the female gametocytes is impaired, showing immature morphologies and resulting in complete loss of capability to mediate parasite transmission to mosquitoes.This TF binds to a ten-base female-specific cis-acting element and regulates variety of genes include those for fertilization, meiosis, and the development of ookinetes.This broad repertoire of target genes and their essential role in female development suggest that this TF is a master regulator of female development.However, AP2-FG-disrupted parasites can produce ookinetes with decreased numbers and abnormal morphologies, suggesting that additional mechanisms for activating female-specific genes are still functional upon the disruption of AP2-FG.
We explored novel TFs for gametocyte development in the target genes of AP2-G and identified a novel female gametocyte-specific TF.In this paper, we report that this TF, designated as PFG, regulates the gene expression in female gametocytes in concert with AP2-FG.

PFG is a target gene of AP2-G and is expressed in female gametocytes
We screened for novel sequence-specific TFs among target genes that were functionally unannotated in PlasmoDB (https://plasmodb.org/plasmo/app/) using highly conserved amino acid sequences among Plasmodium species as a criterion.We hypothesized that the amino acid sequences of DNA-binding domains of sequence-specific TFs would have been difficult to change during evolution because even a small change in the sequence-specificity of TFs could cause catastrophic effects by global changes in gene expression.Through this screening, a gene encoding a 2709 amino acid protein with two regions highly conserved among Plasmodium was identified (PBANKA0902300, designated as a partner of AP2-FG (PFG; Fig. 1A).This gene is one of the P. berghei genes that were previously identified as genes involved in female gametocyte development (named FD2), based on mass screening combined with single cell RNA-seq (10).The two conserved regions comprised 108-and 139-amino acids, respectively, and each region showed 94% and 91% conservation between P. berghei and P. falciparum (Fig. 1B).The Blastp search (https://blast.ncbi.nlm.nih.gov/Blast.cgi)using these regions revealed that genes homologous to PFG exist broadly among apicomplexan parasites and also in alveolates closely related to Apicomplexa, such as Vitrella brassicaformis (Fig. 1C) (11), whereas a large inter-region between these two regions present in all Plasmodium homologs was not observed in these proteins (Fig. 1A).Based on domain prediction using the SMART database (http://smart.embl-heidelberg.de/),no functionally characterized domains were identified to be significantly similar to these two regions.
To investigate the expression of PFG, we generated parasites expressing GFP-fused PFG (PFG::GFP parasites; Fig. S1A).Fluorescence analysis showed that this gene is expressed only in female gametocytes and is localized in the nucleus (Fig. 1D), suggesting its involvement in the transcriptional regulation of female gametocytes.According to sex-specific transcriptome data (3), the PFG is specifically transcribed in female gametocytes.We examined the target genes of the female-specific transcriptional activator AP2-FG and found that it had been predicted as a target of AP2-FG and harbored AP2-FG peaks with binding motifs in the upstream region (Fig. 1E).These results suggest that PFG was activated in two steps, that is, by AP2-G and then by AP2-FG, during female development.Temporal profiling of the expression of PFG using PFG::GFP parasites showed that the expression became visible from 20 hpi (hours post-erythrocyte infection) (Fig. 1F).
The timing of the expression was approximately four hours later than that of AP2-FG, which started at 16 hpi (9).

PFG is essential for female development
We generated PFG-disrupted (PFG(-)) parasites to assess the role of PFG (Fig. S1B).The asexual stages of PFG(-) parasites proliferated as wild-types (Fig. 2A), and morphologically mature male and female gametocytes were observed on Giemsa-stained blood smears (Fig. 2B).However, in mosquito transmission experiments, no oocysts were observed on the mosquito midgut wall 14 days after the infective blood meal in two independent clones (Fig. 2C).In ookinete cultures, only approximately 20% of the female gametocytes were converted into zygotes, and no bananashaped or retort-form ookinetes were generated (Fig. 2D).Considering the female-specific expression of PFG, these results suggest that disruption of PFG significantly impaired the development of female gametocytes.In the cross-fertilization experiment, ookinetes were formed when crossing with normal females (P48/45(-)) but not when crossing with normal males (P47(-)), demonstrating that the abnormal phenotype observed in PFG(-) was indeed derived from female gametocytes (Fig. 2E).
Next, we generated PFG(-) parasites from transgenic P. berghei ANKA 820 cl1m1cl1 (here called 820) parasites, which express red fluorescent protein (RFP) under the control of the femalespecific promoter (CCP2) and GFP under the control of the male-specific promoter (dynein gene) (PFG(-) 820 parasites, Fig. S1B) (12) (13).Although female gametocytes with mature morphology were observed on Giemsa-stained blood smears as the original 820 parasites, in fluorescenceactivated cell sorting (FACS) analysis, no RFP-positive parasites were detected in the PFG(-) 820 parasites (Fig. 2F).In contrast, GFP-positive parasites were observed in them as in the original 820 parasites (Fig. 2F).These results indicated that the promoter activity of the female-specific CCP2 gene used for RFP expression in 820 parasites was severely reduced by the disruption of PFG.

Female-specific genes are globally downregulated in PFG(-) parasites
The above results suggest that PFG encodes a TF that plays an important role in female-specific gene expression.To investigate the effect of PFG disruption on transcriptional regulation in female gametocytes, we performed RNA-seq analyses of the wild-type and PFG(-) parasites (Table S1).Mice were treated with phenylhydrazine prior to the passage of parasites, and after infection, asexual stages were killed by treatment with sulfadiazine in drinking water.Three independent samples were prepared from wild-type and PFG(-) parasites.In PFG(-) parasites, 279 genes were significantly downregulated (log2(fold change) < -2 and p-value adjusted for multiple testing with the Benjamini-Hochberg procedure < 0.001), more than 90% of which were female-specific genes according to sex-specific transcriptome data (3) (Fig. 2G).Consistent with the results of the gene targeting experiment in 820 parasites, transcripts of the CCP2 gene decreased over twenty times in PFG(-)parasites.Genes significantly downregulated in PFG(-) constituted more than half of the female-enriched genes (258/504, Table S1), suggesting that the expression of female-specific genes was globally downregulated in PFG(-) parasites.

PFG is co-localized with AP2-FG on ten-base motif
To examine whether PFG targeted these female-specific genes, ChIP-seq analysis was performed using PFG::GFP parasites and anti-GFP antibodies.In two biologically independent experiments, experiments 1 and 2, 1073 and 1,204 peaks were identified in the genome, respectively (Tables S2A and S2B), and 95% (1029 peaks) of the peaks identified in experiment 1 were common with those identified in experiment 2 (Fig. 3A).These peaks were observed upstream of the genes significantly downregulated in PFG(-) parasites, suggesting that PFG is directly involved in the transcriptional activation of these genes (examples of graphic images are shown in Fig. 3B).
Statistical analysis of genomic sequences around the summits of peaks (common peaks, the same hereafter) showed that binding of PFG to the genome was associated with ten-base motif sequences, TGTRNNYACA (Fig. 3C), the female-specific cis-acting element identified in the ChIPseq of AP2-FG (9).In comparison with the graphical views, the peaks identified in the ChIP-seq of PFG were co-localized with those of AP2-FG (Fig. 3B).An heat map positioned at the peak summit of PFG in the center showed that the position of the peaks was consistent with the ChIP-seq peaks of AP2-FG throughout the genome (Fig. 3D).These results strongly suggest that these two TFs form a complex on the ten-base motif and co-operatively activate their targets.On the other hand, graphic images also showed that some of the AP2-FG peaks lacked the corresponding ChIP-seq peaks of PFG (Fig. 3E).

PFG is essential for AP2-FG binding to ten-base motif
Some sequence-specific TFs form heterodimeric complexes that bind specific DNA sequences.To examine whether this was the case with PFG and AP2-FG, ChIP-seq analyses for each of these two TFs were performed using parasites in which the other was disrupted.
ChIP-seq of PFG in PFG::GFP parasites with the AP2-FG gene disrupted (PFG::GFP AP2- FG (-) , Fig. S1C), 1077 and 881 peaks were obtained, and 847 peaks were common between them (Fig. 4A and Tables S2C and S2D).These peaks showed no apparent changes compared to those obtained for the original PFG::GFP parasites (Fig. 4B).The heat map showed that the positions of PFG peaks obtained in PFG::GFP AP2-FG(-) parasites (common peaks, the same hereafter) were consistent with those obtained for PFG::GFP parasites throughout the genome, and vice versa (Fig. 4C).Statistical analysis showed that the same ten-base motif obtained in the PFG::GFP parasites was enriched around the summits of the PFG peak obtained in PFG::GFP AP2-FG(-) parasites (Fig. 4D).These results indicate that PFG binds to the ten-base motif even in the absence of AP2-FG.
Comparison with PFG peaks showed that upon disruption of PFG, AP2-FG peaks common with PFG peaks disappeared, and only AP2-FG peaks not common with PFG peaks remained (Fig. 4F).
In line with this, statistical analysis showed that the ten-base motif disappeared and instead two distinct motifs, GCTCA and TGCACA, became the most enriched motifs around the summits of these peaks (common peaks, the same hereafter) with a p-value of 2.1 × 10 -27 and 1.3 × 10 -28 , respectively (Fig. 4G).These results suggest that PFG is essential for binding AP2-FG to the tenbase motif, that is, PFG mediates AP-FG binding to the ten-base motif.The results also suggest that AP2-FG binds to the genome in two distinct forms, with and without PFG, and that the former form binds to the ten-base motif, TGTRNNYACA, and the latter binds to other motifs, supposedly GCTCA or TGCACA.In the following, we refer to the form with PFG as cAP2-FG or the complex form, and the form without PFG as sAP2-FG or the single form.

Ten-base motif is essential for binding of PFG to the genome
ChIP-seq of PFG in PFG::GFP AP2-FG (-) parasites showed that PFG binds to the ten-base motif in the absence of AP2-FG.To demonstrate that the ten-base motif is essential for binding PFG to the genome, we performed a ChIP-qPCR assay using transgenic parasites in which the motif upstream of a target was mutated.A CPW-WPC family protein gene (PBANKA_1346300) ( 14) harbors a ChIP-seq peak for PFG and a ten-base motif under the summit of the peak.Three-point mutations were introduced into the motif using the CRISPR/Cas9 system, and GFP was fused to the PFG gene in these parasites (Fig. 4H).Introducing these mutations reduced the %input value to background levels (Fig. 4I), demonstrating that the motif is essential for PFG binding to the genome.
At present, we speculate that PFG directly interacts with genomic DNA through two highly conserved regions; region 1 and region 2.However, these regions are not similar to any DNA binding domains reported thus far.In other apicomplexan orthologues, these two domains are located adjacent to one another in the protein (Fig. 1A).Therefore, these two regions may be separated by a long interval region but constitute a DNA binding domain of PFG as a result of protein folding.

AP2-FG binds to the five-base motifs directly with its AP2 domain
The results of the ChIP-seq analysis of AP2-FG in AP2-FG::GFP PFG (-) parasites suggested that AP2-FG can bind to the genome directly through its AP2 domain.Statistical analysis showed that two distinct motifs, GCTCA and TGCACA, were enriched around the summits of ChIP-seq peaks (Fig. 4G).To investigate whether the AP2 domain of AP2-FG binds to these sequences, we performed DNA immunoprecipitation followed by high-throughput sequencing (DIP-seq) using the recombinant AP2 domain of P. berghei AP2-FG and P. berghei genomic DNA (Table S2G).Around the summits of the peaks obtained by DIP-seq analysis, the five-base motif sequence GCTCA, identical to one of motifs enriched around the summits of ChIP-seq peaks described above, was highly enriched with a p-value of 2.6 × 10 -246 (Fig. 5A).In addition, two motifs, GATCA and ACTCA, which may be variants of the GCTCA motif, were also enriched, with p-value of 2.0 × 10 -153 and 6.8 × 10 -51 , respectively (Fig. 5A).In contrast, the six-base motif, TGCACA, was not enriched around DIP-seq peaks (p-value = 0.051), suggesting that it is not a binding motif of AP2-FG.Two minor motifs, GATCA and ACTCA, were also enriched around the summits of ChIP-seq peaks of sAP2-FG, with p-value of 2.0 × 10 -53 and 1.0 × 10 -13 , respectively.In the ChIP-seq of sAP2-FG, 84.2% of the peaks contained at least one of these five-base motifs within 300 bp from the summits, and the average distance from the summits was 62.4 bp (Fig. 5B).The most enriched motif matched well with the binding sequence of the AP2 domain of P. falciparum AP2-FG, which was reported by Campbell et al. (15).Collectively, these results suggested that AP2-FG binds directly to these motifs through its AP2 domain.
A ChIP-qPCR assay was performed to confirm that this motif is essential for the binding of sAP2-FG to the upstream of target genes.NEK2 (PBANKA_1240700) is a Nima-related protein kinase essential for parasite transmission to mosquito vectors (16).The NEK2 gene is one of the target genes of sAP2-FG, harboring one five-base motif sequence under the ChIP-seq peak.We prepared parasites with this motif mutated from AP2-FG::GFP PFG(-) parasites (Figs 5C) and performed a qPCR assay with anti-GFP antibodies.Upon addition of the mutation, the AP2-FG associated with the locus decreased to background levels (Fig. 5D).These results demonstrated that AP2-FG binds to the five-base motif through its AP2 domain.

Five-base motif acts as a cis-activating element on the promoter
The function of the five-a base motif as cis-acting element in the female-specific promoter was determined by promoter assays using the centromere plasmid (17).The upstream region of the female-specific gene P28 was used in this assay (Fig. 5E).The region harbors one ten-base motif and three five-base motifs.The mCherry gene on the centromere plasmid was expressed under the control of this promoter region, and mCherry expression was analyzed by FACS (Fig. 5F).
Promoter activity was reduced by introducing mutations to the ten-base motifs, but the signals were still retained.The signals became almost undetectable by introducing mutations into the five-base motifs.These results demonstrate that the five-base motifs are female-specific cis-acting elements and function independently of the ten-base motif in the promoter.

Genome-wide identification of cAP2-FG and -sAP2-FG targets
Our ChIP-seq analyses suggested that the targets of AP2-FG were constituted by targets of cAP2-FG and sAP2-FG.To investigate which genes were commonly or uniquely regulated by these two forms of AP2-FG, we predicted the targets of cAP2-FG and sAP2-FG separately.As targets of cAP2-FG, 810 genes were predicted from ChIP-seq peaks of PFG obtained from PFG::GFP parasites (common peaks of two experiments, the same hereafter) (Fig. 6A and Table 3A).These targets broadly contained genes related to female gametocyte-specific functions, including genes for female gametocyte development, such as genes for egress from the erythrocyte and fertilization, and genes for zygote/ookinete development, such as meiosis-related genes, genes for pellicular and subpellicular structures, ookinete surface or secretory protein genes, and crystalloid protein genes (Table S3A).As target genes for sAP2-FG2, 321 were predicted by ChIP-seq in AP2-FG::GFP PFG(-) parasites (Fig. 6B and Table S3B).Targets of sAP2-FG also contained many genes for female gametocyte-specific functions, but in smaller numbers than those in cAP2-FG targets.
As a result of these target predictions, 967 genes were identified as targets for either cAP2-FG or sAP2-FG, and 164 genes were common between them (Fig. 6C).Based on this classification, the cAP2-FG and sAP2-FG targets had the following features: First, genes known to be involved in gametocyte development are target genes of sAP2-FG, and most of them are also targets of cAP2-FG (thus belonging to common targets).Second, genes for zygote/ookinete development are targets of cAP2-FG, and some are also targets of sAP2-FG.Among these, most meiosis-related genes and crystalloid protein genes are the target genes unique to cAP2-FG.
Target prediction also showed that PFG is a target of sAP2-FG and cAP2-FG.Upstream of PFG harbors the binding sites of three different TFs: AP2-G, sAP2-FG (AP2-FG), and cAP2-FG (PFG) (Fig. 6D).These results suggest that PFG is activated in early females by sAP2-FG and then by cAP2-FG in later female gametocytes.Meanwhile, the zygote TF AP2-Z was predicted to be a target of cAP2-FG only (18).Together with the results obtained in previous studies(8)(18), a cascade of TFs starting from AP2-G could be summarized as in Fig. ６E.

Impact of PFG disruption on the expression of AP2-FG target genes
To examine the independence of transcriptional activation by cAP2-FG from that of sAP2-FG, the impact of PFG disruption on the expression of AP2-FG target genes was assessed separately in the three groups described in Fig 6C, that is, unique for sAP2-FG, unique for PFG, and common for both, using RNA-seq data already presented in Fig. 2G.Among the 279 genes significantly downregulated in PFG(-) parasites, 165 genes were targets for PFG (unique for PFG or common for sAP2-FG and PFG), and only four genes were targets unique to sAP2-FG.The log2 distribution (fold change = PFG(-)/wild type) in the three groups of target genes showed that the average value was significantly higher (i.e., less down-regulated) in targets unique to sAP2-FG than in the other two groups (targets unique to cAP2-FG or common targets for both), with p-values of 1.3 × 10 -10 and 1.4 × 10 -5 , respectively, by two-tailed Student's t-test (Fig. 6F).In addition, the average log2 (fold change) value of the common target genes was relatively higher (i.e., less down-regulated) than that of targets unique to PFG, suggesting that transcriptional activation by sAP2-FG partly complements the impact of PFG disruption on these common target.These results suggest that the trans-activity of cAP2-FG and sAP2-FG is independent of each other.

Discussion
Genes transcribed in female gametocytes are involved in various steps of parasite transmission to the mosquito vector, including sex-specific differentiation in the vertebrate host and zygote/ookinete development that take place in the mosquito vector.In a previous study, we reported that this wide range of gene expression repertoires in females is regulated directly by a female-specific TF AP2-FG that binds to a ten-base cis-acting element upstream of these genes (9).In this study, we report two findings that change our previous view.First, the binding of AP2-FG to the ten-base motif is mediated by another female-specific TF, PFG.Second, there is another cis-acting element in female-specific promoters, and AP2-FG binds to this element directly as a trans-activating factor.These findings suggest that transcriptional regulation of female-specific genes is not as simple as previously thought but is regulated by various promoters composed of different combinations of the two cis-acting elements.Considering the different expression profiles of PFG and AP2-FG, it is assumed that different expression profiles are generated among female-specific genes by the combination of these two cis-acting elements.
The present study suggests that PFG is upregulated in two steps in female gametocytes; first by sAP2-FG, and then by cAP2-FG.Transcriptional activation of PFG by cAP2-FG constitutes a transcriptional positive feedback loop, suggesting that after activation by sAP2-FG, PFG maintains its transcription through this autoactivation mechanism.As the expression of PFG increases via this mechanism, AP2-FG recruited by PFG (cAP2-FG) increases and eventually becomes predominant in the transcriptional regulation of female gametocytes.Fig. 6G illustrates this model.Considering that targetome of cAP2-FG contains many ookinete genes, transcripts for ookinete development would gradually increase in transcriptome of females.In this model it is hypothesized that binding of AP2-FG contributes to the trans-activity of cAP2-FG.This is supported by the observation that expression of many unique targets of PFG decreases in AP2-FG-disrupted parasites (9).However, it remains elusive how AP2-FG contributes to this activity.As discussed below, PFG may have its own trans-activity for the ten-base element.It is necessary to investigate the role of AP2-FG in cAP2-FG as a next step.
In AP2-FG-disrupted parasites, while female gametocytes display immature morphologies, a few develop into retort-form ookinetes.Because PFG is essential for the parasite's capacity to produce ookinetes, this phenotype suggests that PFG retains trans-activity in the absence of AP2-FG and partly promote ookinete development.The observation also supports this assumption that RFP signals in female gametocytes could still be detected in AP2-FG(-) 820 parasites by FACS (9) even though RFP signals were reduced to undetectable level in PFG(-) 820 parasites under the same experimental conditions.This suggests that the promoter of the CCP2 gene, which is a target of PFG only, is still active in AP2-FG(-) 820 parasites.This putative trans-activity of PFG could also explain why the expression of PFG was still observed in AP2-FG-disrupted parasites.It is assumed that PFG continues to activate its transcription through a positive feedback loop once it is induced by AP2-G and complements the absence of AP2-FG.Based on this assumption and the model presented in Fig. 6G, phenotypic differences between PFG-disrupted and AP2-FG-disrupted parasites can be explained as follows.PFG(-) parasites completely lose their ability to develop into ookinetes, because the expression of genes necessary for ookinete development is significantly reduced.However, they manifest mature morphologies in females owing to the induction of the genes necessary for female development by sAP2-FG.In contrast, AP2-FG(-) parasites show immature morphologies in females because the expression of genes involved in female development is downregulated in the early stage.Despite this immature morphology, they retain the ability to fertilize and develop into retort-form ookinetes because PFG is still expressed in AP2-FG(-) and activates genes for zygote/ookinete development.
P. falciparum gametocytes require 9-12 days to mature, which is much longer than that of P. berghei (19).Meanwhile, it has been reported that the ten-base motif is highly enriched in the upstream regions of gametocyte-specific genes also in P. falciparum (20).Thus, despite the difference in maturation periods, PFG is likely to play an important role in the transcriptional regulation of female gametocyte development in P. falciparum.
In conclusion, our results suggest that the two forms of AP2-FG, which correspond to distinct female-specific cis-acting elements, play pivotal roles in female gametocyte development and can generate variations among female-specific promoters.The present findings provide the possibility to predict the expression profile of each gene by analyzing its promoter sequence and further estimating its functions according to this regulation.In the next step, it will be necessary to compare the properties of the number of actual promoter activities in vivo and improve the model presented in this study.Such studies deepen our understanding of the sexual stage, including gametocyte development in erythrocytes, fertilization, meiosis, and development into ookinetes, and provide clues to interrupt parasite transmission from humans to mosquitoes.

Parasite preparations
The ANKA strain of P. berghei was maintained in female BALB/c mice (6-10 weeks of age).To examine the number of oocysts, infected mice were exposed to Anopheles stephensi mosquitoes.
Fully engorged mosquitoes were selected and maintained at 20 °C.The number of oocysts was evaluated 14 days after the infective blood meal.To prepare mature schizonts, infected mouse blood was cultured in the medium for 16 h.Mature schizonts were purified using Nicoprep density gradient.To prepare gametocyte-enriched blood, mice were pretreated with hydrazine and infected with P. berghei by intra-abdominal injection of infected blood.After parasitemia increased to over 1%, they were treated with sulfadiazine for two days in drinking water (20 mg/L) to kill asexual parasites.After checking the exflagellation rate, whole blood was collected.

Preparation for transgenic parasites
Parasite transfection was performed as previously described (9).Briefly, cultured mature schizonts were purified using a density gradient of Nicoprep, transfected with DNA constructs by electroporation, and injected intravenously into mice.These mice were treated with pyrimethamine to select the parasite integrated with the construct by double crossover homologous recombination.
The parasites were cloned by limiting dilution to obtain transgenic parasite clones.To prepare the targeting construct, DNA fragments for recombination were annealed to each side of the DNA fragments containing human DHFR as a selectable marker gene by overlapping PCR.Constructs for parasites expressing the GFP-fused genes were prepared as follows.Briefly, DNA fragments corresponding to the 3´ portion of the gene were subcloned into the plasmid vector containing the GFP gene, the 3´-terminal portion of the P. berghei HSP70 gene, and a selectable marker cassette for expressing the human DHFR gene.The plasmid was separated from the vector backbone by digestion with restriction enzymes NotI and BanHI.
Transgenic parasites with mutated motif sequences were prepared using the CRISPR/Cas9 method, as reported previously (21).Briefly, cultured mature merozoites of CAS9-expressing P. berghei parasites were transfected with a gRNA plasmid vector and a linear DNA template containing mutations in the motif sequences and injected intravenously into mice.These mice were treated with pyrimethamine for two days to deplete parasites in which the locus did not change.
The parasites were cloned by limiting dilution.Mutations in these motifs were confirmed by direct genome sequencing.performed using the TB Green Fast qPCR Mix (Takara Bio) according to the manufacturer's protocol.Three independent ChIP experiments were performed, and the percentage of input DNA (DNA extracted from lysate before ChIP) was compared between P. berghei parasites expressing GFP-fused TF and those with the mutated motifs.The primers used for qPCR are listed in Table S4.

RNA-seq analysis
Gametocyte-enriched blood was passed through a filter and subjected to erythrocyte lysis in 0.84% NH4Cl.According to the manufacturer's protocols, total RNA was extracted from residual cells using Isogen II (Nippon Gene).Libraries were prepared using the RNA Hyperprep kit (KAPA Biosystems) and sequencing was performed using an Illumina NextSeq sequencer.

Analysis of RNA-Seq data
Three independent experiments were performed using PFG-disrupted parasites.Read data were counted using FeatureCount software.RPKM (Reads Per Kilobase of exon per million mapped reads) was calculated for each gene, and only genes with an RPKM maximum > 20 in wild-type parasites were used for the following analyses.Additionally, genes located in the sub telomeric regions were excluded because of their variable expression among the clones.The ratio of read numbers and padj was calculated for each gene between wild-type and gSNF2-disrupted parasites using the DEseq2 software.Volcano plots of the obtained results were created using DDplot2 software.The RNA-seq data were deposited in GEO with the accession number GSE226028.The RNA-seq data deposited in GEO with accession number GSE198588 were used as data for the wild-type parasites.G. Motifs enriched around the summits of peaks identified in the ChIP-seq of AP2-FG using AP2-FG::GFP PFG(-) parasites.Sequence logos were depicted using WebLogo 3.

Figures and Tables
H. Peak image of PFG and the ten-base motif under the summit upstream of a CPW-WPC family protein gene (PBANKA_1346300).The motif was mutated using the CRISPR/Cas9 system, and the mutation was confirmed by Sanger sequencing (the lowest panel; mutated nucleic acid residues are highlighted).The protospacer adjacent motif (PAM) sequence used for targeting is also highlighted by a rectangle.A. Functional classification of the PFG target genes (cAP2-FG) (572 annotated genes in total).In this graph, different subgroups related to female gametocyte-specific functions are collectively shown as a group 'female-specific functions' (see also Table S3A).Hypothetical protein genes were not included.The number of members in each group is shown in the chart.
B. Functional classification of the target genes of sAP2-FG (240 annotated genes in total).Hypothetical protein genes were not included.
C. Venn diagram showing an overlap between the target genes of sAP2-FG and PFG (cAP2-FG).The target genes in 'female-specific functions' are written according to their belongings: unique to sAP2-FG, common, and unique to cAP2-FG.The assignment of these genes into G.A model of transcriptional regulation during female development.Female-specific genes harbor either or both of the female-specific cis-activating elements, 5-base and ten-base elements in the upstream.In early female gametocytes, AP2-FG binds to a five-base cisactivating element via its AP2 domain and activates genes for gametocyte development and some genes for zygote/ookinete development.In the later stage, when PFG is highly expressed, AP2-FG is predominantly recruited to the PFG on the ten-base cis-activating element, and the AP2-FG/PFG complex activates a full repertoire of genes for zygote/ookinete development.

Captions of supplementary tables
Table S1 (separate file).Differential expression analysis between wild-type and PFG(-) parasites.
Table S3 (separate file).Target genes predicted from ChIP-seq peaks.Table S3A.PFG target genes predicted from ChIP-seq peaks.Table S3B.sAP2-FG target genes predicted from ChIP-seq peaks.Table S4.List of primers used in this study.

1 .
Figure 1.PFG is a target gene of AP2-G and is expressed in female gametocytes A. Schematic diagram of PFG from P. berghei and its homologs in apicomplexan parasites.Regions homologous to Regions 1 and 2, which are highly conserved among Plasmodium species, are shown as yellow and blue rectangles, respectively.Nuclear localization signals were predicted using the cNLS mapper (http://nls-10 mapper.iab.keio.ac.jp/cgibin/NLS_Mapper_form.cgi).The gene IDs of P. berghei PFG, P. falciparum PFG, and their homologs in Toxoplasma gondii, Eimeria tenella and Vitrella brassicaformis are PBANKA_0902300, PF3D7_1146800, TGGT1_239670, ETH2_1252400, and Vbra_10234, respectively.B. Alignment of amino acid sequences of Region 1 and Region 2 from P. berghei and P. falciparum PFGs was performed using the ClustalW program (https://www.genome.jp/toolsbin/clustalw).C.The amino acid sequences of Regions 1 and 2 from P. berghei PFG and its homologs from other apicomplexan parasites in (A) were aligned using the ClustalW program in MEGA X.The positions at which all these sequences have identical amino acids are indicated by two

Figure 2 .
Figure 2. Development of female gametocytes was impaired in PFG(-)parasites.A. Proliferation of blood-stage parasites in wild-type and PFG(−) parasites.Parasitemia was calculated by counting infected red blood cells using Giemsa staining of blood smears.B. Giemsa-stained images of mature male and female gametocytes in PFG(-) parasites.C. Number of oocysts in the mosquito midgut 14 days after infective blood meal of wild-type and PFG(-) parasites.D. Ookinete cultures of wild type and PFG(-) parasites.Ratios of zygotes, retort-form, and bananashaped ookinetes to all female-derived cells are shown in black, grey, and white, respectively.E. A cross-fertilization experiment was performed between PFG(-), P48/45(-) (males are infertile), and P47(-) (females are infertile) parasites.The rate of conversion of females to mature ookinetes is shown.F. Flow cytometry analysis of 820 parasites and PFG(-) parasites derived from 820 parasites.G. Volcano plot showing differential expression of genes in PFG(-) compared to wild-type parasites.Red dots represent female-enriched genes, and horizontal and vertical lines indicate p-value of 0.001 and log2(fold change) of -2, respectively.A pie graph on the top right of the plot area shows the number of female-enriched genes among the significantly downregulated genes.

Figure 3 .
Figure 3. PFG is co-localized with AP2-FG A. IGV images from ChIP-seq experiments 1 and 2 of PFG on chromosome 14.Histograms show row read coverage in the ChIP data at each base.The Venn diagram on the right shows the number of peaks common in ChIP-seq experiments 1 and 2. B. Representative peak images of PFG and AP2-FG upstream of genes that were significantly downregulated in PFG(-) parasites.Positions of ten-base motifs are indicated by red bars.ChIP-seq data for AP2-FG were obtained from a previous paper (23).C. The ten-base motif was enriched around the PFG peaks.Sequence logos were constructed using WebLogo 3 (http://weblogo.threeplusone.com/create.cgi).D. Heat maps showing coverage in ChIP-seq of PFG (left) and AP-FG (right) with positioning summits of PFG peaks at the center.E. IGV images of ChIP-seq peaks of AP2-FG that lacked the corresponding PFG peak.Histograms show the row read coverage in the ChIP data at each base.

Figure 4 .
Figure 4. PFG mediates AP2-FG binding to the ten-base motif A. Venn diagram showing the number of common peaks between experiments 1 and 2 in ChIPseq of PFG using PFG::GFP AP2-FG(-) parasites.B. IGV images from the ChIP-seq peaks of PFG on part of chromosome 14.The top panel shows the ChIP peaks using PFG::GFP parasites, and the bottom panel shows those using

I.
ChIP-qPCR analysis of PFG in the upstream region of PBANKA_1346300.Gray and white bars indicate the results using wild-type and motif-mutated parasites, respectively.Three independent biological experiments were performed.The results are shown as a percentage input.Error bars indicate standard error.Experiments with another CPW-WPC family protein gene (PBANKA_1123200) and TRAP were performed as positive and negative controls, respectively.

Figure 5 .
Figure 5. sAP2-FG binds to five-base motifs that function as a cis-acting element independent of the ten-base motif.A. Motifs enriched around peaks identified in DIP-seq of the AP2 domain of AP2-FG.Sequence logos were depicted using WebLogo 3. B. Histogram showing the distance between the peak summit and the nearest five-base motifs in (A) for each ChIP peak of sAP2-FG.C. Peak image of sAP2-FG upstream of NEK2 and the five-base motif under the summit.The motif was mutated, and the sequence was confirmed by Sanger sequencing (lowest panel).