Epigenetics and chromatin structure regulate var2csa expression and the placental binding phenotype in Plasmodium falciparum

Plasmodium falciparum is responsible for what appears to be a never-ending public health issue in the developing world. With repeated infections, a gradual semi-immunity to severe malaria can be acquired but this is disrupted when women become pregnant as the parasite cytoadheres in the placenta to prevent splenic clearance. This change in tissue tropism is due to specific expression of the antigenically variable adhesin VAR2CSA. To better understand the molecular mechanisms activating var2csa and antigenic variation over all, we used a combination of phenotypic and systems biology assays. We first established phenotypically homogenous populations of VAR2CSA expressing and placenta binding parasites that were shown to exclusively transcribe var2csa while all other var genes remained silenced. We also confirmed that the transcriptional activation was strongly associated with distinct depletion of repressive H3K9me3 marks. Further, we used chromatin conformation capture as a high-resolution approach to determine interchromosomal interactions and established that transcriptional activation is linked to a small yet significant repositioning of var2csa relative to heterochromatic telomeric clusters. Lastly, we demonstrated that occupancy of 5-methylcytosine was present in all var genes but independent of transcriptional activation and switching. All together, these findings provide insights at high resolution into the potential role of 5-methylcytosine in P. falciparum and increase our understanding of the mechanisms regulating antigenic variation at the epigenetics and chromatin structure level.


INTRODUCTION
The morbidity and mortality associated with Plasmodium falciparum malaria are negatively correlated to the number of infections contracted (1).As a consequence, young children and other malaria naïve individuals are at greatest risk of developing severe illness and succumb to the disease.With repeated exposure, a gradual immunity to clinical manifestations is acquired and symptomatic episodes in adults are relatively rare in endemic areas (1).This holds true until women become pregnant, when the built-up immunity is disrupted and they again become highly susceptible to develop severe disease (2).Besides affecting the mother, the pregnancy-associated malaria (PAM) also results in adverse outcomes for the fetus, including miscarriages, stillbirths, preterm births and low birth weights, of which the latter two often result in predisposition for morbidities later in life (3)(4)(5).Thus, PAM represents a devastating public health problem that targets the most vulnerable populations in the resource scarce malaria endemic regions.
The onset of PAM is tightly linked to the appearance of the placenta, which presents a new niche for the parasitized red blood cell (pRBC) to sequester and thrive.To cytoadhere in the microvasculature, the parasite employs antigenically variable adhesins from the var gene encoded P. falciparum Erythrocyte Membrane Protein 1 (PfEMP1) family of proteins, a key virulence feature of the parasite that is directly linked to the pathogenesis of severe disease (6,7).While harmful for the human host, the cytoadhesion provides the parasite immune evasive opportunities, such as the shielding of the parasite antigens expressed on pRBC surfaces by other cells and evasion of splenic clearance (8).The repertoire of PfEMP1 is large, with approximately 60 var genes per parasite genome from which only one is assumed to be expressed at a time, a process referred to as mutually exclusive expression.Transcriptional switching between the different var loci creates antigenic variation, which leads to immune evasion and binding of a plethora of human receptors and variable tissue tropism for the parasite (8)(9)(10).In between parasites, the genetic variation among the absolute majority of var genes is vast, with one important exception, the remarkably well-conserved var2csa.This gene encodes the PfEMP1 VAR2CSA that mediates cytoadherence to chondroitin sulfate A (CSA) on syncytiotrophoblasts in the placenta and is considered the main culprit behind PAM (11,12).
The variable expression of PfEMP1 is thought to be accomplished through several layers of gene regulatory activities to ensure that only one, or very few, var genes are translated to functional adhesins and exported to the surface of the pRBC (9,10,(13)(14)(15).The antigenic variation is mainly achieved by transcriptional regulation, where altered DNA accessibility is orchestrated through variations in the nucleosome composition, histone modifications and organization of the chromatin into active and repressive clusters.In addition, involvement of non-coding RNAs and specific transcription factors have been proposed to compose additional layers in the regulation that governs repression, activation and switching of expression (15).Lastly, the peculiar VAR2CSA appears also regulated on a translational level (16)(17)(18).
While major advances have been made recently to better understand gene expression of these important virulence factors, mechanisms regulating these events at the molecular level are complex and many questions remain.An improved understanding of the detailed molecular mechanisms that mediate parasite adhesion is vital if we want to grasp the molecular factors that control malaria pathology and to identify new therapeutic strategies.
Here, using a combination of advanced phenotypic assays and systems biology approaches including transcriptomics, epigenetics and chromatin structure features, we provide a detailed view in high-resolution on how expression of the PAM mediating var2csa is regulated at transcriptional initiation level.Intriguingly, we also show that occupancy of the epigenetic mark 5-methylcytosine (5mC) is significant in antigenically variable genes but appears to be disconnected to transcriptional activation and switching.

Parasites and in vitro culturing
The P. falciparum strains NF54 and NF54CSA-ptefKO (18) were cultivated according to standard methods (19).All parasite cultures were maintained in blood group O RBCs at 4% hematocrit in RPMI-1640 medium (Gibco) containing 27 nM NaHCO3 (Sigma), 2mM Lglutamine (Hyclone) and 2.5 µg/mL gentamicin (Gibco) and supplemented with 10% human A+ serum.The microaerophilic environment was kept constant by gassing of culture flasks with a mixture of 90% N2, 5% O2 and 5% CO2 and cultures were kept in suspension by orbital shaking at 37°C.Parasite cultures were regularly synchronized by treatments with 5% sorbitol as described previously (20).

Scanning electron microscopy
To confirm proper pRBC surface morphology prior to the establishment of cytoadhesive phenotype, scanning electron microscopy was used.Magnetically enriched pRBCs were fixed in 2.5% glutaraldehyde (Polysciences), 1% paraformaldehyde (Santa Cruz) in PBS for 1 h, rotating at 4⁰C.The samples were washed twice with cold PBS.Following the fixation, specimens were adhered onto alcian blue pre-treated 0.45mm Supor® PES membrane filter (Pall Corp.) and washed with 0.1M phosphate buffer pH7.4 followed by MilliQ water.The membranes were then subjected to stepwise ethanol dehydration, transferred to acetone and critical-point-dried using carbon dioxide (Leica EM CPD030).The membranes were finally mounted on specimen stubs using carbon adhesive tabs and sputter coated with a 10nm layer of platinum (Quorum Q150T ES).SEM images were acquired using an Ultra 55 field emission scanning electron microscope (Zeiss) at 3kV and the SE2 detector.

Establishment and preservation of parasite phenotypes
The chondroitin sulphate A (CSA) and placental binding phenotype of NF54CSA was maintained by bi-weekly panning on CSA-coated plastic plates with minor modifications from Brolin et al (13).Briefly, 100 μg/mL CSA from bovine trachea (Sigma) in phosphate-buffered saline (PBS) was coated on plates overnight leaving a fraction of the plate without CSAcoating.Thereafter, to prevent non-specific binding, plates were blocked with 2% bovine serum albumin (BSA, fraction V, HyClone) in PBS for 1 h.Mature trophozoites were purified using a MACS magnetic cell sorter (Miltenyi BioTec) and resuspended in RPMI-1640 with 10% human serum.Parasites were added to CSA-coated plates and incubated for 1 h at 37°C in microaerophilic conditions with occasional gentle swirling.Plates were thereafter washed with RPMI-1640 until background binding was low, as confirmed by comparison with the non-CSA coated area.Thereafter, the bound pRBCs were recovered from the plates and returned to culture.It is well known that parasites selected to bind to CSA (and other cytoadherence receptors) revert back to a non-CSA binding state over time.With the selected panning interval, minor reversion was noted.Prior to any down-stream analyses, phenotypes were nevertheless confirmed by flow cytometry (see below)

Flow cytometric analysis of pRBC surface levels of VAR2CSA
For pRBC surface staining of VAR2CSA, cultures with late-stage pRBCs (24-40 hpi) were blocked in PBS with 2% bovine serum albumin (BSA, fraction V, HyClone) for 30 min at room temperature.Primary antibody against the VAR2CSA (goat anti-DBL1-6) was added to blocked cultures at a concentration of 100 μg/mL in PBS with 2% BSA and incubated for 1 h at room temperature.Non-immune goat IgG (Jackson ImmunoResearch) was used for nonspecific binding control.Cells were washed twice with 90 µL of PBS before incubated with a secondary rabbit anti-goat antibody coupled to Alexa647 (Invitrogen) at 1:100 dilution, 5 μg/mL Dihydroethidium (DHE) (Invitrogen) and 10 μg/mL Hoechst 33342 (Invitrogen) in PBS for 1 h at room temperature.Unbound secondary antibody was washed away by 2 x 90 µL washes with PBS before cells were resuspended in PBS to a final hematocrit of 0.3%.3000 pRBCs per sample were analyzed using a BD FACSVerse (BD Bioscience) and surface positivity was analyzed in FlowJo version 10 by first gating on intact cells on SSC-A and FSC-A, followed by FSC-H and FSC-A for single cells and from there, the Hoechst-DHE double positive pRBC population was selected.This population was analyzed for percentage of Alexa647 positive cells based on a cutoff from the non-immune IgG background binding.The parasite NF54CSA-ptefKO (18) was used as negative control as the knock-out of the gene encoding the Plasmodium Translation Enhancing Factor (PTEF) has previously been shown to be essential for the translation of VAR2CSA.Thus, NF54CSA-ptefKo pRBCs are devoid of VAR2CSA and therefore non-adherent to CSA.

Placental binding
A placental binding assay was adapted from (21).Cryo-sections (8 μm in thickness and approximately 6 mm in diameter) from placentas donated by healthy Swedish pregnant women were prepared in a Microm HM 560 cryotome (ThermoScientific).Sections were mounted onto 10-well microscopy slides (Novakemi AB) and stored at -80 ⁰C.Prior to use, the placental sections were submerged in ice cold PBS.Late-stage pRBCs were enriched using magnetic columns (Miltenyi Biotec) and washed 3 times before resuspended in binding medium (RPMI-1640 (Gibco) with HEPES adjusted to 20 mM, pH 6.8, and 10% human serum at a density of 2.4 x 10 5 pRBCs / μL. 25 μL (0.6 x 10 7 cells) of pRBC suspension was added to each placental section and were incubated at 37°C for 1 h in a humid chamber.Unbound cells were washed away by dipping the slides 3 x 5 min in RPMI-1640 with HEPES.
Preparations were thereafter fixed with 100% methanol for 15 s and stained with 5% Giemsa (Merck) for 15 minutes, washed with distilled water and air-dried.Samples were visualized using an Eclipse 80i (Nikon) microscope at 1000 x magnification and blinded for counting.Samples were imaged using an Infinity 3 color camera (Teledyne Lumenera).Number of bound cells per experimental condition was counted for 30-45 fields containing on average 29 pRBCs per field and relative binding was determined compared to the one of VAR2CSA nonexpressing pRBCs.

RNA purification, library preparation and sequencing
For RNA sequencing, NF54 and NF54CSAh were harvested from three independent selections at 16 ± 2h post invasion (p.i.) of RBCs.Cell pellets were lysed using five volumes of TRIzol Reagent (Life Technologies) before RNA was purified using the NucleoSpin miRNA Mini kit for miRNA and RNA purification (Macherey-Nagel) without size fractionation, according to the instructions of the manufacturer.The high retention of RNA on the columns of the kit allows for sequential purification without any apparent loss of RNA (not shown).
Therefore, all RNA samples were passed over columns in total three times, with the first two elutions being treated with Turbo DNAse (Invitrogen) in solution to ensure a complete lack of contaminating gDNA in the preparations.Strand specific RNA sequencing (RNA-seq) libraries were generated using the KAPA Stranded RNA-Seq kit (Roche) and were amplified for 5 PCR cycles (45 s at 98 °C for initial denaturing and cycles of 15 s at 98 °C, 30 s at 60 °C, 30 s at 68 °C before final extension for 60 s at 68 °C) using the KAPA HiFi HotStart ready mix (KAPA Biosystems) before purification twice over 1X KAPA Pure beads (KAPA Biosystems).Prior to sequencing, quality of libraries was evaluated using Agilent 2100 Bioanalyzer (Agilent) and a High sensitivity DNA kit (Agilent).Libraries were quantified using Collibri Library Quantification kit (Invitrogen) and thereafter sequenced in 75bp paired-end reads on an Illumina NextSeq550 instrument.PhiX library (Illumina) was used for sequencing control and to diversify libraries.

Chromatin immunoprecipitation (ChIP) and ChIP-seq library preparation
For the Chip-seq experiment, we harvested three Biological replicates of NF54 and NF54CSAh (16 ± 2h p.i) in parallel with the samples collected for RNA-seq.Parasites were freed from RBCs using 0.1% saponin (from quillaja bark, Sigma) and washed thrice with icecold PBS.Parasites were cross-linked in 1% methanol-free formaldehyde (Polysciences Inc) at 37°C for 10 min while kept in suspension on an orbital shaker (50 rpm).The cross-linking was thereafter quenched using 125 mM glycine (Sigma) for 5 min at 37°C before cell pellets were washed two times with PBS at 4°C.Nuclei were extracted using a nuclear extraction buffer (10 mM HEPES, 10 mM KCl, 0.1 mM EDTA, 0.1 mM EGTA, 1 mM DTT, 0.5 mM AEBSF and 1x Halt protease inhibitor cocktail from ThermoFischer) and incubated for 30 min on ice before addition of Nonidet P 40 substitute (Roche) to a final concentration of 0.5%.Samples were homogenized by passage through 23G needles three times and 25G needles seven times.Parasite nuclei were thereafter collected and resuspended in shearing buffer (1% SDS, 10 mM EDTA, 50 mM Tris HCl pH 8.1 and 1 x Halt protease inhibitor cocktail, ThermoFisher).
Chromatin was fragmented using the Covaris ultra sonicator (ME220) for 5 min with the following settings: 25% duty cycle, 75W intensity peak incident power, 1000 cycles per burst.
ChIP-seq libraries were prepared using the KAPA LTP library preparation kit (KAPA Biosystems).Immunoprecipitated and input libraries were amplified for 16 and 5 PCR cycles respectively (45 s at 98 °C for initial denaturing and cycles of 15 s at 98 °C, 30 s at 60 °C, 30 s at 68 °C before final extension for 60 s at 68 °C) using the KAPA HiFi HotStart ready mix (KAPA Biosystems) before purification twice over 1X KAPA Pure beads (KAPA Biosystems).
Prior to sequencing, quality of libraries was evaluated using Agilent 2100 Bioanalyzer (Agilent) and a High sensitivity DNA kit (Agilent).Libraries were quantified using Colibri Library Quantification kit (Invitrogen) and thereafter sequenced in 75bp paired-end reads on an Illumina NextSeq550 instrument.PhiX library (Illumina) was used for sequencing control and to diversify libraries.

Methylated DNA immunoprecipitation (MeDIP) and MeDIP-seq library preparation
Biological triplicates of NF54 and NF54CSAh (16 ± 2h p.i) were harvested for MeDIPseq in parallel with the samples collected for RNA-seq and ChIP-seq.Parasites were initially released from host RBCs by 0.1% saponin (from quillaja bark, Sigma) and washed thrice with ice-cold PBS.Thereafter genomic DNA was extracted using QIAamp DNA Blood Kit (Qiagen) following the instructions of the manufacturer and including the optional RNase A treatment.
Extracted DNA was eluted in NE buffer (Macherey-Nagel) and quantified with dsDNA HS assay kit (Invitrogen) on a Qubit 3 fluorometer (Invitrogen).gDNA was fragmented in TE buffer using the Covaris ultra sonicator (ME220) for 130 s with the following settings: 20% duty cycle, 70W intensity peak incident power, 1000 cycles per burst, resulting in average fragment size of approximately 300 bp.Thereafter gDNA was precipitated and concentrated by addition of 0.1 volumes of sodium acetate (Alfa Aesar) and 3 volumes of 100% ethanol.gDNA was quantified with dsDNA HS assay kit (Invitrogen) for Qubit 3 fluorometer (Invitrogen) and 100 / 30 ng (immunoprecipitation/input) used for library preparation using KAPA LTP library preparation kit (KAPA Biosystems).Library preparation was stopped prior to PCR amplification.The quality of libraries was assessed with Agilent 2100 Bioanalyzer (Agilent) and High sensitivity DNA kit (Agilent) and quantity with the Collibri Library Quantification kit (Invitrogen).Thereafter equal amounts of libraries from NF54 and NF54CSAh were pooled for the same bio-replicate for the same antibody.Libraries were denatured at 95 °C for 10 min and then incubated on ice for 10 min.The Methylated DNA Immunoprecipitation (MeDIP) Kit (Active Motif) was used according to the instructions by the manufacturer with some modifications.Three μg of polyclonal rabbit anti-5mC (#61255, Active Motif) and rabbit IgG (#103524, Active Motif) were added to the denatured libraries.For the elucidation of potential presence of 5hmC and 6mA in NF54CSAh only, three μg of polyclonal rabbit anti-5mhC (#39069, Active Motif) and polyclonal rabbit anti-6mA (#202003, Synaptic Systems) was used respectively.Immune complexes were collected using 20 μL of protein G magnetic beads (Active Motif).After elution from the beads, samples were purified using 1X KAPA Pure beads (KAPA Biosystems) and PCR amplified using the KAPA HiFi HotStart ready mix (KAPA Biosystems) for 8 PCR cycles (45 s 98°C of initial denaturing and amplification at 15 s at 98°C, 30 s at 60°C, 30 s at 68°C and final extension 60 s at 68 °C).Libraries were analyzed for quality using an Agilent 2100 Bioanalyzer (Agilent) and the High sensitivity DNA kit (Agilent), quantified using Collibri Library Quantification kit (Invitrogen).andthereafter sequenced in 75bp paired-end reads on an Illumina NextSeq550 instrument.PhiX library (Illumina) was used for sequencing control and to diversify libraries.
MeDIP-seq data processing was performed the same as above, using the same versions of FastQC, Trimmomatic, Picardtools, Samtools, Bedtools, and MACS3.Counts per million normalization was performed prior to subtracting IgG and input reads from 5mC, 5hmC and 6mA for genome-wide visualization at 10 bp resolution.MACS3 peak calling was performed using the default narrow peak calling for 5mC, 5hmC and 6mA, but minimum fold coverage for model building was reduced to 2. Default DiffBind parameters were used for differential peak calling between NF54CSAh and NF54 for 5mC, and between NF54CSAh 5mC, 5hmC, and 6mA.

Chromosome conformation capture sequencing (Hi-C) library preparation
Three biological replicates of NF54 and NF54CSAh (at 20±2 h.pi.) were fixed with 1.25 % methanol-free formaldehyde (Polysciences Inc) for 25 min at 37°C and quenched with 150 mM glycine for 15 min at 37°C.Thereafter samples were incubated at 4°C on a rocking platform for 15 min.Parasites were collected and washed with ice-cold PBS.The supernatant was removed and parasite pellets stored at -80°C.Crosslinked parasite pellets were resuspended in lysis buffer (10 mM Tris-HCl, pH 8.0, 10 mM NaCl, 2 mM AEBSF, 0.10% Igepal CA-360 (v/v), and 1X protease inhibitor cocktail) and incubated for 30 min on ice.
Samples were homogenized by a 26.5-gauge needle, then washed in lysis buffer.Pellets were resuspended in 0.5% SDS and incubated at 62°C to solubilize the chromatin.DNA was digested overnight with 100 units MboI (NEB) restriction enzyme then the ends filled using dTTP, dGTP, dATP, biotinylated dCTP, and 25 units DNA polymerase I. Blunt ends were then ligated using 4000 units T4 DNA ligase and chromatin de-crosslinked using decrosslinking buffer (50 mM Tris-HCl, pH 8.0, 1% SDS, and 500 mM NaCl) and RNase A. Ligated DNA was purified and sheared to a length of ~300-500 bp using the Covaris S220 sonicator (settings: 10% duty factor, 140 peak incident power, and 200 cycles per burst for 65 seconds) Fragments were pulled down using streptavidin T1 beads (Invitrogen) then library prepped by end-repair, A-tailing, and adapter ligation all in lo-bind tubes.Libraries were PCR amplified with NEB multiplex oligos (45 sec at 98°C, 12 cycles of 15 sec at 98°C, 30 sec at 55°C, 30 sec at 62°C and a final extension step of 5 min at 62°C) and sequenced using the NOVASeq platform (Illumina).

Hi-C data processing, differential interaction analysis and 3D modeling
Paired-end Hi-C libraries were processed (mapping, read pairing, quality filtering, binning and normalizing) using HiC-Pro (v3.1.0)(30) with minimum mapping quality of 30 at 10 kb resolution when aligning to the P. falciparum 3D7 genome (PlasmoDB v58).Interaction heatmaps were generated using the ICED-normalized interaction matrices after an additional per-million read count normalization to allow for direct comparison between NF54CSAh and NF54.Correlation among biological replicates was evaluated with HiCRep (31), then replicates were merged to generate a single representative sample.To enhance visualization, all intra-bin contacts and contacts within a two-bin distance were set to the 90 th percentile, reducing the range of the data needing to be color mapped.Regions of differential interactions between NF54CSAh and NF54 were identified using Selfish (32) for all intrachromosomal and interchromosomal matrices.Coordinate matrices were generated by PASTIS (33) from raw read count matrices and then visualized as 3D chromatin models in ChimeraX (34) while highlighting bins containing var genes, telomeres and centromeres.

RESULTS
Repeated in vitro selection on CSA results in a highly homogenous population of VAR2CSA expressing and placenta binding parasites P. falciparum pRBCs have different receptor preferences for cytoadherence to diverse host cells.This has been previously established in vitro by panning on antibody-coated beads or either cell-bound or soluble human receptors (13,35,36).While several parasite molecules have been identified as ligands for cytoadherence, the antigenically variant PfEMP1 proteins, encoded by the var genes, have been demonstrated as the genuine adherence molecules.In PAM, VAR2CSA encoded by a subfamily of the var genes, plays a vital role for the cytoadherence of pRBCs to the CSA expressed on the surface of placental syncytiotrophoblasts.To better understand the molecular mechanisms regulating var2csa expression, we first employed an approach that entailed several rounds of repeated panning of NF54 pRBCs on chondroitin sulfate A (CSA) coated plates (13) to select for parasites with homogenous expression of VAR2CSA and a placenta binding phenotype.Prior to the panning procedure, we confirmed by scanning electron microscopy (SEM) the presence of knobstructures, which are protrusions where PfEMP1s are surface exposed on pRBCs in patient isolates but are easily lost upon long-term in vitro cultivation (Figure 1A).Several phenotypic analyses were used to confirm the accuracy of our selection process.We analyzed pRBC surface expression of VAR2CSA by flow cytometry-based antibody recognition throughout the trajectory of repeated panning and revealed a gradual transition from a phenotypically heterogeneous to homogeneous parasite population (Figure 1B).In addition, we interrogated the cytoadhesive potential using placental sections from malaria non-immune donors (21).For the repeatedly panned parasites, substantial numbers of pRBCs were observed binding to the placental syncytiotrophoblasts (Figure 1C).We were able to quantify bound pRBCs per mm2 using a VAR2CSA surface negative PTEF knockout parasite (18) as control for background binding and revealed an approximate 6-fold higher binding of the CSA-selected parasite line (Figure 1D).Thus, the repeatedly panned NF54CSA (from here on referred to as NF54CSAh) was deemed phenotypically highly homogenous with a VAR2CSA/PAM relevant cytoadherence profile and considered suitable for down-stream analyses of var2csa expression and regulation.

The P. falciparum placental binding phenotype is strictly linked to expression of var2csa
Next, we sought to decipher any transcriptomic differences responsible for the phenotypic disparity observed between the unselected NF54 and NF54CSAh.As PfEMP1 encoding var genes are actively transcribed during the first third of the parasite's 48h life cycle (also referred to as the ring-stage) (37,38), we harvested NF54 and NF54CSAh from three independent selections at 16 ± 2h post invasion (p.i.) of RBCs for RNA-seq.We also collected parallel samples for ChIP-seq, MeDIP-seq and Hi-C.
The RNA-seq analysis revealed differential expression in only 64 of the 5285 protein coding genes of the parasite (Supplementary Data 1).Hierarchical clustering and principal component analysis performed on the complete set of data revealed little deviation ( > 0.99) between both replicates and samples when accounting for genome-wide transcription (Figure S1A, B).We however observed a very strong correlation between replicates of NF54 ( > 0.97) and between replicates of NF54CSAh ( > 0.63) when only taking var genes into account (Figure 1E), confirming a high degree of phenotypic maintenance among collected samples.
In line with this, we observed a near exclusive representation of var genes among differentially expressed genes (Figure 1F), with var2csa being the only upregulated and almost all other of the 60 var genes downregulated in NF54CSAh compared to NF54 (Figure 1F, G).Besides the confirmation of var2csa expression being closely associated to the placental binding phenotype of malaria parasites, the high degree of phenotypic homogeneity of NF54CSAh and the extensive heterogeneity of NF54 provided an excellent opportunity for down-stream analyses of gene regulation in antigenic variation via epigenetic mechanisms and chromatin structure.

Transcriptional activity of var2csa is associated to a near complete lack of H3K9me3 occupancy
Besides its known role in repressing repetitive elements and gene depleted regions in eukaryotes, the histone H3 lysine 9 trimethylation mark (H3K9me3) has been demonstrated to be an important player in heterochromatin formation, to silence lineage-inappropriate genes and control cell fate.This appears particularly true for Plasmodium spp., where H3K9me3marks have been identified as potentially critical in regulating parasite specific genes involved in pathogenicity and sexual commitment (39,40).We therefore utilized chromatin ChIP-seq as previously described (41) to assess the distribution of H3K9me3 and examine whether NF54CSAh displays differentially bound heterochromatin/euchromatin marks due to elevated expression of var2csa.We performed ChIP-seq experiments in triplicates using NF54CSAh and NF54 samples collected at the same time as the samples used for the RNA-seq experiments described above.ChIP-seq libraries were sequenced, processed and mapped to the genome.Following input normalization, we observed a clear targeted enrichment of H3K9me3 within both subtelomeric and internal var gene clusters as previously described (42) (Figure 2A).H3K9me3 occupancy within the gene body of var2csa of NF54CSAh was almost entirely eliminated, differentiating it from the NF54 control which shows little variation from other var genes, which goes in line with its highly heterogenous var phenotype and consequently small contribution to H3K9me3 for individual genes on a population level (Figure 2A, B).Due to highly homologous sequences preventing unique mapping, several var genes display no H3K9me3 coverage (Figure 2B).This observation was confirmed by performing MACS3 broad peak calling of input normalized H3K9me3 enrichment with results showing an average of 2234 significant consensus peaks (q < 0.05) per sample and approximately 79% of peaks within the coding sequence (CDS) of protein coding genes (Supplementary Data 2).Differential binding analysis via DiffBind showed strong correlation (r > 0.75) between replicates and samples, with only 4 differentially bound sites (FDR < 0.05) (Figure S2C, D and Figure S3), all within the CDS or just upstream of the transcription start site (TSS) of var2csa (Supplementary Data 2).Given that var2csa is significantly downregulated in NF54 these results demonstrate the correlation between loss of var2csa expression and H3K9me3 mediated transcriptional silencing.
P. falciparum variant antigen gene families such as var, rif, stevor and pfmc-2tm cluster within subtelomeric regions of most chromosomes and enrichment of H3K9me3 within the gene body is similarly linked to their repression (43)(44)(45).NF54CSAh and NF54 both display this same pattern of elevated H3K9me3 within the exons of these multicopy gene families, and after normalizing each region (5' UTR, exons, introns and intergenic) the mean H3K9me3 within each is higher than genes outside these families as well as all intergenic regions (Figure 2C, D).

Increased interchromosomal interaction and perinuclear repositioning is involved in silencing of var genes and activation of var2csa respectively
To further investigate the effect chromatin organization and accessibility has on transcription of highly variant multi-copy gene families such as var and rif, we performed Hi-C experiments on tightly synchronized trophozoites to ensure peak expression of the target genes.Hi-C libraries of three biological replicates for each sample (NF54 and NF54CSAh) were prepared as previously described (41,46) and sequenced to a mean depth of ~155 million reads per replicate.The libraries were processed (aligning, pairing, mapping, and quality filtering) using HiC-Pro (30) and resulted in a mean of ~44 million valid interaction pairs per replicate.Due to the P. falciparum genome size and frequency of MboI cut sites, we elected to bin our reads at a 10-kb resolution to identify intrachromosomal and interchromosomal interactions.
A high stratum adjusted correlation coefficient (SCC » 0.87 to 0.94) suggests that chromatin structures were highly similar between replicates and samples (Figure S4A).We therefore combined the biological replicates for downstream analyses.Due to variation in sequencing depth between the merged NF54 and NF54CSAh samples, random sampling was performed on NF54CSAh to obtain ~100 million consensus reads for comparative and differential analyses.Heatmaps generated from the ICED normalized matrices show patterns similar to previous studies (47,41,48), with a negative log-linear relationship between contact probability and genomic distance demonstrating that our experiment worked as expected (Figure S4B).Most intrachromosomal interactions occur at a distance less than 10% the total length of each chromosome; with heterochromatin clustering occurring in telomeric regions and internal var gene clusters at a higher frequency than the other distant regions (Figure 3A-C, Figure S5 and Figure S6).
To evaluate the correlation between gene expression and chromatin architecture, we used Selfish (32) to identify differential intrachromosomal and interchromosomal interactions.
Although there were slight variations in the number of intrachromosomal interactions across most regions, the consistent pattern that emerged was the increased number of interactions between subtelomeric regions on most chromosomes (p < 0.05, log2FC > 1) in NF54CSAh over NF54, indicating tighter heterochromatin control of var gene regions (Figure 3A-C and Figure S7).The subtelomeric regions on chromosomes 2, 3, 4, and 10 showed the largest increase in interaction frequency and all of the var genes within those regions were significantly down-regulated in the RNA-seq data generated in this study.Most interesting is perhaps the mix of increased and decreased interactions between subtelomeric regions on chromosome 12, due to the proximity of var2csa to other nearby var genes within that region confirming a potential role of the chromatin 3D structure in gene regulation.
Additional analysis of these small-scale structural changes was performed by 3D chromatin modeling using the Poisson-based algorithm, PASTIS (33).Modeling shows colocalization of centromeres and telomeric clustering in distinct regions within the nucleus (Figure 3D, E).The overall 3D structure of the genome was similar between NF54 and NF54CSAh.Using the coordinates output from PASTIS we computed the distance between various bins/regions and found that the mean distance between telomeres containing var genes decreased by 6% in NF54CSAh, indicating overall telomeric compaction.What is more significant is that although there is a 52% increase in distance of var2csa to the nearest bin in spatial proximity and 19% greater distance from the telomeric cluster, the var gene nearest to the end of chromosome 12 only shows a 3% increase in distance from the telomeric cluster.This stark difference in distances for two regions only separated by 20 kb shows that small difference in exon coverage was detected between NF54CSAh and NF54 in var2csa or any other var gene (Figure 4A), we investigated the difference in distribution of 5mC marks within high versus lowly expressed genes.We detected a negative correlation for 5mC marks in the promoter of highly expressed genes, whereas 5mC across the rest of the gene body displays the opposite correlation (Figure 4B).Functional differentiation of methylated cytosines within these two distinct regions has already been observed in P. falciparum and in other eukaryotic organisms (49,52,54) but not in the context of antigenic variation.Over the years, genome wide DNA methylation studies have confirmed this "DNA methylation paradox" and demonstrated that methylation in exons is found in constitutively expressed genes and has an important role in maintaining transcriptional fidelity by preventing spurious transcription initiation and can affect histone modification and alternative splicing (64).We therefore investigated the DNA methylation pattern across the exon/intron junction (Figure 4C).
Altogether these results confirm the presence of 5mC in the parasite genome with a higher level within gene bodies and in genes that are often marked by H3K9me3.While additional experiments will be needed to further validate the role of DNA methylation in P. falciparum, this epigenetic mark does not seem to directly control transcriptional activation and switching of var2csa expression, but is likely essential in maintaining transcriptional elongation and fidelity, guiding histone modification enzymes and controlling splicing.

DISCUSSION
The complex and highly variable repertoire of var genes constitutes an essential tool for P. falciparum pRBCs to express different PfEMP1 proteins in order to cytoadhere to vascular endothelium and avoid the host immune system.The ability to cytoadhere and thereby withdraw from the circulation is crucial for the highly rearranged, rigid and immunogenic pRBCs to avoid splenic clearance and for the intracellular parasites to survive (65,66).Although the prevention of splenic clearance helps the parasite to escape the immune defense of the host, the exposure of the cytoadhesive proteins presents another issue, namely the immune recognition of the adhesins themselves.Therefore, varied expression of these surface proteins through antigenic variation is a necessity, both for the sake of immune escape but also for altered receptor preference and tissue tropism.
Excessive cytoadherence of P. falciparum in the microvasculature of various organs has been strongly associated with malaria disease severity and death.Perhaps the most striking example is the VAR2CSA-mediated sequestration of pRBCs to syncytiotrophoblasts in the placental intervillous space that render previously semi-immune women at risk of severe malaria while pregnant (12).The causative var2csa gene is unique from other var genes in that it is relatively conserved between parasite genomes and is regulated at both a transcriptional and translational level.Parasites cultured continuously in vitro often modify their patterns of var gene expression by transcribing var2csa without translating the mRNA, as this additional layer of regulation presents the parasites with energy saving and competitive advantages in the absence of the need to cytoadhere for its survival (16,17).Similarly, the complex regulation of the gene allows for the parasites to not express VAR2CSA until infecting a pregnant woman and being presented with the new placental niche.
In the work presented here, we were adamant about analyzing the regulation of var2csa in parasites with an appropriate in vivo phenotype instead of parasites with in vitro artifactual and non-productive transcription of the gene.Therefore, several downstream analyses were performed post phenotypic selection, where we were able to confirm the placental binding parasites with abundant surface expression of VAR2CSA.In concordance with the adhesive phenotype observed, the selected parasite NF54CSAh showed exclusive and abundant transcription of var2csa, which was in stark contrast to the original line which displayed transcript evidence from a vast number of var genes.These parasite lines therefore presented an excellent opportunity to scrutinize the molecular requirements for var2csa selection and activation on a population level, similar to what would occur for isogenic parasites in the in vivo context of PAM.
Several studies have previously and conclusively linked gene activation in P. falciparum to alterations in DNA accessibility through the change from heterochromatic to euchromatic state (40)(41)(42)(43)(44).In line with our transcriptomic findings, which revealed very limited and var exclusive differences, we observed a remarkable similarity in global H3K9me3 occupancy between the two analyzed parasite lines.We found H3K9me3 primarily in antigenic variable gene contexts with over-representation in var, rif and stevor, which highlights the high degree of heterochromatic gene silencing required for these genes.Differential binding analysis revealed only four significant differences between the analyzed lines all centered within or around var2csa, with a near absence in H3K9me3 occupancy for NF54CSAh.This clearly confirms the pivotal role of epigenetic regulation through histone modifications in the establishment of the placental binding phenotype in P. falciparum.
Besides histone modifications, other molecular events are known to be involved in the transition of a gene from a heterochromatic to euchromatic state in eukaryotes.One such factor is the macromolecular events that involve physical repositioning of genes within the nuclear context to allow for their activation or repression (67).In P. falciparum, nuclear repositioning has previously been suggested to be involved in regulating the activation of the gametocyte-specific transcription factor locus, pfap2-g, in the transition from the asexual to the early sexual stages in the parasite life cycle progression (41).Nuclear repositioning has also been suggested in antigenic variation (68)(69)(70)(71) but its importance has remained somewhat unclear.This is partially due to how well the nuclear context of the parasite is resolved upon interrogation.Previous studies have most often relied on fluorescent in situ hybridizations (FISH) to determine the proximity of genes of interest to a single marker of heterochromatin, such as a subtelomeric repetitive DNA sequence.Considering the small size of the P. falciparum nucleus, the naturally close proximity of many var genes to these subtelomeric repeats and the limited resolution of fluorescent probe approaches, this is a non-trivial task even in the case of robustly selected parasites with discrete var phenotypes.This indeed applies to the subtelomerically located var2csa for which FISH based endeavors have been inconclusive (13,68).Our current Hi-C experiments, performed on parasites with vastly different var phenotypes proved able to provide the resolution needed to resolve this matter.
We observed a significant decrease in differential interchromosomal interactions for NF54CSAh in the subtelomere of chromosome 12 where var2csa is located, whereas interactions around other silent var genes were increased.Upon 3D structural modeling it became evident that the genome of NF54CSAh displays an overall compaction of telomere ends and var gene-containing intrachromosomal regions, which is in line with the observed heterochromatic silencing of var genes.On the contrary, var2csa was distanced from the nearest telomeric cluster to a transcription permissive area of the nucleus.This conclusively suggests that, besides alteration of epigenetic marks, nuclear repositioning and small and localized changes in the chromatin structure are indeed required for activation of var2csa transcription.
Lastly, we investigated the differences in DNA methylation between NF54CSAh and NF54.DNA methylation exists in most eukaryotic organisms, including plants, and is involved in various physiological processes and has a complex correlation with gene expression.While the role of promoter hypermethylation in repressing gene transcription has been well documented, emerging evidence indicates that gene body methylation can promote transcription in up to 60% of genes in some species.While the presence of low levels of 5mC and 5hmC marks has been identified throughout the P. falciparum (54,55) and P. vivax (57) genomes, the function and the molecular mechanism underlying its regulation has remained obscure.However, recent studies where perturbations of the methylation process by either small molecules (56,57) or reverse genetics (58) suggest essentiality and a role in gene regulation.Here, using MeDIP-seq, we show that DNA methylation is mostly absent from the promoter regions of most var genes but is high throughout the CDS.Although there is no clear indication that 5mC regulates the transcriptional activation of var2csa, mechanistically, it may be a required modification to guide and stabilize RNA polymerases as well as histone modifying enzymes, and ultimately control transcription at the initiation or elongation level.
While the presence and function of gene body methylation in some species remains controversial, a growing body of evidence suggests that its effects, while minor, may be shaped by natural selection (72) and correlated with fitness (73).A high level of DNA methylation in genes known to be involved in antigenic variation in malaria parasites may be crucial for parasite adaptation and survival in vivo and deserve additional investigation.
Collectively our data confirm a strong association between epigenetics, genome organization and gene expression in regulating parasite transcripts involved in malaria pathogenicity but also what appears as a fundamentally different role for DNA methylation through 5mC.

Figure 1 .
Figure 1.Generation of parasites with a homogenous placental binding phenotype and consequential dominant expression of var2csa.(A) The presence of in vivo relevant knobstructures on pRBCs was confirmed by SEM prior to phenotypic selection.Shown is a representative image of a pRBC with knobs appearing as white protrusions.(B) Flow cytometry-based antibody recognition of VAR2CSA surface expression on repeatedly selected pRBCs.NF54 ptefKO served as negative control, NF54 was the original and

Figure 1 G
Figure 1

Figure 2 .
Figure 2. The var2csa locus is devoid of repressive H3K9me3 in placental binding parasites.(A) Chromosomal distribution of repressive H3K9me3 as determined by ChIP-seq from merged biological triplicates of NF54 and NF54CSAh.Regional clusters of antigenically variable genes are noted with green below each plot.The only noticeable difference between the parasite lines was located to the left arm of chromosome 12, which contains the var2csa locus (dashed box).(B) Heatmaps of H3K9me3 occupancy in NF54 and NF54CSAh for all var genes (0.5 kb upstream of the transcription start site (TSS) and 1.5kb into the gene body) Figure 2 A