The enteric pathogen Cryptosporidium parvum exports proteins into the cytoplasm of the infected host cell

The parasite Cryptosporidium is responsible for diarrheal disease in young children, causing death, malnutrition, and growth delay. Cryptosporidium invades enterocytes, where it develops in a unique intracellular niche. Infected cells exhibit profound changes in morphology, physiology, and transcriptional activity. How the parasite effects these changes is poorly understood. We explored the localization of highly polymorphic proteins and found members of the C. parvum MEDLE protein family to be translocated into the cytoplasm of infected cells. All intracellular life stages engage in this export, which occurs after completion of invasion. Mutational studies defined an N-terminal host-targeting motif and demonstrated proteolytic processing at a specific leucine residue. Direct expression of MEDLE2 in mammalian cells triggered an ER stress response that was also observed during infection. Taken together, our studies reveal the presence of a Cryptosporidium secretion system capable of delivering pathogenesis factors into the infected enterocyte.


INTRODUCTION
The apicomplexan parasite Cryptosporidium is a leading cause of diarrheal disease worldwide. Young children are highly susceptible to infection, and cryptosporidiosis is an important contributor to child mortality (Khalil et al., 2018; Kotloff et al., 2013). Children in resource-poor settings carry a disproportionate burden of severe disease (Choy and Huston, 2020). Malnutrition enhances the risk of severe cryptosporidiosis, and at the same time, the disease impacts the nutritional state of children, which can lead to impaired growth (Costa et al., 2011; Mondal et al., 2009). Infection with the parasite results in protective immunity, but this immunity is not sterile and may require multiple exposures to develop (Chappell et al., 1999; Okhuysen et al., 1998). Most human disease is due to infection with C. hominis, which only infects humans, and C. parvum, which can be zoonotically transmitted (Nader et al., 2019). The emergence of Cryptosporidium species is driven by host adaptation resulting in specialization and narrowing host specificity; however, the sexual lifecycle of the parasite allows for recombination and can lead to rapid convergent evolution of host specificity (Guo et al., 2015; Nader et al., 2019).

Cryptosporidium infects the epithelium of the small intestine, where it lives in a unique intracellular but extracytoplasmic niche (Elliott and Clark, 2000). The mechanism by which this niche is established during invasion is still debated but involves the rearrangement of the host actin cytoskeleton, the formation of tight junction-like structures between host and parasite membranes, and a dense band of unknown composition at the host-parasite interface (Bonnin et al., 1999; Elliott and Clark, 2000). Cryptosporidium has severely reduced metabolic capabilities and relies heavily upon the host cell for nutrients and metabolites (Abrahamsen et al., 2004; Xu et al., 2004).
A number of specialized uptake mechanisms have been proposed to fill this need, many of which are believed to be localized to the so-called feeder organelle at the host-parasite interface (Perkins et al., 1999). In summary, Cryptosporidium remodels the host cell in significant ways that include its cytoskeleton (Bonnin et al., 1999; Elliott and Clark, 2000), cellular physiology and metabolism (Argenzio et al., 1990; Kumar et al., 2018), as well as aspects of immune restriction and regulation (Laurent and Lacroix-Lamandé, 2017).

Many bacterial, protozoan and fungal pathogens use translocated effectors to manipulate their hosts to secure nutrients and to block host immunity. In Plasmodium falciparum, exported effectors form adhesive structures on the surface of red blood cells to alter tissue distribution and mechanical properties to prevent clearance (Crabb et al., 1997;

First, we tagged MEDLE2 with beta-lactamase, which has been used to reveal effector export in bacteria and protozoa (Charpentier and Oswald, 2004; Lodoen et al., 2010). These parasites also expressed a cytoplasmic red fluorescent protein (Figure 4-figure supplement 1B). Cells infected with MEDLE2-BLA sporozoites were incubated with CCF4-AM, a cell-permeable substrate of beta-lactamase, and imaged by live microscopy. Infected and uninfected cells accumulated CCF4-AM (green, Figure 4C); however, we did not detect substrate cleavage resulting in blue fluorescence (Figure 4C). This could be due to a lack of MEDLE2-BLA expression or export. To visualize localization of the MEDLE2-BLA fusion protein during infection, we performed IFA on MEDLE2-BLA infected cells using a BLA antibody and observed that MEDLE2-BLA (green) was expressed but remained with the parasite (red, Figure 4-figure supplement 1C). Next, we generated a MEDLE2 fusion with Cre recombinase and used these

but not red fluorescence (2.51%).
Transfection of host cells with a Cre recombinase expression plasmid resulted in a pronounced shift to red fluorescence (+control, 32%, Figure 4D). Cells infected with either WT parasites or MEDLE2-Cre parasites remained green, despite robust infection (Figure 4-figure supplement 2). In summary, we tested three reporters, none of which resulted in detectable export to the host cell. We note that multiple algorithms predict MEDLE2 to be a highly disordered protein (low complexity regions are indicated in light blue in Figure 5B) and conclude that translocation is blocked when folded reporters are fused to the protein.

MEDLE2 export depends upon N-terminal sequence features
We then sought to determine whether MEDLE2 contains sequence-specific information for host targeting. Using previously established host-targeting motifs from P. falciparum and T. gondii as models (Coffey et al., 2015; Marti et al., 2004), we searched the MEDLE2 amino acid sequence to identify candidate export motifs. Preference was given to regions with a basic amino acid, followed by one or two random amino acids, and a leucine residue (Pellé et al., 2015). While Plasmodium host-targeting motifs are typically found in close proximity to the signal peptide, T. gondii exhibits less rigid distance requirements (Coffey et al., 2015). We identified four motifs for mutational analysis, three sites in proximity to the N-terminus and one C-terminal candidate (Figure 5A). As folded reporters are not tolerated, we engineered parasite lines to express an ectopic copy of MEDLE2 marked by an HA tag. A cassette driven by the MEDLE2 promoter was inserted into the locus of the dispensable thymidine kinase gene (TK, Figure 5B), and the expression level and export of the ectopic WT protein were indistinguishable from those of protein tagged within the native locus.

Removal of the sequence encoding the N-terminal signal peptide (ΔSP) prevented MEDLE2-HA export, and the resulting protein accumulated within the parasite (Figure 5C). Next, we constructed a series of parasite strains in which each of the candidate motifs was replaced by a matching number of alanine residues (all mutants were confirmed by PCR mapping and Sanger sequencing, Figure 5-figure supplement 1). Mutagenesis of three of these candidate motifs had no impact on MEDLE2 translocation to the host cell (Figure 5C; Figure 5-figure supplement 1). In contrast, when the most N-terminal sequence KDVSLI was changed to six alanines, HA staining accumulated in the parasite and host cell staining was lost (Figure 5C).
We conclude that in this mutant, MEDLE2-HA is made but export is ablated. Next, we constructed six additional strains using the same strategy to change each amino acid position of the KDVSLI motif to alanine one residue at a time (Figure 5-figure supplement 1). Mutation of leucine 35 to alanine (L35A) ablated export, and MEDLE2-HA instead remained with the parasite (Figure 5D). Changing the remaining five amino acids individually did not alter MEDLE2 localization in the host cell (Figure 5D). We conclude leucine 35 to be critical for export.

with an apparent molecular weight of 31 kDa (Figure 5E; the predicted molecular weight for full-length MEDLE2-HA is 26.9 kDa, but the abundance of positive charges is likely to result in reduced electrophoretic mobility). Protein KPVLKN/6A, carrying a mutation that did not affect trafficking to the host cell cytoplasm, was of identical size to WT MEDLE2-HA. In contrast, the KDVSLI/6A mutant, which is no longer exported, appeared to be of a larger molecular weight (Figure 5E). The mutant lacking the 22 amino acid signal peptide (ΔSP) produced an intermediate size band larger than the exported WT but smaller than the retained ΔKDVSLI mutant. We found a very similar pattern when analyzing the single amino acid mutants, where the L35A change caused the mutant protein to migrate more slowly when compared to WT or the ΔSP mutant (Figure 5F). To ensure the observed differences in apparent molecular weight were due to processing by the parasite and not the consequence of folding or subsequent host processing, we also expressed WT and mutants in mammalian cells (see Material and Methods for detail). In this context, the proteins are the same size (Figure 5-figure supplement 2).
Overall, we interpret the relative sizes of the mutated proteins to indicate processing of MEDLE2 at a point beyond the signal peptide, a position that would be consistent with leucine 35. We note that this processing appears to require translocation into the ER, as it does not occur in mutants lacking a signal peptide. Furthermore, the L35A mutation apparently prevented removal of the signal peptide, suggesting that processing at L35 could replace the canonical signal peptidase activity.
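
The motif search described in this section (a basic amino acid, followed by one or two arbitrary residues, and a leucine) can be illustrated with a minimal sketch. This is not the authors' search procedure, which the text does not specify. The function name, the choice of lysine and arginine as the basic residues, and the toy sequence are assumptions made for illustration only.

```python
# Minimal sketch of a candidate export-motif scan.
# Assumptions (not from the paper): "basic" means K or R, the spacer is
# 1-2 residues by default, and the toy sequence below is hypothetical.

BASIC_RESIDUES = frozenset("KR")


def find_candidate_motifs(sequence, min_gap=1, max_gap=2, basic=BASIC_RESIDUES):
    """Return (1-based start, motif) pairs where a basic residue is followed
    by min_gap..max_gap arbitrary residues and then a leucine."""
    seq = sequence.upper()
    hits = []
    for i, residue in enumerate(seq):
        if residue not in basic:
            continue
        for gap in range(min_gap, max_gap + 1):
            j = i + gap + 1  # index of the residue expected to be leucine
            if j < len(seq) and seq[j] == "L":
                hits.append((i + 1, seq[i:j + 1]))
    return hits


if __name__ == "__main__":
    toy_sequence = "MARALGGKTVLPP"  # hypothetical, not MEDLE2
    for start, motif in find_candidate_motifs(toy_sequence):
        print(start, motif)
```

Because the search gave preference to such motifs rather than requiring them, the spacer length is left adjustable. For example, the string KDVSLI has three residues between its lysine and leucine, so the sketch reports it only when called with `max_gap=3`.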

MEDLE2 induces an ER stress response in the host cell
To begin to understand the consequence of MEDLE2 export on the host cell, we

subjected to mRNA sequencing (3 biological repeats for each sample, Figure 6B). Differential gene expression analysis revealed 413 upregulated genes and 487 genes with lower transcript abundance in MEDLE2-GFP expressing cells compared to cells expressing GFP alone, with an adjusted p-value less than 0.05 (Figure 6C). Gene set enrichment analysis showed upregulation in the response to ER stress, including changes in genes linked to the unfolded protein response (UPR). Genes that are part of the core enrichment of the ER stress response are highlighted (red) in the volcano plot and the most upregulated genes are identified by name (Figure 6D). Genes that were differentially expressed at an adjusted p-value less than 0.01 were used to

Figure 6E). To test whether this ER stress response also occurs during in vivo infection, we performed qPCR on ileal segments resected from mice infected with C. parvum or those that were uninfected. We measured the RNA abundance for the four genes highlighted in the volcano plot and found three to be upregulated in infected mice compared to uninfected controls (NUPR1, CHAC1, DDIT3, Figure 6F). We conclude that an

Intestinal cryptosporidiosis in animals and humans is caused by parasites that are morphologically indistinguishable, and therefore were initially described as a single taxon, C.

polymorphic genes for the localization of the proteins they encode by epitope tagging the endogenous loci. We found that C. parvum exports multiple members of the MEDLE protein family into the cytoplasm of the host cell. This is particularly robust for the highly expressed MEDLE2 protein, where the observation is based on more than 15 independent transgenic strains using different epitope tags and antibodies, in locus and ectopic tagging, and held true in cultured cells and infected animals.
Importantly, mutation of the tagged gene changed the localization and molecular weight of the protein, highlighting the specificity of the reagents used.
Apicomplexa evolved multiple mechanisms to deliver proteins to their host cells during and following invasion. The timing of MEDLE2 expression and export and its sensitivity to BFA treatment argue for a mechanism that becomes active after the parasite has established its intracellular niche, in contrast to injection during invasion from organelles already poised to

We demonstrate that MEDLE2 export depends on an N-terminal signal peptide and a leucine residue at position 35. Point mutation of this residue ablates export and results in a higher apparent molecular weight of the protein, which we interpret as a lack of processing at the mutated site. Disorder may be critical for MEDLE2 export, as fusion of well-folded domains potently blocked its export into the host cell. This is consistent with an export mechanism unable to unfold proteins as proposed for T. gondii and Plasmodium liver stages (Beck and Ho,

In vitro infection and Immunofluorescence Assay
Coverslips seeded with human ileocecal adenocarcinoma cells (HCT8) (ATCC® CCL-244™) were infected when 80% confluent with 200,000 purified oocysts (bleached, washed, and resuspended in RPMI medium containing 1% serum). For time course infections, parasites were allowed to invade for 3 hours, then the medium was removed, and the cells were washed with PBS to remove unexcysted oocysts before fresh RPMI medium with 1% serum was added. At

Flow Cytometry analysis of transfected cells
HEK293T cells were subjected to lipofection with 25 µg GFP-only plasmid or MEDLE2-GFP plasmid, grown for 24 h, trypsinized, washed, resuspended in PBS with DAPI, and passed through a 40 µm filter (BD Biosciences, San Jose, CA). Cell viability was gated based upon DAPI staining. Untransfected HEK293T cells served as negative control and GFP-expressing HEK293T cells as positive control to establish gates. 10,000 green, single cells were double sorted using an Aria C flow cytometer, first into PBS and then into lysis buffer (3 biological replicates for each condition).

RNA extraction, sequencing and data analysis
Total RNA was extracted using the Qiagen RNeasy Micro Kit (Qiagen, Germantown, MD), and input RNA was quality controlled and quantified using a TapeStation 4200 (Agilent Technologies, Santa Clara, CA). cDNA synthesis was performed following the Clontech SMART-seq cDNA synthesis protocol (15 cycles). Following DNA cleanup, a Nextera library was prepared and nucleic acid was quantified using the Qubit 3 Fluorometer (Thermo Fisher Scientific, Waltham, MA). Samples were pooled for RNA sequencing of 4 nM of total cDNA, and sequencing was performed using a NextSeq 500 instrument (Illumina Inc., San Diego, CA).

RNAseq reads were pseudo-aligned to the Ensembl Homo sapiens reference transcriptome v86 using kallisto v0.44.0 (Bray et al., 2016). In R, transcripts were collapsed to genes using

OligoDT (Thermo Fisher Scientific, Waltham, MA). qPCR was performed using a ViiA 7 Real-time PCR System (Thermo Fisher Scientific, Waltham, MA), and relative gene expression was determined using the ΔΔCT method.

GraphPad PRISM was used for all statistical analyses. When measuring the difference between two populations, a standard t-test was used. For data sets with 3 or more experimental groups, a one-way ANOVA with Dunnett's multiple comparisons test was used. Simple linear regression was used to determine the goodness of fit for the relationship between the number of MEDLE2-expressing cells and intracellular parasites. Quantification of imaging experiments was performed using ImageJ macros programmed to count both parasites and host cell nuclei in blinded images that were captured using a scanning function to avoid bias during acquisition.
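
The ΔΔCT calculation named above for relative qPCR quantification can be sketched as follows. This is a generic illustration of the standard formula (fold change = 2^-ΔΔCT), assuming ideal amplification efficiency. The function name, the use of a single reference gene, and the Ct values are hypothetical and are not taken from this study.

```python
# Minimal sketch of the delta-delta-Ct relative quantification formula.
# Assumptions (not from the paper): one reference gene, 100% amplification
# efficiency (base 2), and made-up Ct values in the example.


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene in a sample versus a control."""
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles earlier
    # in the infected sample once normalized to the reference gene.
    print(fold_change(22.0, 18.0, 25.0, 18.0))
```

A lower normalized Ct in the sample than in the control gives a fold change above 1, which corresponds to upregulation.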