A single-cell RNAseq atlas of the pathogenic stage of Schistosoma mansoni identifies a key regulator of blood feeding

Schistosomiasis is an ancient and chronic neglected tropical disease that infects over 240 million people and kills over 200,000 of the world’s poorest people every year 1,2. There are no vaccines and because there is only one drug available, the need for new therapeutics is great. The causative agents of this disease are flatworm parasites that dwell inside the host’s circulation, often for decades, where they feed on blood and lay eggs, which are primarily responsible for disease pathology. Because schistosomes are metazoans comprised of multiple tissue types, understanding these tissues and their functions on a molecular level during what can be decades of successful parasitism could suggest novel therapeutic strategies. Here, we employ single-cell RNAseq to characterize 43,642 cells from the pathogenic (adult) stage of the schistosome lifecycle. From these data, we characterize 68 molecularly distinct cell populations that comprise nearly all tissues described morphologically, including the nervous and reproductive systems. We further uncover a lineage of somatic stem cells responsible for producing and maintaining the parasite’s gut – the primary tissue responsible for digestion of host blood. Finally, we show that a homologue of hepatocyte nuclear factor 4 (hnf4) is expressed in this gut lineage and required for gut maintenance, blood feeding and inducing egg-associated pathology in vivo. Together, the data highlight the utility of this single-cell RNAseq atlas to understand schistosome biology and identify potential therapeutic interventions.

syncytial tissues, including the tegument 13,14 (Extended Data Fig. 2f) and gut 15 (Fig. 1h, Extended Data Fig. 2g). However, we failed to identify cells from two other syncytial tissues, i.e., the female ootype (an organ involved in egg shell formation) and the protonephridial ducts (which are thought to be syncytial in other parasitic flatworms 16) that together with the flame cells make up the protonephridial system 17.

We uncovered a surprising level of molecular complexity within the schistosome nervous system, identifying 30 clusters of cells that express the neuroendocrine protein 7b2 (Fig. 1i, Extended data

Schistosome muscle is also very heterogeneous, with eight different clusters of cells that possess unique expression patterns (Extended Data Fig. 3f,g). Some populations appear to be diffusely arranged throughout the animal ("muscle 1" and "muscle 2"), whereas others are anatomically restricted such as the "muscle 7" cells that reside at the midline proximal to the parasite's digestive tract, suggesting that this cluster represents cells of the enteric musculature (Extended Data Fig. 3f, third column).
Similar to what has been observed in planarians 18, tapeworms 19, and acoels 20, we find that many

there is no evidence of whole-body tissue regeneration. It is interesting, therefore, that the expression pattern of these signaling molecules is conserved in a non-regenerative animal. This suggests that their anatomically restricted expression in neuromuscular tissues could regulate schistosome neoblast fates during homeostasis. Further investigation of this hypothesis in schistosomes could uncover novel regulators of stem cell biology in these parasites.

The pathology of schistosome infection is driven almost exclusively by the host's inflammatory responses to parasite eggs 22. Therefore, understanding the biology of schistosome reproductive organs could lead to novel methods to target disease pathology. Our single-cell expression atlas allows us to study the differences between not only male and female parasites, but also between sexually mature and age-matched virgin females at the cellular level (Fig. 2a). Male, mature

Fig 5b). However, mature gametes cluster according to sex, with substantial expression of "female gametes"-enriched genes only found in mature females (Fig. 2d and Extended Data Fig. 5c) and substantial expression of "male gametes"-enriched genes only found in males (Extended Data Fig. 5d).

Our scRNAseq data also enable us to study sexual cellular lineages. The sexually mature schistosome ovary is structured such that GSCs reside at the anterior pole whereas mature differentiated oocytes are found at the posterior end 23,24. The "GSCs"-enriched genes such as

that the "GSC progeny" cluster exists between "GSCs" and "female gametes" on the UMAP projection plot (Fig. 2a), so we would expect the "GSC progeny"-enriched genes such as meiob to be expressed between the anterior and posterior ovary, which is indeed what we find (Fig. 2c, left panel, Extended Data Fig. 5b, middle panels). We would also predict proliferative cells to be concentrated in the nanos1+ GSCs, with little to no cell proliferation in meiob+ GSC progeny cells or bmpg+ female gametes, which agrees with our observations (Extended Data Fig. 6a-d).

Concurrent visualization of ovarian stem cells, progenitors and oocytes reveals a highly organized linear architecture (Fig. 2e). Interestingly, both mature and virgin females express the "GSC progeny" marker meiob (Fig. 2c), suggesting that the primordial ovary of the virgin female still undergoes some level of differentiation without stimulus from the male. Thus, it appears that male parasites may promote survival of differentiating GSCs rather than inducing GSC commitment.

We were also able to use our single cell atlas to examine the schistosome vitellaria, another male-sensitive, stem-cell dependent tissue responsible for producing the yolk cells that provide nutrients to the parasite's eggs. Despite a wholly different function and organization, there were many parallels between the maturation of the ovary and the vitellaria such as the presence of an apparent lineage from stem cell to mature tissue (Extended Data Fig. 6e-h). Our atlas also confirmed the decades-old observation that male parasites have a low frequency of vitellocyte-like cells 26 (Extended Data Fig. 6e, bottom two panels). Finally, we identified markers of pairing-independent sexual tissues such as the flatworm-specific Mehlis' gland that plays an enigmatic role in egg production 9 (Extended Data Fig. 6i).

Previous work suggests that adult schistosome neoblasts are homogeneous and predominantly give rise to cells involved in tegument production 12,13. We identified a putative non-tegument lineage as suggested by a linear "path" of cells leading from a neoblast sub-population to the gut (Fig. 3a).

The putative lineage began with a rare population of proliferative cells that expressed the somatic neoblast marker nanos2 (Fig. 1c), the juvenile neoblast marker eled 7 (Fig. 3b, Fig. 7c). Adjacent to these eled+ neoblasts on the UMAP projection plot was the "prom2+" population, characterized by expression of prom2 and hnf4 in and around the gut (Fig. 3b, bottom left, Extended Data Fig. 7d). Situated next to the "prom2+" cluster was the "gut" cluster, which expressed definitive gut markers such as genes encoding cathepsin B-like cysteine proteases ctsb (Fig. 3b, bottom right, Extended Data Fig. 7e).

Based on the localization of these genes on the UMAP projection plot (Fig. 3a), their expression patterns, and the fact that hnf4 is a marker of gut stem cells in planarians 27, we hypothesized that the eled+ neoblasts, "prom2+" cells, and "gut" cells represent the schistosome gut lineage. To test this model, we sought to perturb the eled+ neoblasts at the top of the lineage and observe the effects on downstream cells. To this end, we performed a small-scale RNAi screen targeting several genes expressed in the eled+ neoblasts (Extended Data Fig. 8a, b). Remarkably, RNAi of hnf4 resulted in massive expansion of eled+ neoblasts along the parasite's gut (~3.8-fold increase in hnf4(RNAi) animals compared to control, p < 0.0001) (Fig. 3c, Extended Data Fig. 8c-f).

According to our lineage model, an expansion of eled+ neoblasts could either result in an increase in gut production because of an expanded stem cell pool, or it could result in a decrease in gut

were expressed almost exclusively in the gut (Extended Data Fig. 9d).

To determine whether these transcriptional changes in hnf4(RNAi) animals affected the gut structure, we examined hnf4(RNAi) animals by transmission electron microscopy (TEM). The schistosome gut is a syncytial blind tube-like structure with a microvilli-filled lumen 15. Though

Although the gut was abnormal in hnf4(RNAi) animals, it was unclear whether the hnf4 RNAi resulted in destruction of the gut, a block in new gut production or some combination of both.

There was no apparent difference in the number of TUNEL+ apoptotic cells between control and hnf4(RNAi) animals (Extended Data Fig. 9h). To understand whether stem cell differentiation was grossly intact, we looked at tegument production using EdU pulse-chase approaches in hnf4(RNAi) animals and found a significant increase in tegument production compared to control(RNAi) animals (Extended Data Fig. 9i, j), ruling out a broad stem cell differentiation defect. Our ability to monitor new gut production by EdU pulse-chase approaches was complicated by the fact that gut marker expression was largely absent in most parasites (Fig. 3d, Extended Data Fig. 9b).

Examination of gut differentiation in cases where we could detect gut marker expression by EdU pulse-chase approaches in hnf4(RNAi) parasites revealed that new gut-like tissue (i.e., tissue that expresses gut markers like ctsb, though not always in the typical linear pattern along the parasite's midline) was still being produced (Extended Data Fig. 9k), but the gut-like tissue that was present was

locations where eled+ neoblasts were able to partially overcome the differentiation block and form gut-like tissue. However, given the relatively low basal rate of gut production 12, a partial block of gut differentiation is not likely to result in such a dramatic gut defect over the course of a 17-day RNAi treatment. As such, hnf4 is likely required for both normal gut production and maintenance.

Based on the profound morphological defects in the gut, we next asked whether there were any functional consequences of hnf4 RNAi. Although glucose can be absorbed across the parasite's tegument, parasites rely on the gut to digest host blood cells 29. To test the digestive capability of hnf4(RNAi) parasites, we added red blood cells to the media and observed the parasites' ability to take up and digest the cells. While the vast majority of control(RNAi) parasites (67/69) were able to ingest and digest red blood cells as evidenced by black pigmentation in the gut 30, hnf4(RNAi) parasites either failed to ingest red blood cells (15/69) or ingested red blood cells but could not digest them as evidenced by red pigmentation in the gut (54/69) (Fig. 4a, b). These data suggest a decrease in the blood ingestion and digestion capacity of the hnf4(RNAi) animals but do not address the mechanism of any digestive defects. Because we measured a decrease in the expression of many proteolytic enzymes in our RNAseq experiment (Supplementary Table 2), we next asked whether there was a loss in the hnf4(RNAi) parasites of those cysteine (cathepsin) proteases that contribute to hemoglobin digestion 31. Accordingly, we measured the cathepsin activity of lysates from control(RNAi) and hnf4(RNAi) parasites using the fluorogenic peptidyl substrate, z-Phe-Arg-AMC (Z-FR-AMC) 32. We show that the majority of the activity (94%) in protein extracts from control(RNAi) parasites is due to cathepsin B, as this activity is sensitive to the selective cathepsin B inhibitor, CA-074 (Fig. 4c). In hnf4(RNAi) parasites, the cysteine protease activity is decreased 8.2-fold relative to control(RNAi) parasites. Thus, the functional assay data are consistent with our gene expression analyses that show a significant reduction in five cathepsin B gene sequences in the hnf4(RNAi) animals (Supplementary Table 2).
In contrast, we show that aspartyl protease activity is unchanged between control(RNAi) and hnf4(RNAi) parasites (Extended Data Fig. 10a), which could reflect the expression of aspartic proteases in non-gut tissues that were not downregulated following hnf4 RNAi (Supplementary Tables 1, 2). Taken together, these data suggest that hnf4 is required for cathepsin B-mediated digestion of hemoglobin in S. mansoni.

Given the importance of blood uptake and digestion for egg production 29, the primary driver of the pathology of schistosomiasis, we wondered whether hnf4 was required to cause disease in the host.

To test this, we transplanted control(RNAi) and hnf4(RNAi) parasites into uninfected mice and

Together, these results suggest that hnf4 is required for parasite growth and egg-induced pathology in vivo.

Schistosomiasis is a neglected tropical disease due in no small part to the difficulty of studying these parasites in the laboratory. Prior to this work, identification of specific tissue markers and understanding the cellular and molecular consequences of experimental perturbations relied upon a great deal of effort and guesswork 7,12,13,17,23,33. Using scRNAseq, we not only generated the most comprehensive single-cell atlas of any metazoan parasite to date, but also identified regulators of gut biology, leveraging this knowledge to experimentally perturb schistosome-induced pathology in the mammalian host. Indeed, our approach serves as a template for the investigation of other understudied and experimentally challenging parasitic metazoans, thereby improving our understanding of their biology and enabling us to discover novel therapies for these pathogens.

For each of 6 different neuron cluster-specific genes (from left to right "neuron 11": Smp_042120, "neuron 12": Smp_159220, "neuron 14": Smp_072470, "neuron 15": Smp_319030, "neuron 18":

quantification of percentage of nanos1+, meiob+, or bmpg+ cells that are EdU+ following a 30-minute EdU pulse.
e, For the "S1"-enriched gene nanos1, the "S1 progeny"-enriched gene msantd3, the "late vitellocyte"-enriched gene p48, and the "mature vitellocyte"-enriched gene ataxin2: (left) violin plots showing gene expression levels across the clusters "S1", "S1 progeny", "early vitellocytes", "late vitellocytes", "mature vitellocytes" colored by sex (mature female = magenta, virgin female = green, male = yellow) and (right) representative micrographs of colorimetric WISH of the indicated gene in the vitellaria of mature females (m♀) and the midline of males (♂) as indicated on the image. f, For the "S1"-enriched gene nanos1, the "S1 progeny"-enriched gene msantd3, the "late vitellocyte"-enriched gene p48, and the "mature vitellocyte"-

Raw data from single cell RNAseq experiments are available from XXXX with accession number XXXX.

Parasite labeling and imaging

Colorimetric and fluorescence in situ hybridization analyses were performed as previously described 11,12 with the following modification. To improve signal-to-noise for colorimetric in situ hybridization, all probes were used at 10 ng/mL in hybridization buffer. In vitro EdU labeling and detection was performed as previously described 11. For dextran labeling of the parasite gut, 10 male RNAi-treated parasites were given 10 µL/mL of 5 mg/mL (in water) solution of biotin-TAMRA-dextran (Life Technologies D3312) and cultured 12 hours. The parasites were then fixed in fixative solution (4% formaldehyde in PBSTx (PBS + 0.3% triton-X100)) for 4 hours in the dark with mild agitation.

Worms were then washed with 10 ml of fresh PBSTx for 10 minutes, then dehydrated in 100% methanol and stored at -20 °C until used in fluorescence in situ hybridization as described 11,12. All fluorescently labeled parasites were

Table 4. EdU pulses were performed at 5 µM for 4 hours before either fixation or chase as previously described 11.

As a negative control for RNAi experiments, we used a non-specific dsRNA containing two bacterial genes 36. cDNAs used for RNAi and in situ hybridization analyses were cloned as previously described 36; oligonucleotide primer sequences are listed in Supplementary Table 5.

qPCR and RNAseq

RNA collection was performed as previously described 12 with the following modifications. Parasites were treated with dsRNA as described in Supplementary Table 3 ("strategy 4") and whole parasites were collected in Trizol. RNA was purified from samples utilizing Direct-zol RNA miniprep kits (Zymo Research R2051). Quantitative PCR analyses were performed as previously described 11,12. cDNA was synthesized using iScript™ cDNA synthesis kit (Bio-Rad 1708891) and qPCR was performed as previously described 13

were performed with DESeq2 (version 1.12.2) 39. Volcano plots were made using the "volc" function from ggplot2. In order to filter out noise, genes with a base-mean expression value less than 50 were excluded from analysis.

Furthermore, genes that were differentially expressed (padj < 0.05) but were not automatically assigned to the "gut" cluster during initial clustering were manually examined in the single-cell RNAseq data, and those that were expressed in the gut were reclassified to the "gut" cluster. Raw data from hnf4 RNAi RNAseq experiments are available at XXXX with accession number XXXX.
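The two thresholds described above (exclusion of genes with a base-mean expression value below 50, and a differential-expression cutoff of padj < 0.05) can be illustrated with a minimal sketch. It assumes the DESeq2 results have been exported to rows holding base mean, log2 fold change and adjusted p-value. The class name, function names and example rows are hypothetical and are not taken from the study.

```python
import math
from dataclasses import dataclass


@dataclass
class DeResult:
    """One row of an exported differential-expression table (hypothetical layout)."""
    gene: str
    base_mean: float
    log2_fold_change: float
    padj: float


def filter_results(results, min_base_mean=50.0, max_padj=0.05):
    """Exclude low-expression genes, then select those passing the padj cutoff.

    Returns (expressed, significant): genes with base mean >= min_base_mean,
    and the subset of those with padj < max_padj.
    """
    expressed = [r for r in results if r.base_mean >= min_base_mean]
    significant = [r for r in expressed if r.padj < max_padj]
    return expressed, significant


def volcano_point(r):
    """Coordinates for a volcano plot: (log2 fold change, -log10 adjusted p-value)."""
    return r.log2_fold_change, -math.log10(r.padj)


# Hypothetical rows for illustration only; these are not values from the study.
results = [
    DeResult("gene_a", 1200.0, -3.1, 1e-12),
    DeResult("gene_b", 35.0, -4.0, 1e-6),   # excluded: base mean below 50
    DeResult("gene_c", 400.0, 0.2, 0.60),   # retained, but not significant
]

expressed, significant = filter_results(results)
print([r.gene for r in expressed])
print([r.gene for r in significant])
```

Applying the base-mean filter before the significance cutoff mirrors the order in which the two steps are described in the text; whether the original analysis applied them in this order is an assumption.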

Protease activity assays

To measure cysteine protease cathepsin activity 32, five worms of each RNAi condition (see Supplementary Table 3 "strategy 7") were ground and sonicated in 300 µL assay buffer (0.1 M citrate-phosphate, pH 5.5). The lysate was centrifuged at 15,000g for 5 minutes and the pellet was discarded. The total protein concentration was calculated using

The release of the AMC fluorophore was recorded in a Synergy HTX multi-mode reader (BioTek Instruments,

To measure aspartic protease cathepsin activity, five worms of each RNAi condition (See Table 3

All protease activity experiments were carried out as biological triplicates each in triplicate.
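As a rough illustration of how a nested replicate design of this kind might be summarized, the sketch below averages technical replicates within each biological replicate, then averages across biological replicates, and reports the fold decrease between conditions. The text does not state how replicates were aggregated, so this scheme is an assumption. The function names are hypothetical, and the readings are invented values chosen only so that the arithmetic reproduces an 8.2-fold difference; they are not measurements.

```python
from statistics import mean


def condition_mean(bio_replicates):
    """Mean activity for one condition.

    bio_replicates is a list of biological replicates, each a list of
    technical-replicate readings. Technical replicates are averaged first,
    so each biological replicate contributes equally.
    """
    per_bio = [mean(tech) for tech in bio_replicates]
    return mean(per_bio)


def fold_decrease(control, treated):
    """Ratio of mean control activity to mean treated activity."""
    return condition_mean(control) / condition_mean(treated)


# Invented readings (arbitrary fluorescence units per minute), three
# biological replicates with three technical replicates each.
control = [[82.0, 80.0, 84.0], [79.0, 81.0, 83.0], [85.0, 82.0, 82.0]]
hnf4_rnai = [[10.0, 10.0, 10.0], [9.0, 10.0, 11.0], [10.0, 11.0, 9.0]]

fold = fold_decrease(control, hnf4_rnai)
residual_percent = 100.0 / fold  # activity remaining relative to control
print(f"fold decrease: {fold:.1f}, residual activity: {residual_percent:.1f}%")
```

Expressed this way, an 8.2-fold decrease corresponds to roughly 12% of control activity remaining.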

Surgical transplantation was performed as previously described 40 with the following modifications. Seven days prior to surgery, 5-week-old parasites were recovered from mice and treated with 30 µg/ml dsRNA for 7 days in Basch Media 169 (see Table 3 "strategy 8"). Before mice were anesthetized, 10 pairs (male and female) were sucked into a 1 ml syringe, the syringe was fitted with a custom 25G extra thin wall hypodermic needle (Cadence, Cranston, RI), the air and all but ~200 µL of media were purged from the needle, and the syringe was placed needle down in a test tube