Establishment of quantitative RNAi-based forward genetics in Entamoeba histolytica and identification of genes required for growth

Akhila Bettadapur; Samuel S. Hunter; Charles G. Barbieri; Matthew L. Settles; Katherine S. Ralston

doi:10.1101/2020.05.28.121780

Abstract

Entamoeba histolytica is a globally important pathogen that is dramatically understudied. Its challenging genome has limited tractability. To enable forward genetics, we constructed the first genome-wide E. histolytica RNAi knockdown mutant library. The library is designed to enable deep sequencing analysis for quantitative identification of mutants after selection. We developed a novel analysis pipeline to precisely define and quantify full-length gene fragments inferred from read mapping. We performed the first E. histolytica RNAi screen and identified slow growth mutants. Growth phenotypes were reproducible in independently generated mutants. Some of the genes targeted in slow growth mutants had annotated functions consistent with roles in growth or metabolism. Some targeted genes lacked annotation, supporting the power of forward genetics in uncovering gene function. This work opens up the possibility of applying genetics to improve understanding of this important pathogen. Moreover, the strategies behind this RNAi library, and its analysis, are novel, and can be applied to other organisms.

Introduction

Entamoeba histolytica is the causative agent of amoebiasis in humans. E. histolytica is common in developing nations and is responsible for ~50 million infections/year [1]. Children under five are particularly vulnerable to diarrheal infections including amoebiasis, which are responsible for close to half a million deaths annually [2]. In the first two years of life, approximately 80% of infants in an urban slum in Dhaka, Bangladesh become infected with E. histolytica [3]. Malnutrition and stunting are associated with childhood E. histolytica infections [4]. Among children aged 12 – 23 months enrolled in the in the Global Enteric Multicenter Study, E. histolytica infection was associated with the highest risk of death (5%) between enrollment and follow up [5]. Invasive infections result in an estimated 15,500 deaths/year in children and 67,900 deaths/year among people of all ages [2]. There is no vaccine and therapy is limited to nitroimidazoles [6]. The continued morbidity and mortality indicate that current treatment and prevention approaches are insufficient.

Infection can be asymptomatic, or can lead to diarrhea, bloody diarrhea with intestinal ulceration, and fatal abscesses outside of the intestine. Fitting with its capacity to cause ulcers and abscesses, E. histolytica was named for its ability to lyse tissue (histo-: tissue; lytic-: dissolving). Despite the name-giving property, it is not known how amoebic trophozoites (“amoebae”) damage tissue [7]. The major known virulence factors are the surface Gal/GalNAc lectin [8, 9], which mediates attachment to human cells, and the cysteine proteases, which cleave human substrates [10, 11]. It was recently shown that amoebae kill human cells by “nibbling” them, in a process termed trogocytosis [12]. Amoebic trogocytosis contributes to tissue damage, and warrants further study for its potential relevance to invasive disease [12]. However, trogocytosis is not yet well-defined at the molecular level [13].

Despite its impact on human health, E. histolytica is dramatically understudied relative to other parasites. Given the limitations of current therapies and the limitations of our current understanding of pathogenesis, new tools are needed to study this infection. Although the genome has been sequenced, it is difficult to identify genes relevant to pathogenesis from sequence information alone. The genome is relatively difficult to manipulate experimentally. E. histolytica is thought to be tetraploid [14]. The genome contains a very large number of expanded gene families, with 897 protein families that contain 4,564 genes, and represent 56% of the proteome [15, 16]. The nucleotide composition of the E. histolytica genome is ~75% A + T, which creates challenges for genomic approaches and has contributed to the relatively poor draft genome quality. The genome is ~20 Mb with ~8,000 genes [15, 17, 18]. For one third of genes, no homologues have been identified [17] and less than half of genes have been assigned gene ontology (GO) terms [15]. Thus, with limited sequence homology information and a difficult genome, functional approaches such as forward genetics are needed to identify genes that are relevant to pathogenesis.

Historically, the genetic tractability of E. histolytica has been limited. CRISPR/Cas9 has not yet been applied for use in E. histolytica. Previous efforts to integrate sequences into the genome or to genetically manipulate endogenous loci have been unsuccessful. Stable transfection with episomal plasmids has been used in order to overexpress tagged proteins or dominant negative mutants [19, 20]. Forward genetics have not yet been established in E. histolytica, outside of two studies that used a plasmid-based overexpression library to screen for phenotypes [21, 22]. Relatively recently, the endogenous RNAi pathway in E. histolytica has been characterized, and exploited as a useful tool for targeted gene knockdown.

The existence of an RNAi pathway in E. histolytica that could be exploited for gene knockdown was implied by studies that successfully knocked down expression of the amoebapore A gene [23]. Amoebae were stably transfected with a plasmid containing a fragment of the amoebapore A promoter together with a segment of a retrotransposon and a short T-rich stretch [23]. While this led to permanent silencing of amoebapore A, the mechanism of action was unclear, and the expression of off-target genes was affected [24, 25]. It was possible to silence other genes, but only in amoebae that already contained the amoebapore A silencing plasmid. A clonal line of stable transfectants carrying the amoebapore A plasmid was named the “G3” strain, and was subsequently used to knockdown additional genes in the context of an amoebapore A knockdown mutant background [25]. Other strategies for gene knockdown have included expression of genes in the antisense orientation [26], expression of long double stranded RNA [27], phagocytosis of bacteria expressing dsRNA [28], and expression of short hairpin RNA [29]. These techniques have been useful for reverse genetic studies, but not robust enough to be effective for every gene of interest.

After it had already been exploited for targeted gene knockdown, the E. histolytica RNAi pathway was characterized in detail. Small RNAs were found to associate with a Piwi-related protein [30]. Small RNAs were 27nt in length, 5’ polyphosphorylated, and mapped antisense to genes [30]. Thus far, 5’ polyphosphorylated siRNAs have only been seen in E. histolytica and Caenorhabditis elegans [31]. The level of E. histolytica small RNAs was inversely correlated with gene expression [30]. Supporting that these small RNAs are relevant to previous approaches that exploited gene silencing in E. histolytica, in the G3 strain, 27nt small RNAs mapped to amoebapore A, and when additional genes were silenced in this background, new small RNAs mapped to the newly silenced gene [32]. There is no obvious Dicer homologue, but a potential noncanonical Dicer and Argonaute homologues have been characterized [33, 34].

Because endogenous small RNAs are polyphosphorylated, it was initially a challenge to determine how to effectively exploit this pathway for gene knockdown. A major breakthrough was the discovery that RNAi knockdown could be induced by spreading of silencing [35]. A new “trigger” plasmid approach was established, which relies on endogenous small RNAs directed to a 132 bp fragment of an endogenously-silenced gene (called the trigger) to drive spreading of silencing to a fragment of gene of interest that is adjacent to the trigger [35]. This leads to long-term silencing of the gene of interest, even after the trigger plasmid is removed, which may be due to the deposition of the repressive histone mark H3K27 at the chromosomal copy of the silenced gene [36]. This trigger plasmid approach has been successfully applied to knockdown many genes, and has been adapted for use in E. invadens [37–40].

In other parasites, RNAi knockdown has been successfully adapted for forward genetic approaches, and this has been very effective in uncovering genes relevant to parasite metabolism [41, 42], DNA repair [43], virulence [44], transmission [45, 46], and drug therapies [47–50]. Initially, these approaches relied on the generation of clonal lines after selection, and Sanger sequencing analysis of individual clones to identify the knocked down gene [41, 42]. Later approaches have used next generation sequencing on non-clonal mutant populations in order to identify mutants, and their relative abundance [43–51]. The latter approaches are useful because they can quantitatively identify mutants that are enriched by selection, and mutants that are less abundant or are lost after selection.

To enable forward genetics, we constructed and validated the first genome-wide E. histolytica RNAi knockdown mutant plasmid library. Its design allows for Illumina deep sequencing, and our novel analysis pipeline allows for quantitative identification of plasmids that are enriched or depleted after selection. Using a pilot version of the library, we performed the first RNAi screen in E. histolytica and identified 12 slow growth mutants. Genes that were knocked down in slow growing mutants included genes for which existing annotation provided no clues on function. Independent knockdown mutants phenocopied the growth defects seen in the library mutants. In the course of this work, new details of the E. histolytica RNAi pathway were revealed, as we found that both frame and directionality of the gene of interest are dispensable for silencing. These studies establish the first RNAi screen in E. histolytica, thereby dramatically improving the tractability of this pathogen.

Results

Approach for quantitative RNAi-based forward genetics

We devised an overall strategy for RNAi-based forward genetics in E. histolytica (Fig. 1). For this approach, a library of genes would be introduced into the E. histolytica “trigger” RNAi plasmid (pTrigger) [35], to create an RNAi library (Fig. 1a). Amoebae would then be stably-transfected with the library, and exposed to a selective pressure or screening strategy. Illumina deep sequencing analysis would then be used to sequence the inserts and to compare the selected mutants to control, non-selected amoebae, in order to identify mutants that are quantitatively enriched or depleted. We empirically determined that the stable transfection efficiency with the backbone of pTrigger is ~1/1000 amoebae. This relatively high stable transfection efficiency suggested that transfecting enough amoebae to achieve genome-wide RNAi knockdown would be feasible.

Fig. 1. RNAi library strategy.

a, The plasmid RNAi library contains random fragments of gDNA cloned into the trigger RNAi plasmid (pTriggerAdaptor). Different plasmid inserts are represented by different colors in this schematic. A heterogeneous population of stably transfected RNAi knockdown mutants can be exposed to selection or screening. To identify mutants that are represented in the population after selection/screening, gDNA is extracted for analysis. gDNA extracted from non-selected amoebae is used as a control. b, Schematic representation of the plasmid RNAi library. The trigger plasmid is shown, with different inserts represented by different colors. The “trigger” is a 132 bp fragment of EHI_048600. pTrigger was modified to create pTriggerAdaptor, containing custom Illumina TruSeq adaptor sequences (IA) upstream and downstream of the cloning site to facilitate amplicon sequencing. A single round of PCR is needed to add the remaining Illumina sequences (shown in dark gray) and barcodes necessary for Illumina deep sequencing analysis. c, An 812 bp fragment of a BAR/SH3 domain-containing protein (EHI_091530) was inserted into pTrigger with or without custom Illumina TruSeq adaptors flanking the cloning site. The BAR/SH3 domain-containing protein was in the forward orientation and in frame relative to the trigger sequence. Stably transfected heterogeneous mutants were assessed for knockdown of the targeted gene using RT-PCR, with (+) or without (−) reverse transcriptase. GAPDH (EHI_187020) was used as a loading control.

The Illumina priming sequences that are necessary for sequencing are typically added to samples by using a nested strategy with two sequential PCR reactions. We modified the trigger plasmid in order to flank the insert with Illumina adaptor sequences, creating pTriggerAdaptor, so that after extracting amoebic genomic DNA (gDNA), only a single PCR step would be needed to generate DNA ready for sequencing (Fig. 1b). Using fewer PCR reactions is expected to reduce the potential for PCR bias that can change the representation of sequences. An additional benefit of this modification is that the original sequences flanking the insert were A/T rich and were not optimal for primer design. The insertion of these Illumina sequences into the knockdown plasmid did not change knockdown efficacy (Fig. 1c).

Construction of a plasmid RNAi library

Building on these findings, we devised a strategy to construct a genome-wide plasmid RNAi library (Fig. 2a). We reasoned that since ~50% of the genome is coding, constructing a gDNA library would be a practical approach. To generate the plasmid library, random fragments of gDNA were inserted into the pTriggerAdaptor plasmid. Library construction was initially done on a small scale, to create a “pilot” library, where plasmids were harvested from a limited number of bacterial colonies. Building on the results of the pilot, a second library was constructed on a larger scale, to make a “final” library that was expected to have genome-wide coverage.

Fig. 2. RNAi library construction.

a, Steps to create the library. E. histolytica gDNA was extracted and treated with RNAse. Sonication, followed by gel purification, was used to isolate gDNA fragments. Fragments were end-repaired, A-tailed, and ligated to Y-shaped adaptors. The adaptors contain sequences corresponding to the region upstream and downstream of the cloning site in pTriggerAdaptor. The Y-shaped design ensures that, after PCR to make these sequences double-stranded, each gDNA fragment will be flanked by both the upstream/IA plasmid sequence (represented in black), and the downstream/IA sequence (represented in gray). Finally, the adaptor-ligated gDNA fragments were cloned into pTriggerAdaptor using Gibson cloning. b – c, Bioanalyzer traces show appropriate size shift after adaptor ligation, supporting that Y-shaped adaptors were successfully ligated. Example data are shown from optimization experiments that were performed to establish the protocols for library construction. d, Gibson cloning and plasmid preps were done en masse with pooled plasmids. To check cloning efficiency, individual colonies were obtained and analyzed. Shown are plasmid samples and restriction analysis from the pilot plasmid library preparation. Restriction analysis was performed with restriction enzymes that flank the cloning site (U, uncut; D, digested). Essentially all plasmids that were analyzed contained an insert of the appropriate size range.

Previous studies defined that an insert size ≥ 500 bp is sufficient for knockdown [35, 52]. Larger insert sizes are likely to lead to more efficient knockdown, but fragments > 1 kb are problematic for Illumina sequencing. The pilot library was constructed using gDNA fragments between ~400 – 600 bp. Building on the robust Illumina sequencing results from the pilot library (described in detail below), we constructed the final library, where gDNA fragments were larger (between ~500 – 900 bp) in order to maximize the efficacy of RNAi knockdown.

To make the RNAi library, high quality, RNA-free, gDNA (Fig. S1a – b) was sonicated and size-selected (Fig. S1c). Y-shaped adaptors were ligated to the fragmented DNA to ensure that each fragment was flanked by sequences matching the trigger plasmid upstream and downstream of the cloning site, in order to enable subsequent Gibson cloning. Adaptor ligation was successful, with an appropriate size shift seen using bioanalyzer analysis (Fig. 2b – c). Gibson cloning and plasmid preps were done en masse with pooled plasmids. To check cloning efficiency, individual colonies were obtained and analyzed. Restriction analysis demonstrated that the ligation efficiency was very high, since 63 out of 67 plasmids examined contained an insert (Fig. 2d).

The RNAi library has genome-wide coverage

Illumina sequencing was used to analyze the pilot and final libraries, and the fragmented gDNA that was used to construct each library. Notably, our approach for analysis of the sequencing data is novel and differs from previous approaches for analysis of RNAi screens [43–51], since all results presented here are based on full-length DNA-fragments inferred from read mapping results (Fig. S2), rather than on more traditional read mapping coverage or CDS overlap counting strategies. Our approach was designed to better represent the actual DNA fragments present in the plasmid library, instead of just the short ends of these fragments that were sequenced on the Illumina platform. This strategy allowed us to precisely define whether a fragment might target one or multiple genes, and enabled a clearer view of fragment based genome coverage. This approach will also allow us to readily quantify the representation of gene fragments in future RNAi screen experiments, in order to identify specific fragments that are particularly impactful in a particular screen.

There did not appear to be bias in gDNA fragmentation or construction of the library, since the nucleotide and dinucleotide frequencies of the fragmented gDNA and the plasmids were the same as the genome (Fig. 3a and S3a). Rarefaction curves and read fragment analysis showed that sequencing was performed to a sufficient depth to identify all unique fragments within the plasmid samples, but the same depth of sequencing was not deep enough to identify all unique fragments within the fragmented gDNA samples (Fig. S3b – c).

Fig. 3. The final RNAi library has genome-wide coverage.

a, Nucleotide frequency analysis of the HM1:IMSS reference genome, the pilot and final gDNA fragments, and the pilot and final plasmid libraries. Stacked bar plots show the percentage of each base. The nucleotide composition of the gDNA fragments and the final plasmid library is the same as the genome. b, Histograms showing the distribution of fragments sizes for the unique fragments in the pilot and final gDNA fragments, and the pilot and final plasmid libraries. Size in nucleotides is plotted on the × axis, and probability distribution density is plotted on the Y axis. c, Genome coverage plots of the final library, over the five largest contigs in the reference genome. The positions of unique fragments and the depth in fragments overlapping each genomic position are shown. The positions of genes are shown with black and green bars. Fragments that are in the same orientation as genes that are in the forward orientation in the reference genome are shown in black, and fragments that are in the same orientation as genes that are in the reverse orientation in the reference genome are shown in green. d, Visual representation of fragment coverage in the final library, over 100 kb of contig DS571145.

Illumina sequencing analysis was performed on separate batches of the plasmid library that had been cloned on different days, and on technical sequencing replicates. Similar read counts per fragment were obtained from each batch and each sequencing replicate, showing that cloning and sequencing were reproducible and there was no evidence for bias within these steps (Fig. S4a – b, Table S1). Analysis of the sequencing results showed that the gDNA fragments that were used to create the pilot library had a mean size of 421 bp, and the fragments in the pilot plasmid library had a mean size of 427 bp (Fig. 3b). The gDNA fragments that were used to create the final library had a mean size of 540 bp, and the fragments in the pilot plasmid library had a mean size of 555 bp (Fig. 3b). The nearly identical size ranges of the gDNA fragments and cloned fragments further support that there was no bias during library construction.

Genome coverage plots of the final plasmid library over the 5 largest contigs in the genome visually showed that coverage of these contigs was complete, and the depth of coverage was consistent (Fig. 3c). The breadth and depth of coverage over a zoomed in region of the largest contig is represented in Fig. 3d. On a genome-wide level, the genome was well covered by fragments that were cloned into the library in both the forward and reverse orientation relative to the orientation of coding sequences (Fig. 4, Table 1). Supporting that the final library contained genome-wide coverage, 97.24% of the genome was represented (Table 1). 8307 of 8333 annotated genes were represented, and the majority (8301 genes) were represented by more than one unique fragment. Of the 26 genes that are missing, 22 are likely to be misannotated in the genome, since they were not seen in any sample, including fragmented gDNA (Fig. S5). Additionally, for the majority of these genes, there was no empirical evidence for gene expression in prior RNAseq or proteomic studies (Tables S2 and S3). From these findings, we concluded that the final plasmid library has genome-wide coverage. We named the final library pEhKDL, since it represents an E. histolytica knockdown library.

View this table:

Table 1. Library coverage analysis.

Analysis of coverage in the pilot and final gDNA fragments, and the pilot and final plasmid libraries. Genomic coverage (reported as bp covered, forward, reverse, and unstranded) was calculated using fragments of the appropriate orientation, inferred from mapped reads and the coverage function in the GenomicRanges R/Bioconductor package. The total number of covered bp are indicated, and the corresponding percentage of the HM1:IMSS reference genome that this represents. Forward fragments are in the same orientation as genes that are in the forward orientation in the reference genome. Reverse fragments are in the same orientation as genes that are in the reverse orientation in the reference genome. Stranded fragments are in the same orientation as genes, and are the combined total of both forward and reverse fragments. Unstranded fragments are the total number of fragments, both in the same orientation as genes in the reference genome and in the opposite orientation as genes. Genes that are targeted are those with at least one unique fragment with greater than or equal to 27nt that overlaps the gene. Genes that are missed are those that do not fit the criteria for targeted genes. Genes that are targeted by multiple fragments are those with two or more unique fragments. Counts were calculated using the countOverlaps function in the GenomicRanges package and annotation from AmoebaDB. The number of unique fragments is the total number of unique fragments in each sample.

Figure 4. Fragment directionality and coverage analysis.

Heat map showing the number of unique fragments per gene in the pilot and final gDNA fragments, and the pilot and final plasmid libraries. The number of unique fragments (Log₂) for each gene are plotted (color value, see legend). Stranded fragments are in the same orientation as genes. Unstranded fragments are the total number of fragments, both in the same orientation as genes and in the opposite orientation as genes.

Forward genetic isolation of slow growing mutants

While the final library was under construction, E. histolytica was stably transfected with the pilot library. To determine if the RNAi library could be used to identify genes that contribute to a phenotype, we screened for mutants with abnormal growth. To first ask how many different plasmids were present per stably transfected amoeba, clonal lines were obtained using limiting dilution. Sanger sequencing analysis of clonal lines was consistent with only one unique plasmid (i.e., one unique insert) per transfected amoeba (Fig. S6).

To screen for growth phenotypes, growth rates were measured for a selected number of clonal lines. Slow Growth mutants 1 (SG1) and 2 (SG2) were isolated first and chosen for further characterization. These mutants had inserts corresponding to EHI_194310 and EHI_007010, which are annotated as a putative amino acid transporter and a hypothetical protein, respectively (Table 2). These mutants had a statistically significant growth defect (Fig. 5a). We have previously knocked down superoxide dismutase (EHI_159160) and determined that this results in a growth defect; thus, superoxide dismutase RNAi was used as a control. RT-PCR analysis showed that EHI_194310 and EHI_007010 were knocked down in the corresponding mutants (Fig. 5b).

View this table:

Table 2. Identified slow growth mutants.

Summary of growth phenotypes of SG1 – SG12 slow growth mutants, and analysis of the sequences present in each corresponding library plasmid. Growth values represent cell numbers normalized to vector control transfectants on day 4 of growth assays. The corresponding raw data growth assay data are found in Fig. 5a and Fig. S5a, c, e). Mutants exhibited statistically significant growth defects relative to vector control transfectants; p-values from day 4 are shown. The insert sizes are shown in bp. Insert directionality relative to the corresponding coding sequence in the annotated genome is shown. Insert frame relative to the trigger sequence in the pTriggerAdaptor is shown. The accession numbers for coding sequences that inserts correspond to are indicated, with available annotation information and GO terms for each gene (NA, not applicable). Predicted roles in metabolic pathways are shown; genes that are predicted to function in more than one metabolic pathway are indicated as “several.” These predictions are based on data from KEGG Metabolic Pathways (https://www.kegg.jp), the MetaCyc Metabolic Pathway Database [53], and Chemical Entities of Biological Interest (ChEBI) database [54].

Figure 5. Identification of slow growth mutants, with phenotypes that are reproducible in independently-generated knockdown mutants.

a, Growth analysis of clonal knockdown mutant lines obtained from screening the heterogeneous library transfectants for growth defects. Amoebae transfected with pTriggerAdaptor (vector control) were used as a negative control, and superoxide dismutase (EHI_159160) knockdown mutants were used as a positive control for slow growth. The library mutants SG1 and SG2 each exhibited a significant growth defect when compared to vector control transfectants. b, RT-PCR analysis demonstrates that the genes corresponding to the plasmid inserts in SG1 and SG2 (EHI_194310 and EHI_007010, respectively), were knocked down. c, Knockdown mutants were independently generated by using fragments of these genes that differed from the fragments in the library plasmids. These independent mutants are referred to as SG1i and SG2i. Growth analysis was performed using heterogeneous stable transfectants, and growth mutants exhibited a significant growth defect when compared to vector control. d, RT-PCR analysis demonstrates that EHI_194310 and EHI_007010 were knocked down. e, Comparison of growth phenotypes in SG1, SG2, SG1i and SG2i. Values were normalized to vector control transfectants. Mutants exhibited statistically significant growth defects relative to vector control transfectants.

To determine if knockdown mutant phenotypes were reproducible, Gibson cloning was used to generate independent knockdown plasmids that contained a fragment of either EHI_194310 or EHI_007010, which differed from the fragment in the library mutants. E. histolytica was stably transfected with each of these new plasmids to create the independent knockdown mutants, SG1i and SG2i. Growth of heterogeneous, stable transfectants was measured, and these mutants had statistically significant growth defects (Fig. 5c). RT-PCR analysis demonstrated that the corresponding target genes were knocked down (Fig. 5d). The library mutants and independently generated mutants each exhibited statistically significant growth defects (Fig. 5e). Thus, the phenotypes seen in library mutants were reproduced in independently generated mutants.

Bioinformatic analysis of slow growing mutants

From the pilot RNAi library transfection, ~64 additional clonal lines were obtained and 10 additional clonal lines with statistically significant growth defects were identified (Fig. S7; Table 2). The insert present in each clonal line was identified by Sanger sequencing, and the results were again consistent with one unique plasmid per transfected amoeba. Inserts ranged in size from 322 to 555 bp. In previous studies that used pTrigger for gene silencing, the gene of interest for knockdown was in the forward orientation and in frame with the trigger [35, 37–40, 52]. In the slow growth mutants, inserts were in the forward or reverse orientation, and in and out of frame (Table 2). Inserts that were in the reverse orientation and out of frame, like SG1, nonetheless led to measurable knockdown (Fig. 5b). Both the directionality of the gene of interest, and the frame, are therefore dispensable for knockdown.

Nine of the twelve slow growth mutants contained an insert that mapped to a gene (Table 2). The three mutants with inserts that did not clearly map to an annotated gene (SG3, SG9 and SG12) did map to potentially unannotated ORFs, and also partially mapped (i.e., contained stretches of homology) to annotated genes. BLAST was used to search for homologs of the ORFs that mapped to SG3, SG9, and SG12, and homologs were found in E. dispar for each ORF (Expect values of 4e -31, 1e -10, 1e -9, respectively), suggesting that these ORFs are genes that were missed in the annotation of the E. histolytica genome. The E. histolytica genome contains a very large number of gene families [15, 16], evidencing a high level of functional redundancy. One mutant, SG5, contained an insert that mapped to more than one gene (Table 2). Due to the level of identity between the multiple hits, multiple related genes may be knocked down in this mutant.

Several mutants contained inserts that mapped to genes with annotated functions consistent with roles necessary for cellular growth, such as an amino acid transporter, a major facilitator transporter, and a splicing factor (Table 2). Three mutants contained inserts with predicted roles in metabolic pathways, based on data from KEGG Metabolic Pathways (https://www.kegg.jp), the MetaCyc Metabolic Pathway Database [53], and Chemical Entities of Biological Interest (ChEBI) database [54] (Table 2). Several mutants contained inserts that mapped to genes annotated as hypothetical, or that lacked annotated domains (Table 2). Thus, in addition to validating the RNAi library approach, the 12 slow growth mutants that were uncovered shed light on genes that are required for E. histolytica growth and emphasize how forward genetics can be used to assign functional information to hypothetical proteins.

Discussion

We constructed and validated the first genome-wide E. histolytica RNAi knockdown mutant library. We performed a screen for slow growth mutants and identified 12 new mutants, providing a proof-of-concept that demonstrates the library can be used to identify mutant phenotypes. Independent knockdown mutants phenocopied the growth defects seen in the library mutants, showing that library mutant phenotypes are reproducible. Thus, our studies both identify genes that are critical for growth, and establish RNAi-based forward genetics in E. histolytica for the first time. The library represents a valuable new tool for E. histolytica, that can now be applied towards the study of many processes.

Slow growth mutants were identified among clonal knockdown mutant lines. 12 mutants were identified among ~80 total clonal lines. Among the identified genes targeted in these mutants, many genes had annotated functions consistent with roles in cellular growth, and/or had predicted roles in metabolic pathways, based on available metabolic pathway databases (https://www.kegg.jp/) [53, 54]. No genes were identified for which a role in cellular growth would be implausible. Several of the identified genes were annotated as hypothetical, or lacked annotated domains. All of the gene hits in our screen had previous evidence of gene expression in RNAseq analyses [55, 56]. Thus, the slow growth phenotype begins to provide functional information to genes for which existing annotation does not inform understanding of protein function. All of the annotated gene hits in our screen had homologues in the related human-infectious Entamoeba species E. dispar, E. nuttali and E. moshkovskii. Thus, this type of screen may be informative in identifying potential drug targets that are conserved among human-infectious Entamoeba species.

The E. histolytica genome contains numerous gene families [15, 16]. An advantage of RNAi vs. alternative approaches (e.g., gene knockouts) is the potential to simultaneously interfere with multiple gene family members. RNAi is similarly advantageous for genomes that are polyploid, which is also relevant to the potentially tetraploid [14] E. histolytica genome. We identified one mutant that contained an insert that mapped to more than one gene. With the high levels of sequence identity, it is likely that multiple genes are knocked down in this mutant. We were able to identify a slow growth phenotype in this mutant, which suggests that the RNAi library may be effective in identifying mutant phenotypes even in the context of high levels of functional redundancy.

We have used pTrigger to knockdown many genes in our laboratory, and, in general, have been able to recover knockdown mutants for essentially every targeted gene. Compared to clonal lines, heterogeneous transfectants have typically had more variable levels of knockdown, and sometimes appear to have lower levels of knockdown (unpublished data). Among clonal lines obtained from heterogeneous transfectants, we have seen wide variation, including clones with high levels of knockdown and clones that lack knockdown. The SG1 and SG2 clonal lines initially obtained from the library appeared to have an equal or higher level of knockdown than the heterogeneous independent SG1i and SG2i mutants that targeted the same gene. The heterogeneous independent knockdown mutant SG1i had a less severe growth phenotype than the clonal library mutant SG1, which may be due to variation in knockdown in the heterogeneous population. Loss of gene silencing (and loss of a growth phenotype) over time was noted in the heterogeneous SG2i mutants (data not shown), consistent with the selective disadvantage of the growth phenotype. Related to this point, when a selection is applied to a library mutant population, it will be important to compare selected and non-selected controls, in order to differentiate mutants that are lost due to a growth disadvantage vs. mutants that are lost due to selection.

The library was designed to facilitate Illumina deep sequencing analysis in order to quantitatively identify mutants that are enriched or lost after selection. We constructed pilot and final versions of the library. Multiple lines of evidence supported that there was no bias in library construction, including nucleotide and dinucleotide frequencies, and size comparisons between the gDNA fragments and fragments in the final plasmids. Libraries were prepared in batches that were cloned on different days. Similar read counts per fragment were obtained from batches, again supporting that there was no bias in cloning or sequencing analysis. The majority of the genome (97.24%) was covered in the final plasmid library, and the majority of genes were represented (8307 out of 8333 annotated genes). 26 annotated genes are not represented. Of these, 22 genes were not seen in any sample, including gDNA, suggesting that they represent errors in annotation. Consistent with this, there was no published evidence for gene expression for the majority of these genes. The 4 remaining genes that were not seen in the final library had some of the lowest sequencing coverage in the fragmented gDNA sample, suggesting that they were missed due to lack of abundance in the fragmented gDNA that was used for library construction. Thus, with 8307 out of 8311 correctly annotated genes represented, and 97.24% of the genome, we concluded that the library represented full genome coverage.

A future variation on this approach would be to make a library using fragmented cDNA instead of gDNA. We reasoned that since ~50% of the genome is coding, gDNA is a reasonable approach, since ~50% of the plasmids in the library would lead to gene silencing. A limitation of a gDNA library is that some sequences such as rRNA and tRNA, which are abundant in the genome, are represented in the library, and are not useful for identification of mutant phenotypes. A cDNA library would remove these sequences, but would carry other limitations. The most highly expressed genes would be overrepresented in a cDNA library, and genes that are not expressed in the conditions in which cDNA was obtained would not be represented in the library. The lower abundance of material for fragmentation and cloning is also a consideration. mRNA is a very small proportion of total RNA, and in our initial attempts to prepare cDNA for library construction, insufficient material was recovered. With more initial total RNA, a cDNA library could be constructed in the future, and would be complementary to the gDNA library that was constructed here.

We asked whether the frame or the directionality of the insert was critical for knockdown using the trigger plasmid. In previous studies, the gene of interest for knockdown was in the forward orientation and in frame with the trigger [35, 37–40, 52]; thus, the frame and orientation requirements had not been tested experimentally. Among the slow growth mutants that we identified, inserts were in the forward or reverse orientation, and in and out of frame. Note that the adaptor sequence that was inserted between the trigger and the insert does not change the frame. Inserts that were in the reverse orientation and out of frame still led to knockdown. Both the directionality of the gene of interest, and the frame, are therefore dispensable for knockdown. Since the RNAi pathway in E. histolytica is unusual and mechanistic details are still coming into focus, our results add to understanding of this pathway. The lack of orientation and frame requirements for knockdown is useful, since gDNA fragments are random and thus are ligated into the library plasmid in either orientation and in any frame. Previous studies had defined that a gene fragment ≥ 500 bp is sufficient for knockdown [35, 52]. We found that fragments as small as 322 bp were sufficient to result in slow growth phenotypes, suggesting that efficient knockdown is possible with smaller fragments.

E. histolytica is dramatically understudied relative to other parasites, even though it heavily impacts human health, causing significant morbidity and mortality. The RNAi library that we developed dramatically improves the tractability of E. histolytica. In other parasites, RNAi screens have uncovered genes relevant to molecular and cellular biology, virulence and therapeutics. We look forward to the new information about E. histolytica that can be gleaned by applying RNAi screens in the future. Since the design of this RNAi library allows for quantitative identification of mutants that are enriched or depleted after selection, we expect that this library can be used to identify genes involved in a wide variety of processes, including essential genes that are relevant to the development of new therapies.

Materials and Methods

Cell culture

E. histolytica HM1:IMSS (ATCC 30459) trophozoites were cultured in TYI-S-33 media, supplemented with 15% heat inactivated Adult Bovine Serum (Gemini Bio Products), 80 units/ml penicillin and streptomycin (Gibco), and 2.3% Diamond Vitamin solution 80 Tween 40x (Sigma Aldrich), at 35°C. Amoebae were harvested when flasks were approximately 80% confluent.

gDNA Extraction

E. histolytica gDNA was harvested using the Quick-DNA miniprep kit (Zymo), cetyl trimethylammonium bromide (CTAB) extraction, or the Wizard gDNA extraction kit (Promega). The Quick-DNA miniprep kit (Zymo) was used according to manufacturer’s instructions for cell suspension samples. For CTAB extraction, cells were washed and then lysed at 55°C for 1 hour in a solution of 0.1 M EDTA with 0.25% SDS and 0.1 mg/ml proteinase K. NaCl was added to a final concentration of 0.7 M, pre-warmed CTAB was added to a final concentration of 1%, and samples were incubated for 20 minutes at 65°C. An equal volume of chloroform was added, and samples were centrifuged at 20,600 x g for 10 minutes. Supernatants were transferred into fresh tubes, an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1) was added, and samples were centrifuged at 20,600 x g for 10 minutes. Supernatants were transferred into fresh tubes, DNA was precipitated with 2.5 volumes of 100% ethanol at room temperature for 5 minutes, and samples were centrifuged at 20,600 x g for 10 minutes. DNA pellets were washed in 70% ethanol. DNA was left to dry at room temperature and dissolved in water. For DNA harvested using the Quick-DNA miniprep kit or CTAB extraction, RNAseIF (NEB) was used to remove RNA contamination, according to manufacturer’s instructions. The Wizard Genomic DNA Purification kit (Promega) was used according to the manufacturer’s instructions for the tissue culture cells protocol. Because DNA harvested using the Wizard Genomic DNA Purification kit was of higher quality, gDNA for plasmid library construction was prepared using this kit.

Plasmids and cloning

The trigger plasmid [35] contains 132 base pairs of the trigger gene (EHI_048600) in the pKT3M expression plasmid backbone [57]. Q5 Site Directed Mutagenesis (NEB) was used to insert a SmaI restriction enzyme site after the trigger for cloning purposes, to create “pTrigger”. To create knockdown plasmids to target individual genes, pTrigger was linearized using SmaI (NEB) and dephosphorylated using rSAP (NEB). A fragment of the gene of interest was amplified from gDNA using PCR primers that contained homology to pTrigger, and the PCR product was then cloned into the backbone using the Gibson Assembly Ultra kit (Synthetic Genomics Inc.), following the manufacturer’s instructions. See Table S4 for the sequences of primers that were used during plasmid construction.

To test if the Illumina adaptor sequences interfere with knockdown, pTrigger plasmids were created with and without adaptors. First, a pTrigger plasmid targeting a BAR/SH3 domain-containing protein was created by inserting 812 bp of EHI_091500 (147 – 958). This plasmid was further modified using Q5 Site Directed Mutagenesis (NEB) to add the custom adaptors CACTCTTTCCCTACACGACGCTCTTCCGATCT and GATCGGAAGAGCACACGTCTGAACTCCAGTCAC, which are compatible with Illumina TruSeq sequencing primers, upstream and downstream of the insert, respectively. After determining that the presence of the Illumina adaptors did not interfere with knockdown, this construct was modified to remove the BAR/SH3 domain-containing protein and replace it with a SmaI restriction site for further cloning purposes, to create pTriggerAdaptor. pTriggerAdaptor was used as the backbone for plasmid library generation.

Superoxide dismutase knockdown mutants were generated by inserting 597 bp of EHI_159160 (1 – 597) into pTrigger. Independent knockdown mutants (SG1i) of SG1 were generated by inserting 525 bp of EHI_194310 (32 – 556) into pTrigger. Independent knockdown mutants (SG2i) of SG2 were generated by inserting 487 bp of EHI_007010 (59 – 545) into pTrigger.

Plasmid library construction

E. histolytica gDNA was harvested using the Wizard Genomic DNA Purification kit (Promega), following the manufacturer’s instructions for the tissue culture cells protocol. RNAse treatment is included in this protocol. Quality and size of gDNA was verified by Qubit and gel electrophoresis. gDNA was further purified and concentrated using TailorMag Purification beads (SeqMatic), and then sheared using a Covaris M220 sonicator. For size selection, sheared gDNA was loaded into three lanes on a 6% TBE gel. The gel was stained using SYBR Safe DNA Gel Stain (Invitrogen), and three bands, covering from ~400 – ~600 bp (for the pilot library) and ~500 – ~900 bp (for the final library) were extracted from the gel using the TailorCut Gel Cutter and Gel Breaker system (SeqMatic). gDNA was purified using TailorMag Purification beads. The sizing and concentration of the purified material was determined by TapeStation (Agilent). Purified gDNA corresponding to the second and third gel slices from each lane was combined. gDNA was further purified and concentrated using TailorMag Purification beads, and then processed using a TruSeq XT library prep kit (Illumina). All steps followed the manufacturer’s protocol except: 1. Ligation time was increased to 180 minutes, 2. Custom adaptors were used instead of Illumina TruSeq adaptors. The adaptors that were ligated had the following sequences: Adaptor 1: ACACTCTTTCCCTACACGACGCTCTTCCGATCCCC*T and Adaptor 2: /5’Phos/GGGGATCGGGAGAGCACACGTCTGAACTCTAGTCAC. After adaptor ligation, the gDNA fragments were amplified using 5 rounds of PCR with KAPA HiFi 2X master mix (KAPA Biosystems). See Table S4 for primer sequences. After PCR, fragmented gDNA was purified and concentrated using TailorMag Purification beads.

pTriggerAdaptor was linearized using SmaI (NEB) and dephosphorylated using rSAP (NEB). Linearization was verified using gel electrophoresis. 25 ng of fragmented gDNA with ligated adaptors was combined with 25 ng of linearized pTriggerAdaptor. The Gibson Assembly Ultra kit (Synthetic Genomics Inc.) was used according to the manufacturer’s instructions. High Efficiency 5-alpha Competent E. coli (NEB) were transformed with 3 μl of the Gibson mix and plated on four LB plates containing ampicillin (Sigma). We determined that the Gibson cloning efficiency with pTriggerAdaptor was ~1/250, which was used to calculate the amount of insert needed to generate a library at a given target level of coverage. With an average insert size of ~550 bp, a minimum of ~36,354 plasmids would be needed to fully cover the genome at 1X coverage. For the pilot library, one E. coli transformation reaction was performed and 4 plates of bacteria were harvested into 200 ml of LB liquid broth containing ampicillin, and incubated at 37°C for 12-18 hours, shaking. For the final library, Gibson cloning, E. coli transformation, and plasmid preparations were performed over four separate days to create four batches of final library plasmids. The final plasmid library represents the combined total of the four batches.

For the final library, 25 E. coli transformation reactions were performed and 100 plates of bacteria were harvested into LB liquid broth containing ampicillin, at a concentration of ~1,000 colonies per 200 ml. Plasmids were harvested using the GenElute HP plasmid maxiprep kit (Sigma), using 200 ml of culture per plasmid purification column, and ethanol precipitated according to the manufacturer’s instructions.

Illumina sequencing of gDNA fragments and plasmid samples

Fragmented gDNA (with ligated custom adaptors; see ‘plasmid library construction’), or plasmid inserts were amplified with KAPA PCR (Kapa Biosystems) using full length TruSeq Illumina primers, as follows: Primer 1: GATCGGAAGAGCACACGTCTGAACTCCAGTCAC[i7]ATCTCGTATGCCGTCTTCTGCTTG and Primer 2: AATGATACGGCGACCACCGAGATCTACAC[i5]ACACTCTTTCCCTACACGACGCTCTTCCGATCT. Three technical replicates were sequenced for each sample. Samples were sequenced on an Illumina MiSeq at either the DNA Technologies and Expression Analysis Core (University of California, Davis), or on an Illumina MiSeq at SeqMatic, LLC. Illumina sequencing data analysis was performed at the Bioinformatics Core (University of California, Davis).

Illumina sequencing analysis

Illumina sequencing data was demultiplexed using barcode sequences added during library preparation with Illumina demultiplexing software. All reads were sequenced as PE 151bp runs. The resulting paired reads were then cleaned using HTStream (https://github.com/ibest/HTStream) to 1) remove reads containing PhiX Control Library sequences, 2) trim Illumina sequencing adaptors, and 3) discard primer dimers and other reads where adaptor trimming was unsuccessful.

Cleaned reads were mapped against the E. histolytica release 46 genome assembly (downloaded from https://amoebadb.org/common/downloads/release-46/EhistolyticaHM1IMSS/fasta/data/AmoebaDB-46_EhistolyticaHM1IMSS_Genome.fasta) using bwa mem v2.0pre1 [58]. Output from bwa was passed to samtools v1.10 [59] for sorting and indexing, and also to extract reads from for visualization of specific loci. Results from mapping and subsequent analysis were visualized and reviewed in Geneious Prime 2020.0.3 (https://www.geneious.com). Identifying unique fragments and counting their frequency was done using a custom Python script. Briefly, the script iterates over read-sorted BAM records. Read pairs that were mapped to the same contig with proper orientation, a minimum of 146 matched bases, and no more than four bases soft clipped from the 5’ end of either read pair were used to identify fragments. Read mapping orientation was used to infer fragment orientation. A counter for each unique fragment was incremented for additional reads supporting the fragment. Reads that were mapped in proper orientation but that did not meet other criteria were compared against the fragment intervals with an incremented count assigned to the highest scoring overlap. As a last step, the script generates summary statistics, an interval table with counts for all samples, and SAF and gff annotation files for each sample.

Output from the Python script was then loaded into R v3.6.1 [60] for further analysis and to generate figures. Briefly, data.table [61] was used to load the full set of intervals, order them by genomic position, and to compute summaries by sample group. Tidyr [62]was used for reformatting data, and most plots were generated with ggplot2 [63]. The vegan package [64] was used for calculating saturation curve values. The GenomicRanges package [65] was used in combination with gene annotation (downloaded from https://amoebadb.org/common/downloads/release-46/EhistolyticaHM1IMSS/gff/data/AmoebaDB-46_EhistolyticaHM1IMSS.gff) to determine and count fragments targeting genes, all reported counts used a minimum overlap of 27bp. Genome coverage plots were generated with karyoploteR [66] and heatmaps were plotted using pheatmap [67]. Scripts and more details about the analysis are available at the GitHub repository associated with this publication: https://github.com/samhunter/Bettadapur-et-al-Entamoeba-histolytica-RNAi-library.

All sequencing data from this project have been deposited in the Sequence Read Archive (SRA) database (https://www.ncbi.nlm.nih.gov/sra), under project number SUB7459667.

Transfection and clonal lines

Library plasmids were transfected into amoebae using Attractene transfection reagent (Qiagen) [12, 68]. 50 ml M199 (Gibco medium M199 with Earle’s salts, L-glutamine, and 2.2 g/liter sodium bicarbonate, without phenol red), was supplemented with 0.045 g L-Cysteine, 0.3 g HEPES, 0.005 g ascorbic acid (all from Sigma) and the pH was adjusted to 6.8. To 42.5 ml of this medium, 7.5 ml of heat inactivated Adult Bovine Serum (Gemini bio-products) was added to make transfection medium. Amoebae were resuspended in transfection medium at a concentration of 2.5×10⁵ cells/ml, and then 1.8 ml of amoebae, 10 μg plasmid DNA (in 200 μl of transfection medium), and 37.5 μl Attractene reagent (Qiagen) were combined into a 2.0 ml screw-cap cryotube. Cryotubes were incubated horizontally at 35°C for 3 hours. Two cryotubes per transfection were then added to a T25 flask containing 66 ml of fresh TYI media. 24 hours later, 6 μg/ml geneticin (Invitrogen) was added. Stable transfectants were selected and maintained with geneticin at 6 μg/ml.

The efficiency of stable transfection was determined by performing limiting dilution. Three independent experiments were performed. Amoebae were transfected using the process described above, except that 24 hours after transfection, amoebae were harvested and resuspended in TYI at a concentration of 2×10⁶ cells/ml. Amoebae were then serially diluted into 96-well plates, in the presence or absence of 6 μg/ml geneticin. Plates were incubated in GasPak EZ pouch (BD Biosciences) and growth was monitored daily. The plates without Geneticin were used to calculate the plating efficiency. In the plates without geneticin, growth was seen in wells corresponding to approximately 2.5 cells on average. In the plates containing geneticin, growth was seen in wells containing 781 to 3125 amoebae. Taken together, the corrected stable transfection efficiency was ~1/1000 amoebae.

For generation of clonal mutant lines for phenotypic characterization, limiting dilution was performed in 96-well plates in a GasPak EZ pouch (BD Biosciences). For identification of mutants with growth defects, clonal lines that visually appeared to have slow growth were selected for further analysis. RNA was extracted using the Direct-zol RNA kit (Zymo). gDNA (which contains exogeneous plasmids) was extracted using the Quick-DNA miniprep kit (Zymo). Sanger sequencing was used to identify the library insert associated with each clonal line. Sequencing was carried out by Genewiz. To identify the gene targeted for knockdown in each clonal line, the insert sequence was used to search the reference genome. Genes with at least 50nt of continuous sequence identity with the insert were considered hits. See Table S4 for sequencing primer information.

Growth analysis

Clonal stable transfectants were resuspended to 8×10³ cells/ml and 1 ml of cells was added to three tubes containing 13 ml fresh TYI with geneticin. Cells were incubated at 35°C. At days 2, 4, and 6, one tube of cells was harvested per day and counted using a TC20 Automatic cell counter (BioRad). Vector control transfected amoebae (containing pTriggerAdaptor) were used as a negative control, and superoxide dismutase heterogeneous knockdown mutants were used as a positive control for slow growth. Two technical replicates were performed per experiment and three independent growth curve experiments were performed for each mutant.

RT-PCR

RT-PCR was used to assess knockdown in transfectants. RNA was extracted using the Direct-zol RNA kit (Zymo). DNAseI (Turbo) treatment and SuperScript Reverse Transcription II (Invitrogen) was performed as per manufacturer instructions to obtain cDNA. RT-PCR was performed on cDNA harvested from transfectants containing the following constructs: BAR/SH3 domain-containing protein with and without Illumina adaptors, Superoxide Dismutase; SG1/SG1 independent knock down; and SG2/SG2 independent knockdown. GAPDH (EHI_187020) RT-PCR was used as a positive control. See Table S4 for RT-PCR primer sequences.

Statistical analysis

GraphPad Prism was used for student’s unpaired t test statistical analysis and normalization (baseline removal). Mean values and standard deviations are shown in the figures, with t test values reported as follows: ns = p > 0.05, * = p < 0.05, ** = p < 0.01, *** = p < 0.001, **** = p < 0.0001.

Author Contributions

A.B. designed and performed all of the experiments, except fragmentation of gDNA. C.G.B. fragmented gDNA and processed the fragments to enable subsequent plasmid library construction. S.S.H. designed the analysis strategy for Illumina sequencing data, and analyzed all Illumina sequencing data. M.L.S. contributed to the overall strategy for constructing, sequencing and analyzing the plasmid library, and oversaw the Illumina analysis approach. K.S.R. conceived of the overall approach and oversaw the design and analysis of the experiments. A.B., S.S.H., and K.S.R. wrote the manuscript.

Competing Interests

The authors declare that they have no competing interests.

Supplementary Figure Legends

Figure S1. Preparation of RNA-free gDNA from E. histolytica.

Gel electrophoresis analysis of gDNA extracted using the Wizard gDNA extraction kit (Promega), Quick-DNA miniprep kit (Zymo), and CTAB extraction. a, gDNA isolated using the Wizard kit had no apparent RNA contamination, and the gDNA appeared to be of high quality. b, gDNA isolated using the Quick-DNA kit or CTAB was contaminated with RNA. RNA contamination was removed by treatment with RNAseIF and subsequent purification. c, Bioanalyzer traces of pilot and final gDNA samples, after fragmentation and size selection.

Figure S2. Fragment Mapping Analysis.

Example view of mapped genomic fragments (“intervals”) represented in the final plasmid library. For comparison, the mapped paired end reads that give rise to each fragment are shown below each fragment. Shown are results from final plasmid batch two, sequencing replicate one, mapped to contig DS571153.

Figure S3. Appropriate sequencing depth and lack of bias during library construction.

a, Dinucleotide frequency analysis of the HM1:IMSS reference genome, the pilot and final gDNA fragments, and the pilot and final plasmid libraries. Stacked bar plots show the percentage of each dinucleotide. The dinucleotide composition of the gDNA fragments and the final plasmid library is the same as the genome. b – c, Rarefaction curves for pilot (panel b) and final (panel c) samples. gDNA fragment samples are shown in grey and plasmid samples are shown in orange. The number of fragments detected with increasing sequence reads shows if sequencing was performed to saturation. Three sequencing replicates are shown for pilot gDNA samples and three sequencing replicates are shown for pilot plasmid samples. Three sequencing replicates are shown for final gDNA samples and three sequencing replicates are shown for each of four batches of final plasmid samples.

Figure S4. Reproducible fragment and read counts from biological and technical replicates.

“Batches” of the final plasmid library were cloned on different days, and represent biological replicates. “Reps” correspond to independently prepared sequencing preparations, and represent technical replicates. a, Box plot of read counts per fragment in plasmid samples, with pilot plasmids in orange and final plasmids in grey. Log₂ read counts are shown; outliers (outside of 2^nd and 98^th percentile) are plotted as points, and the box represents the 25^th, and 75^th percentile with median indicated as a bold line. Most fragments were represented by multiple sequenced reads. b, Stacked bar plots of the total number of fragments detected per sample shows that replicates are similar to each other in the number of fragments identified and that roughly equal numbers of forward and reverse fragments were produced consistently. Forward fragments are in the same orientation as genes that are in the forward orientation in the reference genome. Reverse fragments are in the same orientation as genes that are in the reverse orientation in the reference genome.

Figure S5. Annotated genes that were missed or were least covered by sequencing reads.

Heat map showing the number of unique fragments per gene, for the 50 least covered genes in the pilot and final gDNA fragments, and the pilot and final plasmid libraries. The color intensity indicates Log₂ fragment coverage, with dark blue indicating no coverage. Stranded fragments are in the same orientation as genes. Unstranded fragments are the total number of fragments, both in the same orientation as genes and in the opposite orientation as genes.

Figure S6. Clonal transfectants contain only one library plasmid.

SG1 gDNA was extracted using the Quick-DNA miniprep kit and PCR amplified using the CS5’ and CS3’ primers that sit in the pTriggerAdaptor backbone. Sanger sequencing was performed using the RNAi Sequencing Library F primer. Adaptor 2 starts at the “GGGG” series of nucleotides, and upstream of this is the SG1 (EHI_194310) sequence. Noncompeting peaks in the chromatogram suggest that there is one unique genome fragment present in this mutant.

Figure S7. Identification of slow growing RNAi knockdown mutants.

Growth analysis of mutants SG1 – SG12. a, c, e, Growth analysis demonstrates that selected clonal knockdown mutant lines have significant growth defects compared to vector control transfectants. b, d, f, Growth was normalized and compared to vector control transfectants on each day. Mutants exhibited statistically significant growth defects relative to vector control transfectants.

Supplementary Table Legends

Table S1. Sequencing and fragment count statistics.

Total number of paired reads mapping to the same contig (mappedPairs) and the number of fragments inferred from these mapped pairs for each sample. Reads were mapped with bwa mem, and processed from the resulting SAM formatted file using a custom Python script. Three sequencing replicates were performed for pilot gDNA samples and three sequencing replicates were performed for pilot plasmid samples. Three sequencing replicates were performed for final gDNA samples and three sequencing replicates were performed for each of four batches of final plasmid samples.

Table S2. Expression data for annotated genes that were missing in the final plasmid library.

Available empirical expression data for genes that were missing in the final plasmid library (genes that are represented in blue in Fig. S3). Accession numbers, annotation information, and the number of paralogs in the reference genome are indicated. Presence (+) or absence (−) in gDNA and plasmid datasets is shown. Gene expression from available RNAseq datasets [55, 56] is indicated, as the FPKM + 1 value. The highest value for FPKM + 1 for each gene is shown. The total number of spectra that have been identified in proteomics studies is indicated.

Table S3. Expression data for representative genes present in the final plasmid library.

To provide a frame of reference for the expression data in Table S2, shown are the available empirical expression data for 10 random, representative genes that were present in the final plasmid library. Accession numbers, annotation information, and the number of paralogs in the reference genome are indicated. Presence (+) or absence (−) in gDNA and plasmid datasets is shown. Gene expression from available RNAseq datasets [55, 56] is indicated, as the FPKM + 1 value. The highest value for FPKM + 1 for each gene is shown. The total number of spectra that have been identified in proteomics studies is indicated.

Table S4. Primer sequences.

Sequences of the primers used to generate and sequence plasmids, perform RT-PCR, and create the plasmid RNAi library.

Acknowledgements

We thank Hannah Miller, Rene Suleiman, Scott Dawson and Su-Ju Lin for helpful discussions. We thank Lutz Froenicke for helpful advice on designing the library to enable Illumina sequencing. DNA sequencing was carried out by the DNA Technologies and Expression Analysis Cores at the UC Davis Genome Center, supported by NIH Shared Instrumentation Grant 1S10OD010786-01. A.B. was supported by a fellowship from the UC Davis Training Program in Biomolecular Technology. This work was funded by NIH grant 1R01AI146914 and a Pew Scholarship awarded to K.S.R.

References

1.↵
WHO/PAHO/UNESCO report. A consultation with experts on amoebiasis. Mexico City, Mexico 28-29 January, 1997. Epidemiol Bull, 1997. 18(1): p. 13–4.
OpenUrl PubMed
2.↵
GBD 2015 Mortality and Causes of Death Collaborators. (2016). Global, regional, and national life expectancy, all-cause mortality, and cause-specific mortality for 249 causes of death, 1980-2015: a systematic analysis for the Global Burden of Disease Study 2015, in Lancet (London, England). Elsevier. p. 1459–1544.
3.↵
Gilchrist, C.A., et al., Role of the Gut Microbiota of Children in Diarrhea Due to the Protozoan Parasite Entamoeba histolytica. The Journal of infectious diseases-, 2016. 213(10): p. 1579–1585.
OpenUrl CrossRef PubMed
4.↵
Mondal, D., et al., Entamoeba histolytica-associated diarrheal illness is negatively associated with the growth of preschool children: evidence from a prospective study. Trans R Soc Trop Med Hyg, 2006. 100(11): p. 1032–8.
OpenUrl CrossRef PubMed
5.↵
Kotloff, K.L., et al., Burden and aetiology of diarrhoeal disease in infants and young children in developing countries (the Global Enteric Multicenter Study, GEMS): a prospective, case-control study. The Lancet, 2013. 382(9888): p. 209–222.
OpenUrl CrossRef
6.↵
Gonzales, M.L.M., L.F. Dans, and J. Sio-Aguilar, Antiamoebic drugs for treating amoebic colitis. The Cochrane database of systematic reviews, 2019. 1(1): p. CD006085–CD006085.
OpenUrl
7.↵
Ralston, K.S. and W.A. Petri, Jr. Tissue destruction and invasion by Entamoeba histolytica. Trends Parasitol, 2011. 27(6): p. 254–63.
OpenUrl CrossRef PubMed
8.↵
Ravdin, J.I. and R.L. Guerrant, Role of adherence in cytopathogenic mechanisms of Entamoeba histolytica. Study with mammalian tissue culture cells and human erythrocytes. J Clin Invest, 1981. 68(5): p. 1305–13.
OpenUrl CrossRef PubMed Web of Science
9.↵
Saffer, L.D. and W.A. Petri, Jr.., Role of the galactose lectin of Entamoeba histolytica in adherence-dependent killing of mammalian cells. Infect Immun, 1991. 59(12): p. 4681–3.
OpenUrl Abstract/FREE Full Text
10.↵
Keene, W.E., et al., The major neutral proteinase of Entamoeba histolytica. J Exp Med, 1986. 163(3): p. 536–49.
OpenUrl Abstract/FREE Full Text
11.↵
Lidell, M.E., et al., Entamoeba histolytica cysteine proteases cleave the MUC2 mucin in its C-terminal domain and dissolve the protective colonic mucus gel. Proc Natl Acad Sci U S A, 2006. 103(24): p. 9298–303.
OpenUrl Abstract/FREE Full Text
12.↵
Ralston, K.S., et al., Trogocytosis by Entamoeba histolytica contributes to cell killing and tissue invasion. Nature, 2014. 508(7497): p. 526–30.
OpenUrl CrossRef PubMed
13.↵
Bettadapur, A., H.W. Miller, and K.S. Ralston, Biting off what can be chewed: Trogocytosis in health, infection and disease. Infection and Immunity., 2020. In Press.
14.↵
Willhoeft, U. and E. Tannich, The electrophoretic karyotype of Entamoeba histolytica. Molecular and Biochemical Parasitology, 1999. 99(1).
15.↵
Lorenzi, H.A., et al., New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information. PLOS Neglected Tropical Diseases, 2010. 4(6): p. e716.
OpenUrl
16.↵
Wilson, I.W., et al., Genetic Diversity and Gene Family Expansions in Members of the Genus Entamoeba. Genome biology and evolution, 2019. 11(3): p. 688–705.
OpenUrl
17.↵
Loftus, B., et al., The genome of the protist parasite Entamoeba histolytica. Nature, 2005. 433(7028): p. 865–8.
OpenUrl CrossRef PubMed Web of Science
18.↵
Lorenzi, H.A., et al., New assembly, reannotation and analysis of the Entamoeba histolytica genome reveal new genomic features and protein content information. PLoS neglected tropical diseases, 2010. 4(6): p. e716–e716.
OpenUrl
19.↵
Hamann, L., R. Nickel, and E. Tannich, Transfection and continuous expression of heterologous genes in the protozoan parasite Entamoeba histolytica. Proceedings of the National Academy of Sciences, 1995. 92(19): p. 8975–8979.
OpenUrl Abstract/FREE Full Text
20.↵
Hamann, L., H. Buß, and E. Tannich, Tetracycline-controlled gene expression in Entamoeba histolytica. Molecular and Biochemical Parasitology, 1997. 84(1): p. 83–91.
OpenUrl CrossRef PubMed Web of Science
21.↵
King, A.V., et al., A genome-wide over-expression screen identifies genes involved in phagocytosis in the human protozoan parasite, Entamoeba histolytica. PloS one, 2012. 7(8): p. e43025–e43025.
OpenUrl CrossRef PubMed
22.↵
Koushik, A.B., et al., A genomewide overexpression screen identifies genes involved in the phosphatidylinositol 3-kinase pathway in the human protozoan parasite Entamoeba histolytica. Eukaryotic cell, 2014. 13(3): p. 401–411.
OpenUrl Abstract/FREE Full Text
23.↵
Bracha, R., et al., Antisense inhibition of amoebapore expression in Entamoeba histolytica causes a decrease in amoebic virulence. Mol Microbiol, 1999. 34(3): p. 463–72.
OpenUrl CrossRef PubMed Web of Science
24.↵
Bracha, R., Y. Nuchamowitz, and D. Mirelman, Transcriptional silencing of an amoebapore gene in Entamoeba histolytica: molecular analysis and effect on pathogenicity. Eukaryotic cell, 2003. 2(2): p. 295–305.
OpenUrl Abstract/FREE Full Text
25.↵
Bracha, R., et al., Transcriptional silencing of multiple genes in trophozoites of Entamoeba histolytica. PLoS Pathog, 2006. 2(5): p. e48.
OpenUrl CrossRef PubMed
26.↵
Stock, R.P., et al., Inhibition of gene expression in Entamoeba histolytica with antisense peptide nucleic acid oligomers. Nature Biotechnology, 2001. 19(3): p. 231–234.
OpenUrl CrossRef PubMed Web of Science
27.↵
Kaur, G. and A. Lohia, Inhibition of gene expression with double strand RNA interference in Entamoeba histolytica. Biochemical and Biophysical Research Communications, 2004. 320(4): p. 1118–1122.
OpenUrl CrossRef PubMed
28.↵
Solis, C.F., et al., Use of bacterially expressed dsRNA to downregulate Entamoeba histolytica gene expression. PloS one, 2009. 4(12): p. e8424–e8424.
OpenUrl CrossRef PubMed
29.↵
Linford, A.S., et al., Short hairpin RNA-mediated knockdown of protein expression in Entamoeba histolytica. BMC microbiology, 2009. 9: p. 38–38.
OpenUrl CrossRef PubMed
30.↵
Zhang, H., et al., Small RNAs with 5⍰-Polyphosphate Termini Associate with a Piwi-Related Protein and Regulate Gene Expression in the Single-Celled Eukaryote Entamoeba histolytica. PLOS Pathogens, 2008. 4(11): p. e1000219.
OpenUrl CrossRef PubMed
31.↵
Sijen, T., et al., Secondary siRNAs Result from Unprimed RNA Synthesis and Form a Distinct Class. Science, 2007. 315(5809): p. 244.
OpenUrl Abstract/FREE Full Text
32.↵
Zhang, H., et al., Nucleus-localized antisense small RNAs with 5'-polyphosphate termini regulate long term transcriptional gene silencing in Entamoeba histolytica G3 strain. J Biol Chem, 2011. 286(52): p. 44467–79.
OpenUrl Abstract/FREE Full Text
33.↵
Pompey, J.M., B. Foda, and U. Singh, A Single RNaseIII Domain Protein from Entamoeba histolytica Has dsRNA Cleavage Activity and Can Help Mediate RNAi Gene Silencing in a Heterologous System. PloS one, 2015. 10(7): p. e0133740–e0133740.
OpenUrl CrossRef
34.↵
Zhang, H., et al., Functional Characterization of Entamoeba histolytica Argonaute Proteins Reveals a Repetitive DR-Rich Motif Region That Controls Nuclear Localization. mSphere, 2019. 4(5): p. e00580–19.
OpenUrl
35.↵
Morf, L., et al., Robust gene silencing mediated by antisense small RNAs in the pathogenic protist Entamoeba histolytica. Nucleic acids research, 2013. 41(20): p. 9424–9437.
OpenUrl CrossRef PubMed
36.↵
Foda, B.M. and U. Singh, Dimethylated H3K27 Is a Repressive Epigenetic Histone Mark in the Protist Entamoeba histolytica and Is Significantly Enriched in Genes Silenced via the RNAi Pathway. The Journal of biological chemistry, 2015. 290(34): p. 21114–21130.
OpenUrl Abstract/FREE Full Text
37.↵
Pompey, J.M., L. Morf, and U. Singh, RNAi pathway genes are resistant to small RNA mediated gene silencing in the protozoan parasite Entamoeba histolytica. PLoS One, 2014. 9(9): p. e106477.
OpenUrl CrossRef PubMed
38.
Manna, D., et al., An NAD(+)-dependent novel transcription factor controls stage conversion in Entamoeba. Elife, 2018. 7.
39.
Rastew, E., L. Morf, and U. Singh, Entamoeba histolytica rhomboid protease 1 has a role in migration and motility as validated by two independent genetic approaches. Exp Parasitol, 2015. 154: p. 33–42.
OpenUrl CrossRef PubMed
40.↵
Suresh, S., et al., Development of RNA Interference Trigger-Mediated Gene Silencing in Entamoeba invadens. Infect Immun, 2016. 84(4): p. 964–975.
OpenUrl Abstract/FREE Full Text
41.↵
Morris, J.C., et al., Glycolysis modulates trypanosome glycoprotein expression as revealed by an RNAi library. The EMBO journal, 2002. 21(17): p. 4429–4438.
OpenUrl Abstract/FREE Full Text
42.↵
Drew, M.E., et al., The adenosine analog tubercidin inhibits glycolysis in Trypanosoma brucei as revealed by an RNA interference library. J Biol Chem, 2003. 278(47): p. 46596–600.
OpenUrl Abstract/FREE Full Text
43.↵
Stortz, J.A., et al., Genome-wide and protein kinase-focused RNAi screens reveal conserved and novel damage response pathways in Trypanosoma brucei. PLoS Pathog, 2017. 13(7): p. e1006477.
OpenUrl
44.↵
Glover, L., et al., VEX1 controls the allelic exclusion required for antigenic variation in trypanosomes. Proc Natl Acad Sci U S A, 2016. 113(26): p. 7225–30.
OpenUrl Abstract/FREE Full Text
45.↵
Rico, E., et al., Genome-wide RNAi selection identifies a regulator of transmission stage-enriched gene families and cell-type differentiation in Trypanosoma brucei. PLoS pathogens, 2017. 13(3): p. e1006279–e1006279.
OpenUrl CrossRef
46.↵
Mony, B.M., et al., Genome-wide dissection of the quorum sensing signalling pathway in Trypanosoma brucei. Nature, 2014. 505(7485): p. 681–685.
OpenUrl CrossRef PubMed Web of Science
47.↵
Alsford, S., et al., High-throughput decoding of antitrypanosomal drug efficacy and resistance. Nature, 2012. 482(7384): p. 232–236.
OpenUrl CrossRef PubMed Web of Science
48.
Collett, C.F., et al., Chemogenomic Profiling of Antileishmanial Efficacy and Resistance in the Related Kinetoplastid Parasite Trypanosoma brucei. Antimicrob Agents Chemother, 2019. 63(8).
49.
Trenaman, A., et al., A post-transcriptional respiratome regulon in trypanosomes. Nucleic Acids Res, 2019. 47(13): p. 7063–7077.
OpenUrl
50.↵
Zhang, N., et al., Host-parasite co-metabolic activation of antitrypanosomal aminomethyl-benzoxaboroles. PLoS Pathog, 2018. 14(2): p. e1006850.
OpenUrl CrossRef PubMed
51.↵
Alsford, S., et al., High-throughput phenotyping using parallel sequencing of RNA interference targets in the African trypanosome. Genome research, 2011. 21(6): p. 915–924.
OpenUrl Abstract/FREE Full Text
52.↵
Khalil, M.I., et al., Technical advances in trigger-induced RNA interference gene silencing in the parasite Entamoeba histolytica. Int J Parasitol, 2016. 46(3): p. 205–212.
OpenUrl CrossRef
53.↵
Caspi, R., et al., The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res, 2016. 44(D1): p. D471–80.
OpenUrl CrossRef PubMed
54.↵
Hastings, J., et al., The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res, 2013. 41(Database issue): p. D456–63.
OpenUrl CrossRef PubMed Web of Science
55.↵
Hon, C.C., et al., Quantification of stochastic noise of splicing and polyadenylation in Entamoeba histolytica. Nucleic Acids Res, 2013. 41(3): p. 1936–52.
OpenUrl CrossRef PubMed
56.↵
Shabardina, V., et al., Environmental adaptation of Acanthamoeba castellanii and Entamoeba histolytica at genome level as seen by comparative genomic analysis. Int J Biol Sci, 2018. 14(3): p. 306–320.
OpenUrl
57.↵
Saito-Nakano, Y., et al., Rab5-associated vacuoles play a unique role in phagocytosis of the enteric protozoan parasite Entamoeba histolytica. J Biol Chem, 2004. 279(47): p. 49497–507.
OpenUrl Abstract/FREE Full Text
58.↵
Li, H., Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv e-prints, 2013.
59.↵
Li, H., et al., The Sequence Alignment/Map format and SAMtools. Bioinformatics, 2009. 25(16): p. 2078–9.
OpenUrl CrossRef PubMed Web of Science
60.↵
team, R.c., R: A language and environment for statistical computing. R Foundation for Statistical Computing. 2019: Vienna, Austria.
61.↵
Dowle, M. and A. Srinivasan, data.table: Extension of ‘data.frame’. R package version 1.12.8. 2019.
62.↵
Wickham, H. and L. Henry, tidyr: Tidy Messy Data. R package version 1.0.2. 2020.
63.↵
Wickham, H., ggplot2: Elegant Graphics for Data Analysis. 2016: Springer-Verlag, New York.
64.↵
Oksanen, J., et al., vegan: Community Ecology Package. R package version 2.5-6 2019.
65.↵
Lawrence, M., et al., Software for Computing and Annotating Genomic Ranges. PLOS Computational Biology, 2013. 9(8): p. e1003118.
OpenUrl
66.↵
Gel, B. and E. Serra, karyoploteR: an R/Bioconductor package to plot customizable genomes displaying arbitrary data. Bioinformatics (Oxford, England), 2017. 33(19): p. 3088–3090.
OpenUrl CrossRef PubMed
67.↵
Kolde, R., pheatmap: Pretty Heatmaps. R package version 1.0.12. 2019.
68.↵
Miller, H.W., R.L. Suleiman, and K.S. Ralston, Trogocytosis by Entamoeba histolytica Mediates Acquisition and Display of Human Cell Membrane Proteins and Evasion of Lysis by Human Serum. mBio, 2019. 10(2).

View the discussion thread.

Posted May 29, 2020.

Download PDF

Supplementary Material

Citation Tools

Subject Area

Microbiology

Subject Areas

All Articles

Animal Behavior and Cognition (5195)
Biochemistry (11697)
Bioengineering (8714)
Bioinformatics (29110)
Biophysics (14921)
Cancer Biology (12045)
Cell Biology (17346)
Clinical Trials (138)
Developmental Biology (9404)
Ecology (14133)
Epidemiology (2067)
Evolutionary Biology (18260)
Genetics (12214)
Genomics (16758)
Immunology (11838)
Microbiology (27985)
Molecular Biology (11543)
Neuroscience (60764)
Paleontology (450)
Pathology (1864)
Pharmacology and Toxicology (3224)
Physiology (4934)
Plant Biology (10379)
Scientific Communication and Education (1679)
Synthetic Biology (2876)
Systems Biology (7330)
Zoology (1640)

[1] 1.↵
WHO/PAHO/UNESCO report. A consultation with experts on amoebiasis. Mexico City, Mexico 28-29 January, 1997. Epidemiol Bull, 1997. 18(1): p. 13–4.
OpenUrl PubMed

[2] 2.↵
GBD 2015 Mortality and Causes of Death Collaborators. (2016). Global, regional, and national life expectancy, all-cause mortality, and cause-specific mortality for 249 causes of death, 1980-2015: a systematic analysis for the Global Burden of Disease Study 2015, in Lancet (London, England). Elsevier. p. 1459–1544.

[3] 3.↵
Gilchrist, C.A., et al., Role of the Gut Microbiota of Children in Diarrhea Due to the Protozoan Parasite Entamoeba histolytica. The Journal of infectious diseases-, 2016. 213(10): p. 1579–1585.
OpenUrl CrossRef PubMed

[4] 4.↵
Mondal, D., et al., Entamoeba histolytica-associated diarrheal illness is negatively associated with the growth of preschool children: evidence from a prospective study. Trans R Soc Trop Med Hyg, 2006. 100(11): p. 1032–8.
OpenUrl CrossRef PubMed

[5] 5.↵
Kotloff, K.L., et al., Burden and aetiology of diarrhoeal disease in infants and young children in developing countries (the Global Enteric Multicenter Study, GEMS): a prospective, case-control study. The Lancet, 2013. 382(9888): p. 209–222.
OpenUrl CrossRef

[6] 6.↵
Gonzales, M.L.M., L.F. Dans, and J. Sio-Aguilar, Antiamoebic drugs for treating amoebic colitis. The Cochrane database of systematic reviews, 2019. 1(1): p. CD006085–CD006085.
OpenUrl

[7] 7.↵
Ralston, K.S. and W.A. Petri, Jr. Tissue destruction and invasion by Entamoeba histolytica. Trends Parasitol, 2011. 27(6): p. 254–63.
OpenUrl CrossRef PubMed

[8] 8.↵
Ravdin, J.I. and R.L. Guerrant, Role of adherence in cytopathogenic mechanisms of Entamoeba histolytica. Study with mammalian tissue culture cells and human erythrocytes. J Clin Invest, 1981. 68(5): p. 1305–13.
OpenUrl CrossRef PubMed Web of Science

[9] 9.↵
Saffer, L.D. and W.A. Petri, Jr.., Role of the galactose lectin of Entamoeba histolytica in adherence-dependent killing of mammalian cells. Infect Immun, 1991. 59(12): p. 4681–3.
OpenUrl Abstract/FREE Full Text

[10] 10.↵
Keene, W.E., et al., The major neutral proteinase of Entamoeba histolytica. J Exp Med, 1986. 163(3): p. 536–49.
OpenUrl Abstract/FREE Full Text

[11] 11.↵
Lidell, M.E., et al., Entamoeba histolytica cysteine proteases cleave the MUC2 mucin in its C-terminal domain and dissolve the protective colonic mucus gel. Proc Natl Acad Sci U S A, 2006. 103(24): p. 9298–303.
OpenUrl Abstract/FREE Full Text

[12] 12.↵
Ralston, K.S., et al., Trogocytosis by Entamoeba histolytica contributes to cell killing and tissue invasion. Nature, 2014. 508(7497): p. 526–30.
OpenUrl CrossRef PubMed

[13] 13.↵
Bettadapur, A., H.W. Miller, and K.S. Ralston, Biting off what can be chewed: Trogocytosis in health, infection and disease. Infection and Immunity., 2020. In Press.

[14] 14.↵
Willhoeft, U. and E. Tannich, The electrophoretic karyotype of Entamoeba histolytica. Molecular and Biochemical Parasitology, 1999. 99(1).

[15] 15.↵
Lorenzi, H.A., et al., New Assembly, Reannotation and Analysis of the Entamoeba histolytica Genome Reveal New Genomic Features and Protein Content Information. PLOS Neglected Tropical Diseases, 2010. 4(6): p. e716.
OpenUrl

[16] 16.↵
Wilson, I.W., et al., Genetic Diversity and Gene Family Expansions in Members of the Genus Entamoeba. Genome biology and evolution, 2019. 11(3): p. 688–705.
OpenUrl

[17] 17.↵
Loftus, B., et al., The genome of the protist parasite Entamoeba histolytica. Nature, 2005. 433(7028): p. 865–8.
OpenUrl CrossRef PubMed Web of Science

[18] 18.↵
Lorenzi, H.A., et al., New assembly, reannotation and analysis of the Entamoeba histolytica genome reveal new genomic features and protein content information. PLoS neglected tropical diseases, 2010. 4(6): p. e716–e716.
OpenUrl

[19] 19.↵
Hamann, L., R. Nickel, and E. Tannich, Transfection and continuous expression of heterologous genes in the protozoan parasite Entamoeba histolytica. Proceedings of the National Academy of Sciences, 1995. 92(19): p. 8975–8979.
OpenUrl Abstract/FREE Full Text

[20] 20.↵
Hamann, L., H. Buß, and E. Tannich, Tetracycline-controlled gene expression in Entamoeba histolytica. Molecular and Biochemical Parasitology, 1997. 84(1): p. 83–91.
OpenUrl CrossRef PubMed Web of Science

[21] 21.↵
King, A.V., et al., A genome-wide over-expression screen identifies genes involved in phagocytosis in the human protozoan parasite, Entamoeba histolytica. PloS one, 2012. 7(8): p. e43025–e43025.
OpenUrl CrossRef PubMed

[22] 22.↵
Koushik, A.B., et al., A genomewide overexpression screen identifies genes involved in the phosphatidylinositol 3-kinase pathway in the human protozoan parasite Entamoeba histolytica. Eukaryotic cell, 2014. 13(3): p. 401–411.
OpenUrl Abstract/FREE Full Text

[23] 23.↵
Bracha, R., et al., Antisense inhibition of amoebapore expression in Entamoeba histolytica causes a decrease in amoebic virulence. Mol Microbiol, 1999. 34(3): p. 463–72.
OpenUrl CrossRef PubMed Web of Science

[24] 24.↵
Bracha, R., Y. Nuchamowitz, and D. Mirelman, Transcriptional silencing of an amoebapore gene in Entamoeba histolytica: molecular analysis and effect on pathogenicity. Eukaryotic cell, 2003. 2(2): p. 295–305.
OpenUrl Abstract/FREE Full Text

[25] 25.↵
Bracha, R., et al., Transcriptional silencing of multiple genes in trophozoites of Entamoeba histolytica. PLoS Pathog, 2006. 2(5): p. e48.
OpenUrl CrossRef PubMed

[26] 26.↵
Stock, R.P., et al., Inhibition of gene expression in Entamoeba histolytica with antisense peptide nucleic acid oligomers. Nature Biotechnology, 2001. 19(3): p. 231–234.
OpenUrl CrossRef PubMed Web of Science

[27] 27.↵
Kaur, G. and A. Lohia, Inhibition of gene expression with double strand RNA interference in Entamoeba histolytica. Biochemical and Biophysical Research Communications, 2004. 320(4): p. 1118–1122.
OpenUrl CrossRef PubMed

[28] 28.↵
Solis, C.F., et al., Use of bacterially expressed dsRNA to downregulate Entamoeba histolytica gene expression. PloS one, 2009. 4(12): p. e8424–e8424.
OpenUrl CrossRef PubMed

[29] 29.↵
Linford, A.S., et al., Short hairpin RNA-mediated knockdown of protein expression in Entamoeba histolytica. BMC microbiology, 2009. 9: p. 38–38.
OpenUrl CrossRef PubMed

[30] 30.↵
Zhang, H., et al., Small RNAs with 5⍰-Polyphosphate Termini Associate with a Piwi-Related Protein and Regulate Gene Expression in the Single-Celled Eukaryote Entamoeba histolytica. PLOS Pathogens, 2008. 4(11): p. e1000219.
OpenUrl CrossRef PubMed

[31] 31.↵
Sijen, T., et al., Secondary siRNAs Result from Unprimed RNA Synthesis and Form a Distinct Class. Science, 2007. 315(5809): p. 244.
OpenUrl Abstract/FREE Full Text

[32] 32.↵
Zhang, H., et al., Nucleus-localized antisense small RNAs with 5'-polyphosphate termini regulate long term transcriptional gene silencing in Entamoeba histolytica G3 strain. J Biol Chem, 2011. 286(52): p. 44467–79.
OpenUrl Abstract/FREE Full Text

[33] 33.↵
Pompey, J.M., B. Foda, and U. Singh, A Single RNaseIII Domain Protein from Entamoeba histolytica Has dsRNA Cleavage Activity and Can Help Mediate RNAi Gene Silencing in a Heterologous System. PloS one, 2015. 10(7): p. e0133740–e0133740.
OpenUrl CrossRef

[34] 34.↵
Zhang, H., et al., Functional Characterization of Entamoeba histolytica Argonaute Proteins Reveals a Repetitive DR-Rich Motif Region That Controls Nuclear Localization. mSphere, 2019. 4(5): p. e00580–19.
OpenUrl

[35] 35.↵
Morf, L., et al., Robust gene silencing mediated by antisense small RNAs in the pathogenic protist Entamoeba histolytica. Nucleic acids research, 2013. 41(20): p. 9424–9437.
OpenUrl CrossRef PubMed

[36] 36.↵
Foda, B.M. and U. Singh, Dimethylated H3K27 Is a Repressive Epigenetic Histone Mark in the Protist Entamoeba histolytica and Is Significantly Enriched in Genes Silenced via the RNAi Pathway. The Journal of biological chemistry, 2015. 290(34): p. 21114–21130.
OpenUrl Abstract/FREE Full Text

[37] 37.↵
Pompey, J.M., L. Morf, and U. Singh, RNAi pathway genes are resistant to small RNA mediated gene silencing in the protozoan parasite Entamoeba histolytica. PLoS One, 2014. 9(9): p. e106477.
OpenUrl CrossRef PubMed

[38] 38.
Manna, D., et al., An NAD(+)-dependent novel transcription factor controls stage conversion in Entamoeba. Elife, 2018. 7.

[39] 39.
Rastew, E., L. Morf, and U. Singh, Entamoeba histolytica rhomboid protease 1 has a role in migration and motility as validated by two independent genetic approaches. Exp Parasitol, 2015. 154: p. 33–42.
OpenUrl CrossRef PubMed

[40] 40.↵
Suresh, S., et al., Development of RNA Interference Trigger-Mediated Gene Silencing in Entamoeba invadens. Infect Immun, 2016. 84(4): p. 964–975.
OpenUrl Abstract/FREE Full Text

[41] 41.↵
Morris, J.C., et al., Glycolysis modulates trypanosome glycoprotein expression as revealed by an RNAi library. The EMBO journal, 2002. 21(17): p. 4429–4438.
OpenUrl Abstract/FREE Full Text

[42] 42.↵
Drew, M.E., et al., The adenosine analog tubercidin inhibits glycolysis in Trypanosoma brucei as revealed by an RNA interference library. J Biol Chem, 2003. 278(47): p. 46596–600.
OpenUrl Abstract/FREE Full Text

[43] 43.↵
Stortz, J.A., et al., Genome-wide and protein kinase-focused RNAi screens reveal conserved and novel damage response pathways in Trypanosoma brucei. PLoS Pathog, 2017. 13(7): p. e1006477.
OpenUrl

[44] 44.↵
Glover, L., et al., VEX1 controls the allelic exclusion required for antigenic variation in trypanosomes. Proc Natl Acad Sci U S A, 2016. 113(26): p. 7225–30.
OpenUrl Abstract/FREE Full Text

[45] 45.↵
Rico, E., et al., Genome-wide RNAi selection identifies a regulator of transmission stage-enriched gene families and cell-type differentiation in Trypanosoma brucei. PLoS pathogens, 2017. 13(3): p. e1006279–e1006279.
OpenUrl CrossRef

[46] 46.↵
Mony, B.M., et al., Genome-wide dissection of the quorum sensing signalling pathway in Trypanosoma brucei. Nature, 2014. 505(7485): p. 681–685.
OpenUrl CrossRef PubMed Web of Science

[47] 47.↵
Alsford, S., et al., High-throughput decoding of antitrypanosomal drug efficacy and resistance. Nature, 2012. 482(7384): p. 232–236.
OpenUrl CrossRef PubMed Web of Science

[48] 48.
Collett, C.F., et al., Chemogenomic Profiling of Antileishmanial Efficacy and Resistance in the Related Kinetoplastid Parasite Trypanosoma brucei. Antimicrob Agents Chemother, 2019. 63(8).

[49] 49.
Trenaman, A., et al., A post-transcriptional respiratome regulon in trypanosomes. Nucleic Acids Res, 2019. 47(13): p. 7063–7077.
OpenUrl

[50] 50.↵
Zhang, N., et al., Host-parasite co-metabolic activation of antitrypanosomal aminomethyl-benzoxaboroles. PLoS Pathog, 2018. 14(2): p. e1006850.
OpenUrl CrossRef PubMed

[51] 51.↵
Alsford, S., et al., High-throughput phenotyping using parallel sequencing of RNA interference targets in the African trypanosome. Genome research, 2011. 21(6): p. 915–924.
OpenUrl Abstract/FREE Full Text

[52] 52.↵
Khalil, M.I., et al., Technical advances in trigger-induced RNA interference gene silencing in the parasite Entamoeba histolytica. Int J Parasitol, 2016. 46(3): p. 205–212.
OpenUrl CrossRef

[53] 53.↵
Caspi, R., et al., The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res, 2016. 44(D1): p. D471–80.
OpenUrl CrossRef PubMed

[54] 54.↵
Hastings, J., et al., The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res, 2013. 41(Database issue): p. D456–63.
OpenUrl CrossRef PubMed Web of Science

[55] 55.↵
Hon, C.C., et al., Quantification of stochastic noise of splicing and polyadenylation in Entamoeba histolytica. Nucleic Acids Res, 2013. 41(3): p. 1936–52.
OpenUrl CrossRef PubMed

[56] 56.↵
Shabardina, V., et al., Environmental adaptation of Acanthamoeba castellanii and Entamoeba histolytica at genome level as seen by comparative genomic analysis. Int J Biol Sci, 2018. 14(3): p. 306–320.
OpenUrl

[57] 57.↵
Saito-Nakano, Y., et al., Rab5-associated vacuoles play a unique role in phagocytosis of the enteric protozoan parasite Entamoeba histolytica. J Biol Chem, 2004. 279(47): p. 49497–507.
OpenUrl Abstract/FREE Full Text

[58] 58.↵
Li, H., Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv e-prints, 2013.

[59] 59.↵
Li, H., et al., The Sequence Alignment/Map format and SAMtools. Bioinformatics, 2009. 25(16): p. 2078–9.
OpenUrl CrossRef PubMed Web of Science

[60] 60.↵
team, R.c., R: A language and environment for statistical computing. R Foundation for Statistical Computing. 2019: Vienna, Austria.

[61] 61.↵
Dowle, M. and A. Srinivasan, data.table: Extension of ‘data.frame’. R package version 1.12.8. 2019.

[62] 62.↵
Wickham, H. and L. Henry, tidyr: Tidy Messy Data. R package version 1.0.2. 2020.

[63] 63.↵
Wickham, H., ggplot2: Elegant Graphics for Data Analysis. 2016: Springer-Verlag, New York.

[64] 64.↵
Oksanen, J., et al., vegan: Community Ecology Package. R package version 2.5-6 2019.

[65] 65.↵
Lawrence, M., et al., Software for Computing and Annotating Genomic Ranges. PLOS Computational Biology, 2013. 9(8): p. e1003118.
OpenUrl

[66] 66.↵
Gel, B. and E. Serra, karyoploteR: an R/Bioconductor package to plot customizable genomes displaying arbitrary data. Bioinformatics (Oxford, England), 2017. 33(19): p. 3088–3090.
OpenUrl CrossRef PubMed

[67] 67.↵
Kolde, R., pheatmap: Pretty Heatmaps. R package version 1.0.12. 2019.

[68] 68.↵
Miller, H.W., R.L. Suleiman, and K.S. Ralston, Trogocytosis by Entamoeba histolytica Mediates Acquisition and Display of Human Cell Membrane Proteins and Evasion of Lysis by Human Serum. mBio, 2019. 10(2).