Genome-scale functional profiling of cell cycle controls in African trypanosomes

Trypanosomatids, which include major pathogens of humans and livestock, are divergent eukaryotes for which cell cycle controls and the underlying mechanisms are not completely understood. Here, we describe a genome-wide RNA-interference library screen for cell cycle regulators in bloodstream form Trypanosoma brucei. We induced massive parallel knockdown and sorted the perturbed population into cell cycle stages using flow cytometry. RNAi-targets were deep-sequenced from each stage and cell cycle profiles were digitally reconstructed at a genomic scale. We identify hundreds of proteins that impact cell cycle progression; glycolytic enzymes required for G1S progression, DNA replication factors, mitosis regulators, proteasome and kinetochore complex components required for G2M progression, flagellar and cytoskeletal components required for cytokinesis, mRNA-binding factors, protein kinases and many previously uncharacterised proteins. The outputs facilitate functional annotation and drug-target prioritisation and provide comprehensive functional genomic evidence for the machineries, pathways and regulators that coordinate progression through the trypanosome cell cycle. The data can be searched and browsed using an interactive, open access, online data visualization tool (https://tryp-cycle.onrender.com).


The canonical eukaryotic cell cycle encompasses discrete phases: G1 (gap 1), when the cell prepares for DNA replication; S (synthesis) phase, when nuclear DNA replication takes place; G2 (gap 2), when the cell prepares for mitosis; and M (mitosis), when the replicated DNA is segregated and the nucleus divides (1). Mitosis is immediately followed by cytokinesis (cell division), generating two daughter cells (2). Anomalies occurring during cell cycle progression can result in cell cycle arrest, to allow the cell to resolve the anomaly; in cell death, if the anomaly cannot be resolved; or, among other outcomes, in carcinogenesis.

The schematic illustrates the RIT-seq screen; massive parallel induction of RNAi followed by flow cytometry and RIT-seq, allowing for reconstruction of cell cycle profiles, using mapped reads from each knockdown. Each read-mapping profile encompasses the gene of interest and associated untranslated regions present in the cognate mRNA. The library data represent the uninduced and unsorted population. GeneIDs, Tb927.7.3160 for example, are shown without the common 'Tb927.' component.
<2C and >4C pools, less than one million cells were collected; these pools were retained in their entirety for RIT-seq analysis.

The RIT-seq digital data for individual genes following knockdown provided a measure of abundance in each pool and were, therefore, used to digitally reconstruct cell cycle profiles for individual gene knockdowns (Figure 1B). We expected to observe accumulation of particular knockdowns in specific cell cycle phase pools, thereby reflecting specific defects. This was indeed the case, and some examples are shown to illustrate; no major defect, S phase overrepresented or >4C overrepresented, following knockdown (Figure 1B). These outputs suggest that loss of the cytoplasmic dynein heavy chain (Tb927.7.3160) does not perturb cell cycle distribution; that a putative DNA helicase (Tb927.11.12600) is required for the completion of S phase; and that knockdown of the axonemal dynein heavy chain (Tb927.11.11220) results in endoreduplication in the absence of cytokinesis. Dyneins are cytoskeletal motor proteins that move along microtubules, to produce a flagellar beat, for example (29).
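The digital reconstruction described above can be illustrated with a minimal sketch. It assumes that read counts for a knockdown are first scaled by the sequencing depth of each sorted pool and then expressed as a share across the five pools; the exact normalisation used in the study is not stated in this section, and the function name, pool labels and counts below are hypothetical.

```python
# Hypothetical sketch: turn per-pool read counts for one knockdown into a
# relative cell cycle profile. The normalisation scheme is an assumption.
POOLS = ("<2C", "G1", "S", "G2M", ">4C")


def reconstruct_profile(gene_reads, pool_totals):
    """Return the share of a knockdown's depth-normalised reads in each pool."""
    # Reads per million mapped reads, so pools sequenced to different depths
    # (or collected in different cell numbers) become comparable.
    rpm = {p: 1e6 * gene_reads[p] / pool_totals[p] for p in POOLS}
    total = sum(rpm.values())
    if total == 0:
        # No reads for this knockdown in any pool.
        return {p: 0.0 for p in POOLS}
    return {p: rpm[p] / total for p in POOLS}


# Invented counts for a knockdown that accumulates in S phase.
example_reads = {"<2C": 40, "G1": 300, "S": 900, "G2M": 350, ">4C": 30}
example_totals = {
    "<2C": 2_000_000,
    "G1": 10_000_000,
    "S": 10_000_000,
    "G2M": 10_000_000,
    ">4C": 1_500_000,
}
profile = reconstruct_profile(example_reads, example_totals)
```

In this invented example the S phase pool carries the largest share of the normalised reads, which is the kind of pattern the text describes for the putative DNA helicase knockdown.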

Next, we turned our attention to knockdowns reporting an overrepresentation of G1, S phase or G2M cells. The pools of knockdowns that registered >25% overrepresented read counts in each of these categories are highlighted in Figure 2C

Overall, the five components of the screen yielded 1,158 genes that registered a cell cycle defect, based on the thresholds applied above. This is 16.1% of the 7,204 genes analysed, and the distribution of these genes among the five arms of the screen is shown in the Venn diagram in Figure 2D. Since we predicted that knockdowns associated with a cell cycle defect were more likely to also register a growth defect, we compared these datasets to prior RIT-seq fitness profiling data (23). All groups of genes that registered cell cycle defects, except for the <2C set, were significantly enriched for genes that previously registered a loss-of-fitness phenotype following knockdown in bloodstream form cells (χ² test; <2C, p = 0.15; G1, p = 0.015; S, p = 4.7 × 10⁻⁴; G2M, p = 3.5 × 10⁻²⁴; >4C, p = 4.4 × 10⁻¹⁹⁹). This is consistent with loss-of-fitness as a common outcome following a cell cycle progression defect. Taken together, the analyses above provided validation for the RIT-seq based cell cycle phenotyping approach and yielded >1,000 candidate genes that impact specific steps during T. brucei cell cycle progression.

Figure 2. (A) The plot on the left shows knockdowns overrepresented in the >4C experiment in red; those with >1.5-fold the sum of reads in the G1, S phase and G2M samples combined. The read-mapping profile and read-counts for α/β-tubulin are shown to the right. (B) The plot on the left shows knockdowns overrepresented in the sub-2C experiment in orange; those with >1.5-fold the sum of reads in the G1, S phase and G2M samples combined. The read-mapping profile and read-counts for DOT1A are shown to the right. (C) The plots on the left show knockdowns overrepresented in the G1, S phase and G2M experiments in purple, green and blue, respectively; those that were >25% overrepresented in each category. Read-mapping profiles and relative read-counts for example hits are shown to the right. PCNA, proliferating cell nuclear antigen; PPL2, PrimPol-like 2. (D) The Venn diagram shows the distribution of knockdowns overrepresented in each arm of the screen.
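As a small worked example, the sketch below encodes the >1.5-fold rule stated in the figure legend for the >4C and sub-2C pools and checks the reported hit percentage. It assumes the read counts passed in are already normalised and comparable between pools; the function name and the example counts are hypothetical, and the >25% rule for the G1, S phase and G2M pools is not sketched because its exact definition is not given in this section.

```python
# Hypothetical sketch of the fold-threshold rule from the figure legend.
# Assumes the read counts are already normalised between pools.
def exceeds_fold_threshold(pool_reads, g1_reads, s_reads, g2m_reads, fold=1.5):
    """True if a pool holds more than `fold` times the G1 + S + G2M reads combined."""
    return pool_reads > fold * (g1_reads + s_reads + g2m_reads)


# Invented counts: 400 > 1.5 * (100 + 100 + 50) = 375, so this one is flagged.
flagged = exceeds_fold_threshold(400, 100, 100, 50)
not_flagged = exceeds_fold_threshold(300, 100, 100, 50)

# Arithmetic check of the proportion quoted in the text.
hits, analysed = 1158, 7204
hit_percentage = round(100 * hits / analysed, 1)  # 16.1
```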

Consistent with these observations, α/β-tubulin (see Figure 2A) and axonemal dynein heavy  Figure 3C. The

The heat-map in Figure 3D shows the data for all five sorted pools for the cohorts described above and for additional cohorts of knockdowns enriched in the >4C pool; these include radial spoke proteins, extra-axonemal paraflagellar rod (PFR) proteins, as well as nucleoporins. The gallery in Figure 3E shows examples of RIT-seq read-mapping profiles for twenty-four individual genes that register >4C enrichment following knockdown. In addition to the categories above, these include the inner arm dynein 5-1 (37), FAZ proteins

The heat-map in Figure 5D shows the data for all five sorted pools for the cohorts

We next explored some of the cohorts of hits described above in more detail. Glycolytic enzymes are particularly prominent amongst knockdowns that accumulate in G1 and we illustrate the RIT-seq profiling data for these enzymes in Figure 6A. Seven

in stumpy-form cells (67). We conclude that, as in other organisms (68), there is metabolic control of the cell cycle and a nutrient sensitive restriction point in T. brucei, with glycolysis playing a role in the G1 to S phase transition.

Figure 6. Protein complexes and pathways associated with G1, S phase and G2/mitosis defects. (A) The RadViz plot shows glycolytic enzyme knockdowns. Those that registered >25% overrepresented read-counts in the G1 category are indicated in purple. Black data-points indicate other genes from each cohort. Grey data-points indicate all other genes. The read-mapping profiles and relative read-counts in the lower panel show example hits. (B) As in A but for DNA replication initiation factor knockdowns that registered >25% overrepresented read-counts in the S phase category, indicated in green. (C) As in A but for proteasome component knockdowns that registered >25% overrepresented read-counts in the G2M category, indicated in blue. (D) As in A but for kinetochore component knockdowns that registered >25% overrepresented read-counts in the S phase or G2M categories, indicated in green or blue, respectively.
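For readers unfamiliar with the RadViz plots mentioned in the Figure 6 legend, the sketch below shows the generic RadViz projection: each dimension is assigned an anchor on the unit circle and each data-point is placed at the weighted mean of the anchors. It assumes three dimensions (G1, S phase and G2M) with non-negative, already comparable values; how the published plots were scaled or generated is not stated here, and the function name is hypothetical.

```python
import math


# Hypothetical sketch of a generic RadViz projection. Assumes the inputs are
# non-negative and already on a comparable scale across dimensions.
def radviz_point(values):
    """Place one data-point at the weighted mean of evenly spaced unit-circle anchors."""
    n = len(values)
    total = sum(values)
    if total == 0:
        # A point with no signal in any dimension sits at the centre.
        return (0.0, 0.0)
    anchors = [
        (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]
    x = sum(v * ax for v, (ax, _) in zip(values, anchors)) / total
    y = sum(v * ay for v, (_, ay) in zip(values, anchors)) / total
    return (x, y)


# Dimension order assumed here: G1, S phase, G2M.
balanced = radviz_point([1.0, 1.0, 1.0])   # no bias: lands at the centre
g1_biased = radviz_point([1.0, 0.0, 0.0])  # all signal in G1: lands on the G1 anchor
```

A knockdown with an even distribution therefore plots near the centre, while one that accumulates in a single pool is pulled towards that pool's anchor.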

DNA replication initiation factors are particularly prominent amongst knockdowns that accumulate in S phase and we illustrate the RIT-seq profiling data for these factors in Figure 6B. Five knockdowns that register >25% overrepresentation in the S phase pool are components of the eukaryotic replicative helicase, the CMG (Cdc45-MCM-GINS) complex. At the core of this complex is the minichromosome maintenance complex (MCM2-7), a helicase that unwinds the duplex DNA ahead of the moving replication fork (69). Identification of this subset of components suggests that these particular subunits are limiting for progression through S phase.

Proteasome components are particularly prominent amongst knockdowns that accumulate in G2M, and we illustrate the RIT-seq profiling data for this protein complex in

Kinetochore components (17) are also amongst knockdowns that accumulate in G2M and we illustrate the RIT-seq profiling data for this protein complex in Figure 6D. Although Tb927.11.12410) knockdowns registered >25% overrepresentation in the G2M pool, suggesting that these particular kinetochore components, which all display temporal patterns of phosphorylation from S phase to G2M (21), are limiting for progression through mitosis. Notably, KKT10 is a kinase responsible for phosphorylation of KKT7, which is required for the metaphase to anaphase transition (73); as well as for the phosphorylation of KKT1, which is required for kinetochore assembly (74). These findings are consistent with the view that kinetochore components control a non-canonical spindle checkpoint in trypanosomes

Figure 7. (A) The RadViz plot shows mRNA binding protein knockdowns (RBPs). Those that registered >25% overrepresented read-counts in the G1, S phase or G2M categories are indicated, in purple, green and blue, respectively. The read-mapping profiles and relative read-counts in the lower panels show example hits. (B) As in A but for protein kinase knockdowns. (C) As in A but for hypothetical (conserved) protein knockdowns.

G1, S phase and G2M pools revealed many putative mRNA binding proteins (RBPs) and kinases. Indeed, RBPs are significantly enriched amongst knockdowns that registered G1, S phase or G2M cell cycle defects (χ² test, p = 7 × 10⁻⁵). We show the RIT-seq profiling data for seven RBP knockdowns that register >25% overrepresentation in these pools (Figure 7A). These include knockdowns for two components of the translation initiation factor, eIF3,

Finally, we analysed genes encoding proteins annotated as hypothetical (conserved). Despite excellent progress in genome annotation, 35% of the non-redundant gene-set in T. brucei retain this annotation, amounting to >2,500 genes. We show data for more than twenty of these knockdowns above, linked to enriched >4C (Figure 3-

Although the overlap between cell cycle regulated proteins (20) and those knockdowns that registered a cell cycle defect here failed to achieve significance (overlap = 71 of 367, χ² p = 0.09), cell cycle regulated proteins do appear to be required for progression through specific stages of the cell cycle. For example, multiple glycolytic enzymes upregulated in G1 were linked to accumulation in the G1 pool following knockdown (χ² p = 7.9 × 10⁻¹¹). In addition, proteins highly upregulated in G2 and M were linked to accumulation in the G2M (χ² p = 1.8 × 10⁻⁸) or >4C pools (χ² p = 8.9 × 10⁻⁹) following knockdown, including multiple kinetochore and chromosomal passenger complex components, respectively.
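The overlap test quoted above can be sketched as a χ² test on a 2 × 2 contingency table. This is only an illustration under stated assumptions: it takes 367 as the number of cell cycle regulated proteins, 71 as those that also scored as hits, and the 1,158 hits among 7,204 analysed genes as the background, none of which is confirmed here as the set-up used in the study. The function name is hypothetical and the continuity correction is optional.

```python
import math


# Hypothetical sketch: chi-squared test (1 degree of freedom) on a 2x2 table.
# The choice of background set and of Yates' correction are assumptions.
def chi2_2x2(a, b, c, d, yates=True):
    """Return (chi2, p) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction for 2x2 tables.
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


regulated, regulated_hits = 367, 71        # from the text
all_genes, all_hits = 7204, 1158           # assumed background
a = regulated_hits                         # regulated, hit
b = regulated - regulated_hits             # regulated, not a hit
c = all_hits - regulated_hits              # not regulated, hit
d = all_genes - regulated - c              # not regulated, not a hit
chi2_value, p_value = chi2_2x2(a, b, c, d)
```

Under these assumptions the corrected p-value comes out close to the reported 0.09, but that agreement does not confirm how the original test was constructed.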

In terms of specific cell cycle regulated genes/proteins, we focused on those that previously registered a significant loss of fitness following knockdown (23) and that now have a RIT-seq based cell cycle progression functional assignment. Examples include putative RBPs of the DNA polymerase suppressor 1 (PSP1) family, which display mRNA upregulation in G1, protein upregulation in S phase, cell cycle regulated phosphorylation and accumulation in G2M following knockdown (see Figure 5D and Figure 7A). The kinetochore components, KKT1 and KKT7, and also CRK3, all display mRNA upregulation in S phase, protein upregulation in G2 and M, cell cycle regulated phosphorylation and accumulation in G2M following knockdown (see Figure 6D and Figure 7B); KKT10 and CYC6 report a similar profile (see Figure 6D), except for the mRNA regulation component. The cytokinesis initiation factors, CIF1 and CIF2, also display mRNA upregulation in S phase, protein upregulation in G2 and M and cell cycle regulated phosphorylation, but instead accumulation in the >4C pool following knockdown (see Figure 3E). Finally, the chromosomal passenger complex components, CPC1 and AUK1, as well as furrow localized FRW1, report mRNA and protein upregulation in G2M and accumulation in the >4C pool following knockdown (see Figure 3E). Thus, several regulators linked to specific cell cycle progression defects by RIT-seq profiling are themselves regulated.

Despite intense interest and study (13,15), many cell cycle regulators in trypanosomatids remain to be identified and much remains to be learned about cell cycle control and progression in these parasites. DNA staining followed by flow cytometry is a widely used approach for quantifying cellular DNA content and analysing cell cycle

Functional annotation of the trypanosomatid genomes will continue to benefit from novel high-throughput functional analyses, and RNAi-mediated knockdown has proven to be a powerful approach for T. brucei. RIT-seq profiling provides data for almost every gene and, using this approach, we previously described genome-scale loss-of-fitness data (23). Amongst 3117 knockdowns that scored a significant loss-of-fitness in bloodstream-form cells in that screen (42% of all genes analysed) were genes encoding all 18 intraflagellar transport complex subunits (χ² p = 1 × 10⁻⁶), 12 of 13 dynein heavy-chains (χ² p = 4 × 10⁻⁴), all 8 TCP-1 chaperone components (χ² p = 1 × 10⁻³), 27 of 30 nucleoporins (χ² p = 2 × 10⁻⁷), all eleven glycolytic enzymes (χ² p = 2 × 10⁻⁴) and 30 of 31 proteasome subunits (χ² p = 2 × 10⁻⁹). This set also included 18 of 19 kinetochore proteins (χ² p = 6 × 10⁻⁶), only later identified as components of this essential complex (17). With this study, we now link many of these genes and many more to specific cell cycle defects following RNAi knockdown. A large number of flagellar protein knockdowns, in particular, yield cells with excess DNA, suggesting that DNA replication typically continues following failure to complete cytokinesis. We identified a number of pathways and protein complexes that impact cell cycle progression, such as glycolysis (G1/S transition) and the proteasome (likely G2/M transition). We also identify many mRNA binding proteins and protein kinases implicated in control of cell cycle progression. Notably, we link multiple known potential and promising drug targets to cell cycle progression defects, such as glycolytic enzymes (86), the proteasome (87), kinetochore kinases (74,88) and other kinases (89).

Prior cell cycle studies have often focused on trypanosome orthologues of known regulators from other eukaryotes. Since genome-scale profiling is unbiased, it presents the opportunity to uncover divergent as well as novel factors and regulators that impact cell cycle progression. Accordingly, we link many previously uncharacterised and hypothetical proteins of unknown function to specific cell cycle progression defects. Thus, we uncover mechanisms with an ancient origin in a common eukaryotic ancestor and others likely reflecting trypanosomatid-specific biology. We also compared our functional data with cell cycle regulated transcriptome and (phospho)proteome datasets.

The digital dataset provided in Supplementary File 1 facilitates further interrogation and further analysis of the genome-scale cell cycle RIT-seq data. We have also made the data available via an interactive, open access, online data visualization tool, which allows searching and browsing of the data (see Figure 2-

In summary, we report RNAi induced cell cycle defects at a genomic scale and identify the T. brucei genes that underlie these defects. The outputs confirm known roles in cell cycle progression and provide functional annotation for many additional genes, including many with no prior functional assignment and many that are trypanosomatid-specific. As such, the data not only improve our understanding of cell cycle progression in these important and divergent pathogens but should also accelerate further discovery. Taken together, our findings further facilitate genome annotation and drug-target prioritisation, and provide comprehensive genetic evidence for the protein complexes, pathways and regulatory factors that coordinate progression through the trypanosome cell cycle.

The bloodstream form T. brucei RNAi library (22) was thawed in HMI-11 containing 1 µg.ml⁻¹ of blasticidin and 0.2 µg.ml⁻¹ of phleomycin and incubated at 37 °C in 5% CO2. After approximately 48 h, six flasks, each containing 2 × 10⁷ cells in 150 ml of HMI-11 as above, were prepared; 1 µg.ml⁻¹ of tetracycline was added to five of them, while one served as the
