Gene regulatory network analysis predicts cooperating transcription factor regulons required for FLT3-ITD+ AML growth

AML is a heterogeneous disease caused by different mutations. We have previously shown that each mutational subtype develops its own specific gene regulatory network (GRN), in which transcription factors interact with multiple gene modules, many of which are themselves transcription factor genes. Here we hypothesized that highly connected nodes within such networks comprise crucial regulators of AML maintenance. We tested this hypothesis using FLT3-ITD-mutated AML as a model and conducted an shRNA drop-out screen informed by this analysis. We show that AML-specific GRNs predict crucial regulatory modules required for AML but not normal cellular growth. Furthermore, our work shows that all modules are highly connected and regulate each other. Careful multi-omic analysis of the role of one module (RUNX1) by shRNA and chemical inhibition shows that this transcription factor and its target genes stabilize the GRN of FLT3-ITD AML and that its removal leads to GRN collapse and cell death.


Supplementary
TFs that are associated with more than one module are highlighted.  A: Venn diagrams showing integration of three RUNX1 ChIP data-sets from primary FLT3-ITD AML cells and cell lines. B: Pie chart showing the percentage of RUNX1 target genes in RUNX module in merged RUNX1 ChIP-Seq dataset. C: Table of enriched TF motifs in FLT3-ITD AML specific open chromatin regions compared

Primary sample and PBSC processing
Human tissue was obtained with the required ethical approval from the National Health Service (NHS) National Research Ethics Committee. AML and PBSC samples used in this study were either surplus diagnostic samples or fresh samples obtained with specific consent from the subjects. AML samples were obtained from (1) the Haematological Malignancy Diagnostic Service (St James's Hospital, Leeds, UK), (2) the Centre for Clinical Haematology, Queen Elizabeth Hospital Birmingham, Birmingham, UK, or (3) the West Midlands Regional Genetics Laboratory, Birmingham Women's NHS Foundation Trust, Birmingham, UK. Mononuclear cells were purified on the same day they were received, and in most cases were also directly further purified using either CD34 or CD117 (KIT) magnetic antibodies as described 2 . For some samples with >92% blast cells, column purification was not performed. Mobilized PBSCs were provided by NHS Blood & Transplant, Leeds, UK, and NHS Blood & Transplant, Birmingham, UK.

Cell lines
For this study, we used two cell lines containing a FLT3-ITD mutation 3 , MV4-11 (DSMZ, ACC 102) and MOLM-14 (DSMZ, ACC 777). The cells were cultured in RPMI 1640 supplemented with 1% L-glutamine and 20% heat-inactivated FBS. For culture maintenance, cells were split to 0.5 × 10^6 cells/ml every 3 days so as not to exceed 1-2 × 10^6 cells/ml. For the in vitro screen, the medium was also supplemented with 1% penicillin/streptomycin after sorting. HEK293T cells (DSMZ, ACC305) were used to produce lentivirus. These cells were cultured in HEPES-modified DMEM medium supplemented with 10% FBS, 4 mM L-glutamine and 1 mM sodium pyruvate. Cells were split using trypsin every 3 days so as not to exceed a confluency of 70%. All cells were cultured and treated in an incubator at 37 °C with 5% CO2.

FLT3-ITD AML shRNA screen

Vector
The vector used in this study (now named pL40C) is described in detail in 4 . As described, the vector contains ampicillin resistance, a doxycycline-induced cassette that expresses the shRNA together with the fluorochrome dTomato and a constitutively expressed cassette containing the fluorochrome Venus.

shRNA oligo design
The shRNA oligos were designed using the informatic tool (https://felixfadams.shinyapps.io/miRN/) described previously 4,5 . In total, 161 genes were included as targets, and 3 shRNA oligos were designed per gene as described in the text. FLT3 was included as a positive control, together with 10 NTC shRNAs as negative controls. The oligos were ordered from Sigma Aldrich. Each oligo was 67 bp long and was supplied as pre-mixed forward and reverse oligos at a concentration of 100 µM with desalt purification.

shRNA library cloning
The shRNA library was produced following the process described in 4 . Briefly, the oligos were phosphorylated and annealed, and all oligos were then pooled. The vector was opened using the restriction enzyme BsmBI (Thermo, ER0451) following the manufacturer's recommendations. The opened plasmid was separated by running the digestion product on an agarose gel and extracting the DNA from the band using the QIAquick Gel Extraction Kit (Qiagen, 28706) following the manufacturer's instructions. Plasmid and oligos were ligated using the T4 DNA ligase kit (Thermo Fisher, EL0011) at a vector:oligo molar ratio of 1:3. The ligation product was then transformed into and amplified in XL-gold bacteria (Agilent, 200315) according to the manufacturer's protocol. After the transformation, a maxiprep was performed to obtain large amounts of the cloned library, using the EndoFree Plasmid Maxi Kit (Qiagen, 12362) according to the manufacturer's instructions.
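
The 1:3 vector:oligo molar ratio can be converted into masses with the usual length-scaled relation for double-stranded DNA. The following is a minimal sketch, not the authors' calculation: only the 67 bp oligo length and the 1:3 ratio come from the text, whereas the vector mass and vector length are hypothetical placeholders, and the function name is introduced here for illustration.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector_ratio):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    For double-stranded DNA, mass per mole scales with length,
    so moles are proportional to mass / length.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio


# Hypothetical: 100 ng of an 8,000 bp opened vector (neither value is stated in the text).
# From the text: 67 bp annealed oligos and a vector:oligo molar ratio of 1:3.
needed = insert_mass_ng(vector_ng=100.0, vector_bp=8000,
                        insert_bp=67, insert_to_vector_ratio=3.0)
print(f"{needed:.2f} ng of pooled oligo")
```

Because the insert is so much shorter than the vector, the required oligo mass is small even at a threefold molar excess.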

Lentivirus production
The lentivirus was produced following the protocol described in Martinez et al 6 . In summary, HEK293T cells were cultured in 15 cm Petri dishes to a confluence of 50-60%. The library vectors and the packaging and envelope vectors (pMD2.G and pCMVΔR8.91) were mixed with special water (deionized water with 2.5 mM HEPES) and 0.5 M CaCl2 solution. Next, HeBS was added dropwise to the plasmid/special water/CaCl2 mix. After incubation, the solution was added dropwise to the plates to transfect the HEK293T cells. After one day the medium was changed, and after a further 48 h of incubation the supernatant containing the virus was collected, centrifuged and frozen.

Cell transduction
MV4-11 and MOLM-14 cells were transduced following the protocol described in Martinez et al 6 . In summary, cells at 10^6 cells/ml were transduced with the pooled shRNA library lentiviral particles present in the supernatant, using Polybrene at a final concentration of 8 µg/ml. The plate was then centrifuged at 34 °C for 50 min at 900 x g. After centrifugation, cells were incubated for 3 days. The transduction was performed at a low MOI (0.3 TU/cell) to produce a population of cells with one integration event per cell. Following lentiviral transduction, cells expressing the constitutive Venus fluorochrome were purified by fluorescence-activated cell sorting (FACS) on an ARIA II. Most cells that remained after selection carried a single copy of the inducible shRNA. Cells were then used to perform the screen as described.
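
The rationale for the low MOI can be illustrated with a Poisson model of integration events. This is a sketch under the assumption (not stated in the text) that integrations per cell are independent and Poisson-distributed with mean equal to the MOI; the function names are illustrative.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def integration_summary(moi):
    """Fractions of transduced cells, and of single integrants among them."""
    p_none = poisson_pmf(0, moi)
    p_single = poisson_pmf(1, moi)
    transduced = 1.0 - p_none
    return {
        "transduced": transduced,
        "single_among_transduced": p_single / transduced,
        "multiple_among_transduced": 1.0 - p_single / transduced,
    }


# MOI of 0.3 TU/cell, as used for the screen.
for name, value in integration_summary(0.3).items():
    print(f"{name}: {value:.3f}")
```

Under this model roughly a quarter of cells are transduced at an MOI of 0.3, and most of the Venus-positive cells recovered by sorting carry a single integration, in line with the statement above.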

In vitro screen
For the in vitro screen we aimed for 1000x coverage of the library. Two conditions were tested: no doxycycline and doxycycline treatment. The concentration of doxycycline (Sigma Aldrich, D5207) used was 500 ng/ml for transduced MOLM-14 and 1 µg/ml for transduced MV4-11. For each condition, 5 million cells were cultured at a concentration of 0.5 × 10^6 cells/ml and split every 3 days to maintain that concentration, at which point the medium was changed and the doxycycline refreshed. Cells were cultured for 15 passages and samples were collected at different time points. To obtain DNA, cells were collected, centrifuged and frozen as a pellet. The DNA was then isolated using the DNeasy Blood & Tissue Kit (Qiagen, 69504) following the manufacturer's instructions.
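
The relationship between library size, cell number and coverage can be sketched as follows. The number of FLT3 positive-control shRNAs is not given in the text, so the value of 3 used here is an assumption, and the helper names are illustrative.

```python
def library_size(n_genes, shrnas_per_gene, n_negative_controls, n_positive_control_shrnas):
    """Total number of distinct shRNAs in the pooled library."""
    return n_genes * shrnas_per_gene + n_negative_controls + n_positive_control_shrnas


def cells_for_coverage(n_shrnas, fold_coverage):
    """Minimum number of transduced cells for the requested average coverage."""
    return n_shrnas * fold_coverage


def fold_coverage(n_cells, n_shrnas):
    """Average number of cells representing each shRNA."""
    return n_cells / n_shrnas


# 161 genes x 3 shRNAs and 10 NTC shRNAs are from the text;
# 3 FLT3 shRNAs is an assumption made for this example.
size = library_size(161, 3, 10, 3)
print("library size:", size)
print("cells for 1000x:", cells_for_coverage(size, 1000))
print("coverage with 5 million cells:", round(fold_coverage(5_000_000, size)))
```

On these assumptions, 5 million cells per condition comfortably exceeds the 1000x target, which leaves headroom for losses at each split.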

Mouse studies and PDX generation
All mouse studies were carried out in accordance with the UK Animals (Scientific Procedures) Act, 1986 under project licence P74687DB5 following approval from the Newcastle University animal ethical review body (AWERB). Mice were housed in specific pathogen-free conditions in individually ventilated cages with sterile bedding, water and diet (Irradiated RM3 breeding diet, SDS Ltd). All procedures were performed aseptically in a laminar flow hood. NSG mice (NOD.Cg-Prkdcscid Il2rg tm1Wjl/SzJ) of both sexes, aged between 12 and 16 weeks and from an in-house colony, were used for PDX generation. They were transplanted intrafemorally with 1 × 10^6 patient or PDX cells under isoflurane anaesthesia and given NSAID analgesia (5 mg/kg subcutaneous Carprofen). Mice were checked daily, and weighed and examined at least once weekly to ensure good health. Endpoints for humane killing were pale extremities, hunched posture, 20% weight loss compared to the highest previous weight, or 10% weight loss for 3 consecutive days. PDX cells were harvested from the spleen and isolated by passing through a 50 µm cell sieve (Falcon Corning). Cells were washed in PBS and stored frozen in 10% DMSO/90% FBS. Mice used in the screen were female Rag2-/- Il2rg-/- 129×Balb/c (RG) mice aged 8-10 weeks at study commencement.

In vivo shRNA screen
Ten female Rag2-/- Il2rg-/- 129×Balb/c (RG) mice from an in-house colony, aged 8-10 weeks, were each injected intravenously with 50,000 MOLM-14 cells containing the shRNA library in a volume of 100 µl. Mice were randomly assigned to 2 groups. One group was fed the normal RM3 diet and the other a doxycycline-containing diet (823747 -CRM (E) + 625ppm Doxycycline (P) 1kg 25kG, SDS Ltd) ad libitum, starting on the day of cell injection. Diet was replaced every 3 days and mouse health was assessed daily. Mice were humanely killed 19-22 days after cell injection, when a weak tail or weak hind legs were first detected. Cells were isolated from the spleen as above. Cells were isolated from the bone marrow by crushing the leg and hip bones in PBS with a pestle and mortar, vortexing, and passing the supernatant through a cell sieve. Engrafted cells were sorted by FACS (Aria II, BD) using the Venus and dTomato fluorophores. DNA was isolated using the DNeasy Blood & Tissue Kit (Qiagen, 69504) following the manufacturer's instructions.

Library preparation for shRNA screens
PCR of genomic DNA was performed using ExTaq (Takara) with custom-designed primers carrying Nextera i5 and i7 index sequences (supplementary table X) to amplify the mir30 insert containing the shRNA. Amplicons were electrophoresed on an agarose gel, and DNA was purified using the QIAquick gel extraction kit (QIAGEN) according to the manufacturer's instructions and further purified with AMPure beads (Beckman Coulter). Samples were pooled and sequenced on a NextSeq 2000 (75 bp reads) using a NextSeq 500/550 High Output kit.

Inhibitor experiments in primary AML cells and healthy cells
The DUSP 1/6 inhibitor BCI (Selleckchem) and the FLT3-ITD inhibitor Quizartinib (Selleckchem) were dissolved to a 10 mM stock concentration in DMSO (Merck) on arrival. CBFβi (AI-14-91) and its control compound (AI-4-88) 8 were both dissolved to a 40 mM concentration in DMSO. Prior to dosing, primary cells were cultured as described above for 7 days after defrosting. Samples were then transferred to a 96-well plate previously prepared with hMSC feeders, and the desired concentration of inhibitor was added to the medium (the "untreated" control was treated with 0.1% DMSO). Cells were then incubated with the inhibitors for 6 days before viability was assessed by counting cells on a haemocytometer after a 1:1 dilution with Trypan Blue (Merck) to distinguish live from dead cells. For dose-response curves, the IC50 was calculated in GraphPad Prism by non-linear regression (log[inhibitor] vs normalized response). For colony formation assays, cells were treated with the inhibitor for 24 h prior to seeding at a density of 5000 cells/ml in Methocult Express (StemCell Technologies). The inhibitor was also added to the colony medium at the same concentration. Colonies were counted after 12 days.
For NGS experiments, primary cells were treated with the desired concentration of inhibitor for 24 h prior to harvest, with 0.1% DMSO as a control.
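
The dose-response fit can be illustrated with a one-parameter inhibition curve. This sketch assumes that the "log[inhibitor] vs normalized response" model corresponds to Y = 100 / (1 + 10^(X - logIC50)) with a fixed standard slope, and it replaces GraphPad Prism's optimiser with a simple iterative grid search, so it is not a reproduction of the software used. The concentrations and responses are synthetic.

```python
import math


def normalized_response(log_conc, log_ic50):
    """Inhibition curve running from 100% to 0% with a fixed standard slope."""
    return 100.0 / (1.0 + 10.0 ** (log_conc - log_ic50))


def sum_squared_error(log_ic50, log_concs, responses):
    return sum((r - normalized_response(x, log_ic50)) ** 2
               for x, r in zip(log_concs, responses))


def fit_log_ic50(concs, responses, rounds=6, points=201):
    """Least-squares estimate of log10(IC50) by repeatedly refining a grid."""
    log_concs = [math.log10(c) for c in concs]
    lo, hi = min(log_concs) - 2.0, max(log_concs) + 2.0
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=lambda v: sum_squared_error(v, log_concs, responses))
        lo, hi = best - step, best + step  # zoom in around the current best value
    return best


# Synthetic example: noise-free responses generated with an IC50 of 2 (arbitrary units).
concs = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
responses = [normalized_response(math.log10(c), math.log10(2.0)) for c in concs]
print(f"estimated IC50: {10 ** fit_log_ic50(concs, responses):.3f}")
```

Responses must first be normalized so that the vehicle control is 100% and complete inhibition is 0%, which is what allows the model to use a single free parameter.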

Lentiviral transduction of primary AML cells and healthy cells
pL40c shRNA constructs were generated by cloning shRNAs (supplementary table X) into the pL40c vector. The dnFOS and dnCEBP inserts, originally generated by Charles Vinson (National Cancer Institute, Bethesda, MD, USA), were cloned into a pENTR backbone, and Gateway cloning was then used to transfer them into the Tet-on plasmid pCW57.1 (David Root, Addgene plasmid 41393). For virus production, human embryonic kidney 293T (HEK293T) cells were cultured in DMEM supplemented with 10% FCS, 2 mM L-glutamine, 100 U/ml penicillin, 100 mg/ml streptomycin and 0.11 mg/ml sodium pyruvate, and were seeded to achieve 70-80% confluence at the time of transfection. HEK293T cells were transfected by calcium phosphate co-precipitation of the five plasmids (LEGO-iG with TAT, REV, GAG/POL and VSV-G) at a mass ratio of 24 μg:1.2 μg:1.2 μg:1.2 μg:2.4 μg per 150 mm-diameter plate of cells. Viral supernatant was harvested after 24 h and subsequently every 12 h for 36 h before concentration with a Centricon Plus-70 100-kDa filter (Millipore), following the manufacturer's instructions. Concentrated viral particles were stored at -70 °C before lentiviral transduction. Cell lines were transduced with concentrated virus in the presence of 8 μg/ml polybrene and 1x CD34 supplement (StemCell Technologies) by spinoculation at 1,500g for 50 min. After 12-16 h of incubation at 37 °C, the viral medium was exchanged for fresh medium. Cells were cultured for 3 days prior to treatment with 1.5 μg/ml doxycycline (Merck), with a further treatment after an additional 48 hours. After 3 days of doxycycline treatment, FACS was performed to isolate GFP+ (pCW57.1 dnFOS and dnCEBP) or Venus+/Tomato+ cells (pL40c-shRNA). For colony formation assays, sorted cells were seeded at 5000 cells/ml in Methocult Express (StemCell Technologies) with 1.5 μg/ml doxycycline and counted after 12 days.

Mini-shRNA screen in healthy cells
For the shRNA mini-screen in healthy cells, cells were not treated with doxycycline prior to FACS, and Venus+ cells were collected. Cells were cultured for 12 days with 1.5 μl doxycycline added every 3 days. After 12 days, genomic DNA was extracted from cells cultured in the presence or absence of doxycycline using the DNeasy blood and tissue kit (QIAGEN) according to the manufacturer's instructions.

ATAC-seq analysis of primary cells
Omni ATAC-seq was performed as in Corces et al. 9 . Briefly, cells were washed in ATAC resuspension buffer (RSB) (10 mM Tris-HCl pH 7.5, 10 mM NaCl and 3 mM MgCl2) and then lysed for 3 minutes on ice in RSB with 0.1% NP-40 and 0.1% Tween-20. The cells were then washed with 1 ml of ATAC wash buffer consisting of RSB with 0.1% Tween-20. The nuclear pellet was resuspended in ATAC transposition buffer consisting of 25 μl TD buffer, an amount of Tn5 transposase enzyme (Illumina) scaled to the number of input cells, 16.5 μl PBS, 5 μl water, 0.1% Tween-20 and 0.01% digitonin, and then incubated on a thermomixer at 37 °C for 30 minutes. The transposed DNA was then amplified by PCR up to ½ of maximum amplification, as assessed by a qPCR side reaction. The library was purified using a QIAquick PCR cleanup kit (QIAGEN) followed by AMPure beads (Beckman Coulter) and sequenced on a NextSeq 2000 (75 bp reads) using a NextSeq 500/550 High Output kit.

RNA-seq of primary cells
RNA was extracted from primary cells using an RNeasy Micro Plus kit (QIAGEN) where fewer than 50,000 cells were harvested, and a RNeasy Micro Plus kit (QIAGEN) for larger cell numbers. After quantification by NanoDrop and QC using an Agilent RNA 6000 Pico Kit (Agilent, Bioanalyzer), libraries for next-generation sequencing were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (NEB) with the NEBNext® rRNA Depletion Kit v2 for low RNA input (<100 ng RNA), or the Total RNA Ribo-zero library preparation kit (with ribosomal RNA depletion) (Illumina) for higher RNA input. Libraries were quantified using the High Sensitivity DNA kit (Agilent) and the Kapa Library Quantification kit (Roche) prior to paired-end sequencing on a NextSeq 2000 (PE 75) with a NextSeq High 150 v2.5 kit.

Proximity Ligation Assay (PLA) of CBFβ:RUNX1 interaction
1.5 × 10^5 cells were adhered to microscope slides using a Cytospin cytocentrifuge (Thermo Fisher Scientific) for 3 min at 800g and fixed in 4% formaldehyde (Pierce) for 15 min. Cells were permeabilised in 0.1% Triton X-100, and nonspecific staining was prevented by incubation in 3% bovine serum albumin. Anti-CBFβ (sc-56751, Santa Cruz Biotechnology) and anti-RUNX1 (ab23980, Abcam) primary antibodies, each at 1:100, were applied for 1 h at room temperature in PLA antibody diluent solution. Probes, ligation, and amplification solutions (Duolink; Sigma-Aldrich) were then applied at 37 °C according to the manufacturer's instructions, and the slides were mounted in Duolink mounting medium with DAPI (Sigma-Aldrich). Slides were visualised using a Zeiss LSM 780 equipped with a Quasar spectral (GaAsP) detection system, using a Plan Achromat 40× 1.2 NA water immersion objective, Lasos 30 mW Diode 405 nm, Lasos 25 mW LGN30001 Argon 488, and Lasos 2 mW HeNe 594 nm laser lines. Images were acquired using Zen black version 2.1. Post-acquisition brightness and contrast adjustment was performed uniformly across the entire image.

Single cell treatment scRNA-Seq analysis of CBFβi treated FLT3-ITD+ AML
Primary AML cells for scRNA-seq were cultured on hMSC feeders as described above in the following medium: SFEMII (StemCell Technologies), 1 μM UM729 (StemCell Technologies) and 750 nM StemRegenin 1 (StemCell Technologies), supplemented with 150 ng/ml SCF, 100 ng/ml TPO, 10 ng/ml IL-3 and 10 ng/ml G-CSF (PeproTech). After 1 passage (1 week) in culture, cells were treated for 24 h with 10 μM CBFβi or 0.1% DMSO in the absence of UM729 and StemRegenin 1. After treatment, cells were sorted for CD45 using magnetic beads (Miltenyi Biotec). Cells were loaded on a Chromium Single Cell Instrument (10X Genomics) to recover 5000 single cells. Library generation was performed using the Chromium single cell 3' library and gel bead kit v3.1. Illumina sequencing was performed on a NovaSeq 6000 S1 run in paired-end mode for 150 cycles at a depth of 20,000 reads per cell.

Bulk RNA-Seq data analysis
Raw paired-end reads were trimmed to remove low-quality sequences and adaptors using Trimmomatic v0.39 10 . Reads were then aligned to the human genome (version hg38) using HISAT2 v2.2.1 11 with default settings. Counts were generated with featureCounts v2.0.1 12 using gene models from Ensembl as the reference transcriptome. Differential gene expression analysis was carried out using Limma-Voom v3.50.3 13 in R v4.1.2.

Single-Cell RNA-Seq analysis
Fastq files from single-cell sequencing experiments were aligned to the human genome (version hg38) using the count function in CellRanger v5.0.1 from 10x Genomics 14 using gene models from Ensembl as the reference transcriptome. Analysis was then carried out using the Seurat package v4.3.0 15 in R v4.1.2. Cells from CBFβi-treated and untreated samples were filtered to remove cells with fewer than 500 or more than 6000 detected genes, as well as cells with more than 20% of reads aligned to mitochondrial transcripts. The filtered cells were then combined into a single dataset for downstream analysis. UMI counts were normalized using the NormalizeData function with default settings. The cell cycle stage of each cell was inferred using the CellCycleScoring function in Seurat. This score was then used to remove the possible effect of cell cycle stage on the analysis by linear regression using the ScaleData function. Clustering was then carried out using the FindNeighbors and FindClusters commands, using the top 20 principal components and a cluster resolution value of 0.25. Differential gene expression analysis was carried out for each single-cell cluster, comparing CBFβi-treated cells to untreated cells using the FindMarkers command. Genes with a log2 fold-change of at least 0.25 and an adjusted p-value less than 0.1 were considered to be differentially expressed. Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis was then carried out on the sets of differentially expressed genes using the ClueGO package v2.5.0 16 in Cytoscape v3.9.1 17 . Cell trajectory (pseudotime) analysis was carried out using Monocle3 v1.3.1 18 . To do this, the processed data from Seurat were first divided into two objects corresponding to CBFβi-treated and untreated samples. These were then imported into Monocle using the as.cell_data_set function in SeuratWrappers. Single-cell trajectories were calculated using the cluster_cells and learn_graph commands in Monocle. Pseudotime was then calculated by rooting the trajectory at the earliest point of the inferred trajectory that occurred in the early progenitor cells.
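
The cell-filtering and differential-expression cut-offs described above can be written as two simple predicates. This is a language-neutral illustration in Python rather than the Seurat code used for the analysis; the function names and example values are hypothetical, and applying the 0.25 cut-off to the absolute log2 fold-change (so that both up- and down-regulated genes qualify) is an assumption.

```python
def passes_qc(n_genes, pct_mito, min_genes=500, max_genes=6000, max_pct_mito=20.0):
    """Keep cells with 500-6000 detected genes and at most 20% mitochondrial reads."""
    return min_genes <= n_genes <= max_genes and pct_mito <= max_pct_mito


def is_differentially_expressed(log2_fc, adj_p, min_abs_log2_fc=0.25, max_adj_p=0.1):
    """Assumes the fold-change cut-off applies in either direction."""
    return abs(log2_fc) >= min_abs_log2_fc and adj_p < max_adj_p


# Hypothetical cells: (barcode, detected genes, % mitochondrial reads).
cells = [
    ("cell_1", 2500, 5.0),    # kept
    ("cell_2", 300, 4.0),     # too few genes
    ("cell_3", 7200, 3.0),    # too many genes, possible doublet
    ("cell_4", 1800, 35.0),   # high mitochondrial fraction
]
kept = [barcode for barcode, n_genes, pct_mito in cells if passes_qc(n_genes, pct_mito)]
print(kept)
```

The upper bound on detected genes is a common guard against doublets, while the mitochondrial cut-off removes stressed or dying cells.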

shRNA data analysis
To calculate read counts from shRNA experiments, 75 bp single-end reads in fastq format were first processed to remove the first and last 25 bp from each sequence, corresponding to the regions flanking the shRNA sequence that are common to all reads. The shRNA sequences were then compared to the library of oligonucleotide sequences used in the experiment, allowing at most a single base mismatch. Read counts were normalized by upper-quartile normalization using the edgeR package v3.36.0 19 in R v4.1.2. To calculate fold-changes between doxycycline-induced and non-induced cells, the normalized counts were fitted to a generalized linear model using edgeR. An shRNA sequence was deemed to have been lost if it had a log2 fold-change of less than -1 between induced and non-induced samples.
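
The trimming, matching and drop-out steps can be sketched as follows. This is an illustration, not the authors' pipeline: the sequences are invented, discarding reads that match more than one library entry is an assumption, and normalization and model fitting are left to edgeR as described above.

```python
from collections import Counter

FLANK = 25  # bases removed from each end of a 75 bp read (from the text)


def trim_read(read, flank=FLANK):
    """Drop the constant flanking regions, keeping the shRNA-specific core."""
    return read[flank:len(read) - flank]


def mismatches(a, b):
    """Number of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def match_shrna(core, library, max_mismatches=1):
    """Name of the single library entry within the mismatch limit, else None."""
    hits = [name for name, seq in library.items()
            if len(seq) == len(core) and mismatches(seq, core) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None  # ambiguous reads are discarded (assumption)


def count_reads(reads, library):
    """Per-shRNA read counts plus the number of unassigned reads."""
    counts = Counter({name: 0 for name in library})
    unmatched = 0
    for read in reads:
        name = match_shrna(trim_read(read), library)
        if name is None:
            unmatched += 1
        else:
            counts[name] += 1
    return counts, unmatched


def is_lost(log2_fold_change, threshold=-1.0):
    """Drop-out call: log2 fold-change (induced vs non-induced) below -1."""
    return log2_fold_change < threshold


# Invented 25-nt library entries and 75-nt reads for illustration only.
library = {"shA": "ACGTACGTACGTACGTACGTACGTA", "shB": "TTTTTGGGGGCCCCCAAAAATTTTT"}
left, right = "G" * 25, "C" * 25
reads = [
    left + library["shA"] + right,             # exact match
    left + "T" + library["shA"][1:] + right,   # one mismatch, still assigned
    left + "A" * 25 + right,                   # no match
]
counts, unmatched = count_reads(reads, library)
print(dict(counts), unmatched)
```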

ATAC-Seq data analysis
Single-end reads from ATAC-Seq experiments were processed to remove low-quality sequences and Nextera ATAC adaptors using Trimmomatic. Reads were then aligned to the human genome (version hg38) using Bowtie2 v2.2.5 20 with the option --very-sensitive-local. Potential PCR duplicates were identified and removed from alignments using Picard MarkDuplicates v2.26.10 (http://broadinstitute.github.io/picard). Peaks were called using MACS2 v2.2.7.1 21 with the parameters --nomodel -B --trackline. The resulting peaks were then filtered to remove any peak that had a peak height of less than 10 or that was found in the hg38 blacklist 22 . Where replicates were available, only peaks that passed these filters in both replicates were retained. A peak union was then created for each set of experiments by first extending the peak region by 200 bp either side of the peak summit. Overlapping peaks were then combined using the merge function in bedtools v2.30.0 23 . The distance between the peak summit and the closest gene was then calculated using the annotatePeaks.pl function in Homer v4.9.1 24 . A peak was classified as distal if it was at least 1.5 kb from the nearest transcriptional start site (TSS), and as promoter-proximal otherwise. Distal and promoter-proximal peaks were treated separately in downstream analyses.
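
The construction of the peak union and the distal/promoter-proximal split can be sketched in a few lines. This is an illustrative stand-in for the bedtools and Homer steps rather than the code that was run; the coordinates are invented, and intervals are treated as half-open so that touching intervals are merged along with overlapping ones.

```python
def extend_summits(summits, flank=200):
    """Turn (chrom, summit) pairs into (chrom, start, end) windows of +/- flank bp."""
    return [(chrom, max(0, pos - flank), pos + flank) for chrom, pos in summits]


def merge_intervals(intervals):
    """Merge overlapping or touching intervals on the same chromosome."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def classify_peak(distance_to_tss_bp, distal_cutoff=1500):
    """Distal if at least 1.5 kb from the nearest TSS, else promoter-proximal."""
    return "distal" if abs(distance_to_tss_bp) >= distal_cutoff else "promoter-proximal"


# Invented summit positions pooled from several samples.
summits = [("chr1", 1000), ("chr1", 1300), ("chr1", 5000), ("chr2", 1000)]
union = merge_intervals(extend_summits(summits))
print(union)
print(classify_peak(-800), classify_peak(25000))
```

Extending each summit by a fixed distance before merging gives every input peak the same width, so the union is not dominated by samples with broader peak calls.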