Analysis of single-cell RNA-sequencing data to identify quiescent and proliferating neural cell populations in Glioblastoma

Background: Diffuse glioblastoma (GBM) has high mortality and remains one of the most challenging types of cancer to treat. Identifying and characterizing the cell populations driving tumor growth and therapy resistance has been particularly difficult owing to the marked inter- and intratumoral heterogeneity observed in these tumors. These tumorigenic populations contain long-lived cells associated with latency, immune evasion and metastasis.

Methods: Here, we analyzed single-cell RNA-sequencing data of high-grade glioblastomas from four different studies using an integrated analysis of gene expression patterns, cell cycle stages and copy number variation to identify gene expression signatures associated with quiescent and cycling neuronal tumorigenic cells.

Results: The results show that while cycling and quiescent cells are present in GBM of all age groups, they exist in a much larger proportion in pediatric glioblastomas. These cells show similarities in their expression patterns of a number of pluripotency- and proliferation-related genes. Upon unbiased clustering, these cells clustered distinctly by cell cycle stage. Quiescent cells in both groups specifically overexpressed a number of ribosomal protein genes, while the cycling cells were enriched in the expression of high-mobility group and heterogeneous nuclear ribonucleoprotein group genes. A number of well-known markers of quiescence and proliferation in neurogenesis showed preferential expression in the quiescent and cycling populations identified in our analysis. Through our analysis, we identify ribosomal proteins as key constituents of quiescence in glioblastoma stem cells.

Conclusions: This study identifies gene signatures common to adult and pediatric glioblastoma quiescent and cycling stem cell niches. Further research elucidating their role in controlling quiescence and proliferation in tumorigenic cells in high-grade glioblastoma will open avenues for more effective treatment strategies for glioblastoma patients.

[...] major patient groups (pediatric, adult and recurrent). Table 1 shows the major characteristics of the included datasets. For differential gene expression analysis of GBM subpopulations, we also included a brain metastasis (lung squamous cell carcinoma) dataset from the study GSE117891.

Significant inter- and intratumoral heterogeneity is a challenge in identifying GSC-like niches [...] Approximation and Projection (UMAP) (Figure 1A). The neuronal cluster significantly [...] neoplasia. Interestingly, non-GBM tumor cells (GSE117891) did not express EGFR and showed a markedly different non-myeloid/immune cluster profile (Figure 1D). These clusters showed distinct expression profiles for glial cell markers for astrocytes (S100B) and [...] the pediatric group, all clusters except clusters 3 and 7 showed high expression of these genes (Figure 2C). Interestingly, CD44 expression did not follow this pattern; its expression was [...] both adult and pediatric groups are included in additional file 1.

Identification of types and cell cycle stages

To determine the presence of cycling and quiescent cells in both groups, we sought to identify and distinguish these cells from mature neural and glial cells. We used single cell datasets of [...] To do a comparative analysis of the gene expression patterns between the groups, differential [...] is a ligand for the Notch pathway and plays a pleiotropic role in Notch pathway regulation [32].

On the other hand, cGSCs in both groups were marked by overexpression of HMGB2, HSP90B1 and KPNA2, apart from TOP2A (Figure 3C). HMGB2 is a member of the high-mobility group protein family and functions as a modulator of chromatin structure. However, a recent study has shown its [...] Similarly, HSP90B1, a member of the heat shock protein family, has a role in maintaining [...] perhaps is the fact that pediatric brain cells are primed for development.

This is also evident from the cell cycle stage prediction. Previous studies have shown that [...] the cellular states of these populations, we did a cell cycle state pseudotime prediction using the Tricycle R package, as described in the methods section. Because the quiescent (G0) state is not exclusively defined in the continuous cell state pseudotime embedding, we expected the qGSCs to be predicted in the G0/G1 phase range and the cGSCs in the G2/M range. The results were as expected, with qGSCs almost exclusively in the G0/early G1 state whilst cGSCs [...] phase. In terms of disease model, it is likely that the mechanism of transition from quiescent to [...] whereas in the pediatric group the ribosome-overexpressing cluster (cluster 2) was absent in BT- [...] represent the total tumoral heterogeneity or that the qGSC and cGSC states are interconvertible.
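The mapping from a continuous cell cycle position to the coarse phase ranges discussed above (G0/G1 versus G2/M) can be illustrated with a short Python sketch. This is not the Tricycle package itself, which is an R tool; it only assumes that each cell has already been assigned an angle in radians on a circular embedding. The phase boundaries used here (S starting near 0.5π, G2/M near π, and G1/G0 spanning roughly 1.75π to 0.25π) are approximate landmarks and should be treated as assumptions, as should the function names `coarse_phase` and `phase_fractions` and the "late G1" label for the interval between 0.25π and 0.5π.

```python
import math

TWO_PI = 2.0 * math.pi

def coarse_phase(theta):
    """Map a cell cycle angle (radians) to a coarse phase label.

    Boundaries are assumed, approximate landmarks, not exact transitions.
    G0 cannot be separated from G1 on such an embedding, so both share a bin.
    """
    theta = theta % TWO_PI  # wrap any angle into [0, 2*pi)
    if theta >= 1.75 * math.pi or theta < 0.25 * math.pi:
        return "G0/G1"
    if theta < 0.5 * math.pi:
        return "late G1"  # assumed label for the gap before S phase
    if theta < math.pi:
        return "S"
    return "G2/M"

def phase_fractions(thetas):
    """Return the fraction of cells falling in each coarse phase."""
    if not thetas:
        return {}
    counts = {}
    for theta in thetas:
        label = coarse_phase(theta)
        counts[label] = counts.get(label, 0) + 1
    return {label: n / len(thetas) for label, n in counts.items()}
```

Applied per cluster, `phase_fractions` would give the proportion of cells in each range, which is one simple way to check whether a putative quiescent cluster sits mostly in the G0/G1 bin and a cycling cluster mostly in the G2/M bin.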

Cluster-wise differentially expressed genes for both adult and pediatric groups are included in additional file 2. [...] clusters were enriched in cell cycle stages of DNA replication and sister chromatid separation.

These clusters are likely representative of cycling cells from S to G2 phases. Interestingly, in the pediatric group, we found that cluster 5, which comprised a few cells from samples BT- [...] 19 showed a copy number gain while chromosome 10 showed a loss of copy number. Locus gain at chromosome 19 is relevant in this study's context because chromosome 19, which has a high gene density, also harbors a large number of ribosomal genes [41]. While there seems to be a correlation between copy number alteration at chromosome 19 and ribosomal protein abundance, we could not establish whether this relationship is causal. However, we consider this an interesting finding that needs to be explored in detail in the future. [...] plasticity. This behavior of GSCs would suggest that a marker-based strategy, although very [...] GSCs [15,18,20,22,42], these studies support the theory that malignancy in glioblastoma is inherently a disease of neural stem/progenitor cells. [...] quiescence or triggering proliferation.
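The chromosome-level gain and loss calls discussed above rest on a simple intuition: cells carrying an extra copy of a chromosome tend to show higher average expression of the genes located on it, relative to normal reference cells. The Python sketch below illustrates that intuition only. It is a deliberately simplified stand-in and not the CONICSmat algorithm, which fits a mixture model to region-averaged expression. The function names, the z-score cutoff of 2.0 and the input layout (one dictionary of normalized expression values per cell) are all assumptions made for illustration.

```python
from statistics import mean, stdev

def chromosome_score(cell_expr, genes_on_chrom):
    """Average normalized expression of one cell over genes on a chromosome."""
    values = [cell_expr[g] for g in genes_on_chrom if g in cell_expr]
    return mean(values) if values else 0.0

def call_copy_number(tumor_cells, normal_cells, genes_on_chrom, z_cutoff=2.0):
    """Label each tumor cell as 'gain', 'loss' or 'neutral' for a chromosome.

    Each tumor cell's chromosome score is z-scored against the distribution
    of scores in normal reference cells (at least two are required).
    """
    ref_scores = [chromosome_score(c, genes_on_chrom) for c in normal_cells]
    ref_mean, ref_sd = mean(ref_scores), stdev(ref_scores)
    if ref_sd == 0:
        raise ValueError("reference scores have zero variance")
    calls = []
    for cell in tumor_cells:
        z = (chromosome_score(cell, genes_on_chrom) - ref_mean) / ref_sd
        if z > z_cutoff:
            calls.append("gain")
        elif z < -z_cutoff:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls
```

A score of this kind cannot separate a true copy number gain from transcriptional upregulation of the same genes, which is one reason the relationship between the chromosome 19 signal and ribosomal gene expression is hard to interpret causally.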

The results of this analysis provide strong evidence that quiescent and cycling stem-like cells in [...] All single-cell RNA-seq raw read count matrices and metadata files (wherever available) were [...] Here, we used the reference dataset provided in tricycle with default parameters to infer cell [...] To compare the copy number variations between clusters and datasets, we used CONICSmat [...] added certainty, we used a normal brain expression matrix of 322 normal brain cells from