Abstract
Biomarkers of spinal cord injury (SCI) could help determine the severity of the injury and facilitate early critical care decision making. We analyzed global gene expression in peripheral white blood cells during the acute injury phase and identified 197 genes whose expression changed after SCI compared to healthy and trauma controls and in direct relation to SCI severity. Unsupervised co-expression network analysis identified several gene modules that predicted injury severity (AIS grades) with an overall accuracy of 72.7% and included signatures of immune cell subtypes. Our findings indicate that global transcriptomic changes in peripheral blood cells have diagnostic and potentially prognostic value for SCI severity.
INTRODUCTION
Precision medicine (often interchangeably called personalized medicine) promises to optimize individualized treatment options based on demographic and genetic characteristics as well as the specific biological features of the presenting disorder. Cancer therapy has already benefited greatly from this approach,1,2 where blood- and tissue-based bioassays are now routinely used for treatment planning.3,4 Here we present a strategy for extending this approach to the diagnosis and treatment of human spinal cord injury (SCI), a devastating and heretofore intractable condition characterized by injury heterogeneity and highly variable outcomes. Currently, SCI prognostics are based principally on acute evaluation of neurological status using sensory and motor exams including the American Spinal Injury Association (ASIA) grading system,5 which in the acute phase can be unstable and difficult to obtain, especially when patients are unresponsive or obtunded.6-9 Magnetic resonance imaging (MRI) provides invaluable information on severity and spinal cord level of injury, but is not always available and may be contraindicated for certain patients, e.g. those with penetrating metallic injuries.
The first attempts to discover SCI biomarkers of initial injury severity and long-term outcomes, date back four decades.10-12 Progress has been slow due at least in part to the diffuse regional presentation of acute SCIs and the difficulty of obtaining ultra-acute samples and patient data. Most attempts have used proteomics to identify serum and cerebrospinal fluid (CSF) biomarkers associated with injury severity. This approach depends upon measuring proteins associated with CNS damage (e.g. GFAP, neurofilament protein) released into the bloodstream, or on the peripheral cytokine response to CNS-injury-induced chemokines.13-20 Recent work shows promise, with several target molecules providing some useful predictive value,21-26 however, these circulating protein (and recently, RNA) markers are difficult to measure and subject to degradation. An alternative approach is to consider that circulating immune cells represent ‘sensors’ of CNS-injury-induced molecules, and that WBC transcriptomic changes provide a read-out of the complex peripheral immune response to the totality of signals associated with SCI over time. A recent study of WBCs in people with chronic SCI, for example, found reduced expression of genes associated with natural killer (NK) cell activity and increased expression of toll-like receptor and inflammatory cytokine genes27. These preliminary findings, although limited by a small number of cases, are consistent with observations that people with chronic SCI have suppressed immune responses and are more susceptible to infections.28-30
TRACK-SCI (“Transforming Research And Clinical Knowledge-SCI”) is a multicenter prospective clinical study focused on acute critical care variables (e.g. MRI, multiple physiological variables, time to surgery) and blood transcriptomics as indices of severity and predictors of outcome.31-40 SCI has a profound impact on circulating white blood cells (WBCs), inducing peripheral inflammation and WBC phenotypic changes in a dynamic cascade that is likely to reflect the biological features of the evolving CNS lesion.41-44 TRACK-SCI provides a large WBC transcriptomic dataset that will be useful for developing RNA-based blood biomarkers of injury and recovery that can be related to patient characteristics and multivariate outcomes. Utilizing advanced analytic methods, these blood biomarkers, along with other critical care variables, may be instrumental in the development of predictive algorithms for acute SCI treatment planning, to stratify patients for clinical trials and to predict long-term outcomes.
We are using deep RNA sequencing and advanced analytics to develop a blood RNA biomarker profile for acute SCI. Here, we report that early WBC transcriptomic signatures alone can accurately predict injury severity on the ASIA impairment scale (AIS). Further, these signatures provide novel biological data that should be useful in understanding mechanisms of injury and repair. Our findings provide proof of concept for the development of an accurate blood RNA biomarker of acute SCI severity. These data provide a strong rationale for expanding this work to include longitudinal multivariate analysis of gene expression patterns across injury severities, individual patient characteristics, and time, in order to provide a comprehensive description of evolving WBC gene expression patterns and their relationships to long-term outcomes.
RESULTS
TRACK-SCI patient accrual, data collection, and retention
TRACK-SCI protocols for patient enrollment and data collection have been described recently.45 Patients admitted to the emergency department at the Zuckerberg San Francisco General Hospital and Trauma Center (ZSFG) are recruited to the study and consented as soon as possible after admission. The TRACK-SCI protocol includes rapid pre-operative imaging, emergent transfer to the OR for decompression surgery as indicated, immediate blood collection and processing, followed by high-density ICU monitoring of vitals and daily sensorimotor and International Standards for the Neurological Classification of Spinal Cord Injury (ISNCSCI) exams (see Fig S1). To date, 179 participants with SCI have been enrolled. The current report is based on deep sequencing of RNA from acute blood samples from 38 subjects with SCI, 10 healthy uninjured controls (HC), and 10 trauma controls with non-CNS injuries (TC; see Table 1).
The WBC transcriptome separates SCI patients from trauma and healthy controls
We isolated 4ml of peripheral blood from 38 enrolled patients within a few hours (Fig S1, Table 1) after SCI. The blood was immediately processed, and total RNA was isolated from WBCs. The same procedure was followed for 10 TCs and 10 HCs. After RNAseq, raw counts were produced and normalized, and a T-distributed Stochastic Neighbor Embedding (t-SNE) plot was created using the principal components responsible for 90% of the variance (Fig 1A). The three groups are clearly separated based on their transcriptomic status at the time of the blood draw. These results support our hypothesis that the transcriptome of the WBCs contains valuable information about the pathophysiological status of the patients and warrant a more sophisticated and deeper analysis to reveal more details. Next, we performed differential gene expression (DGE) analysis among the three groups, revealing 2,096 genes that were significantly altered (> 2-fold change, adjusted p-value < 0.05) only in the SCI population (Fig 1B and S2). We then queried, how many of those genes display an expression pattern that follows the injury severity levels of the AIS grade. Among the 2,096 genes that were differentially expressed after SCI, 197 of them showed directional expression patterns with SCI severity. 117 of them increased their expression with injury severity and 80 decreased their expression with injury severity (Fig 1C). Gene ontology enrichment analysis showed that processes involved in the immune response and cellular secretion and localization were the most highly enriched. (Fig S3).
Co-expression network analysis reveals gene modules that predict SCI severity
In order to study the modular organization of WBC transcriptomes in our sample cohort, we performed unsupervised gene co-expression network analysis46,47 and identified 16 modules (arbitrarily designated M1-M16) for which SCI patients’ combined expression was significantly different from both TCs and HCs (Fig 2A). These modules represent coherent transcriptomic signatures in WBCs that covary specifically as a result of SCI and are therefore targets for biomarker generation as well as potential indicators of underlying pathology and recovery. Among these, M13 exhibited the strongest correlation with AIS grade (Spearman rho = 0.82, p-value = 1.56 × 10−14). Fig 2B-C shows the details of top gene co-expression patterns for this module.
Next, we wanted to determine whether any of these 16 modules, alone or in combination, could predict the initial SCI severity as indicated by the AIS grade assigned between days 3-10 after SCI. We therefore performed multinomial logistic regression with LASSO48 regularization to predict AIS grade using all 16 module eigengenes. We identified one gene module (M12; Fig. S4) that predicted AIS ‘A’ patients with 83.3% accuracy and a combination of five modules (M1, M5, M10, M13, and M16) that predicted AIS ‘D’ patients with 90.9% accuracy (Fig. S4, Tables 3 & S1). Our cohort included too few patients with ‘B’ or ‘C’ classifications (n=4 and 6 respectively) to provide useful predictors of these grades (see supplemental files). Overall, our model shows 72.7% accuracy (p-value = 2.35 × 10−5). We proceeded to test the diagnostic value of the identified modules to detect AIS ‘A’ and ‘D’ patients in our cohort using a receiver operating characteristic (ROC) analysis. The area under the curve for AIS ‘A’ patients and AIS ‘D’ patients was 0.865 and 0.938, respectively, confirming that our model can predict these two injury severities with high sensitivity and specificity, despite small sample sizes (Fig 2D). Together these data show for the first time that WBC expression profiles can very accurately classify SCI patients based on AIS grade.
“Digital cytometry” and module enrichment analysis identifies WBC subtype differences between SCI and controls
To determine whether cellular composition varied among WBCs from our sample cohorts, we performed cell-type deconvolution with CIBERSORTx,49,50 which infers cell-type proportions in bulk tissues from global gene expression patterns. Applying this method to our samples revealed the relative proportions of 22 leukocyte subtypes. These “digital cell types” were then compared across groups (HC, TC and SCI) and AIS grade levels (Fig. S5A-B). Five cell types (neutrophils, resting NK cells, and resting CD4, naïve CD4 and gamma delta T cells) exhibited significantly different proportions among the 3 groups, but none of these were significant when AIS grades were compared.
To clarify which cell types contributed to the co-expression modules that predict SCI severity, we cross-referenced module composition with cell type-specific gene sets from published single-cell RNA-seq datasets.49,51 The M12 module, which predicted AIS ‘A’ injury severity, was significantly enriched with genes expressed by resting NK cells, mast cells, and CD66+ granulocytes (Fisher’s exact test p-values of 2.9 × 10−7, 5.19 × 10−7, and 7.81 × 10−6 respectively). The M13 module, which predicted AIS ‘D’ severity, was also significantly enriched with CD66+ granulocytes (p = 1.45 × 10−8) (see supplemental files). These data suggest that expression changes in WBCs associated with SCI can be further subdivided into specific contributions from distinct cell types, which could lead to refined assays and predictors.
DISCUSSION
Early biomarkers of SCI could enable more efficient and personalized clinical treatments, as well as better stratification of patients for clinical trials, since early clinical evaluations often lack reliability.7,19 In contrast to existing biomarkers that require rapid access to complex machinery (e.g. MR scanning), fluid biomarkers are more accessible and can provide novel insights about systemic responses to SCI. Although previous studies have proposed large structural proteins in blood serum/plasma and CSF as fluid biomarkers of SCI,14,23,52 these proteins are susceptible to proteases and degrade rapidly, rendering measurements of concentration variable and time sensitive. More recent studies of circulating miRNAs as biomarkers are promising since miRNAs are not as sensitive to degradation due to their small size (22nt) and the fact that most are protected inside exosomes.24,25,53,54
Our approach differs substantially from previous efforts to identify fluid biomarkers of SCI. First, we have analyzed potential biomarkers that are ‘safely housed’ in their cell of origin at the time of collection, as opposed to free-floating molecules in CSF or blood. Second, instead of pre-selecting candidate biomarkers in advance, we have performed a high-throughput analysis of 17,500 transcripts from WBCs isolated from each patient. This high-dimensional readout of the immune response during the acute phase of the injury provides important information about how the periphery affects the progress of the central lesion and may lead to new hypotheses and targets for intervention. Our results indicate that global gene expression patterns in WBCs can reliably distinguish SCI patients from healthy and non-CNS trauma controls. Moreover, when overlaying these gene expression patterns with the widely used AIS injury grade classification system, we identified 197 genes whose expression levels changed with increasing injury severity and may serve as novel fluid biomarkers of SCI. The list of these 197 genes includes a number of genes that seem appropriate, as well as some whose functions are yet to be delineated, which might provide clues to new therapeutic targets (see supplemental files). However, there is growing evidence (mainly from cancer research) to suggest that single genes do not usually make good biomarkers.55 Of course, genes do not work in isolation, and there is overwhelming evidence for reproducible transcriptional covariation in blood and other tissues, suggesting a modular organization to genomic function. These modules are not always neatly captured by existing gene ontologies, hence the need to perform unsupervised co-expression analysis. Multivariate analysis revealed 16 gene co-expression modules associated with SCI, a subset of which were able to predict AIS ‘A’ and AIS ‘D’ SCI patients with impressive accuracy. From a clinical perspective this finding can be a ‘game-changer’ considering that in many instances, several hours and logistically challenging infrastructure (e.g. MRI) is needed even for confirmation of the SCI, let alone for determining the degree of severity.
Another goal of our study is to predict long-term outcomes (e.g. the AIS grade at 6- and 12-months post SCI). We are collecting longitudinal data from enrolled patients and in the future will be able to extend our analysis in this fashion. Moreover, due to the well-known limitations of the AIS grade scale, additional outcome measures (e.g. upper and lower extremity motor scores and sensory scores from the ISNCSCI neurological examination) could offer more detailed insights into patients’ conditions and enable more accurate predictions. As we increase the number of enrolled patients and sequenced blood samples, such analyses will be feasible.
Although we have demonstrated proof-of-concept for our methodology in a sample of 38 SCIs, TRACK-SCI continues to enroll patients, with 179 participants to date. By combining multivariate and longitudinal analysis of WBC transcriptomes with detailed clinical information, we seek to create a novel framework for diagnosing SCI severity and predicting outcomes based on the systemic immune response. Such a framework could eventually replace the ASIA grading system or be used in combination to allow more precise clinical decisions. Additional studies with larger sample sizes will be required to validate these findings and move towards a practical RNA-based blood biomarker. Furthermore, it should be possible to determine whether diagnostic patterns reflect WBC responses to specific CNS-injury-induced signals such as CNS protein products and/or signaling molecules such as chemokines. The clear differentiation between expression profiles from trauma controls and SCI suggests that WBC transcriptomes contain latent information specific to CNS injury, raising the possibility that WBC RNA profiles may also respond to treatments that mitigate secondary damage or are associated with recovery of function. If so, the routine analysis of RNA expression profiles in blood may provide both a practical clinical tool and a new window into the biology of human CNS injury and its treatment.
Author contributions
Conception and design: N.K., A.R.F., M.C.O., J.C.B., M.S.B.; Data collection: N.K., X.D.F., L.H.T., R.E.T., D.D.H., L.U.P., V.S., W.D.W., S.S.D.; Data curation and analysis: N.K., A.T.E., P.G.S., J.R.H., A.C., X.D.F., L.H.T., R.E.T., A.R.F., M.C.O., J.C.B., M.S.B.; Manuscript preparation: N.K., A.T.E., P.G.S., M.C.O., J.C.B., M.S.B.; Manuscript revision: all authors; Funding acquired: N.K., M.S.B.
Competing interests
The authors declare no competing interests.
Data availability statement
All data generated or analyzed during this study are included in this published article and its supplementary information files.
MATERIALS AND METHODS
TRACK-SCI patients and controls enrollment
All procedures for this study were conducted with the approval of the Human Subjects Review Boards at the University of California at San Francisco (UCSF) and the U.S. Department of Defense Human Research Protection Office. All English and non-English speaking patients who presented to the emergency department (ED) and were diagnosed with a traumatic SCI were initially eligible for the study. Patients who were < 18 years old, in-custody, prisoners, pregnant, or on medically indicated psychiatric hold were excluded. Informed consent was sought for all patients. For patients who were unable to sign for themselves due to their injury, a witness unaffiliated with the study was present throughout the consenting process and signed on the patient’s behalf. Patients incapable of consenting themselves were initially enrolled via a legally authorized representative (LAR; next of kin) or another suitable surrogate when one was available, then later approached for patient consent. Patients and surrogates had the option to participate in all or some of the following study portions: blood draws, ISNCSCI exams, and/or follow-up assessments. Patients were compensated ($50) after each time point (hospital stay, 3-month phone call, 6-month in-person visit, 12-month in-person visit) for a total of $200.
Non-SCI subjects were either healthy controls (n=10) or trauma controls (n=10). Healthy controls were recruited using IRB-approved recruitment flyers posted at ZSFG and from friends and family of enrolled SCI patients. Subjects contacted study coordinators, were interviewed and consented, and provided basic demographic information and biospecimens. Trauma controls were recruited from ED patients with traumatic but non-CNS injuries. The same basic demographic and biospecimen data were collected for these patients as SCI patients for comparison purposes patients except for multiple blood samples per patient. No monetary compensation for participation was provided for the control subjects.
Patient data collection
The foundation of the TRACK-SCI database is the NINDS-recommended common data elements (CDEs).56 Core CDEs are data elements that all SCI studies are strongly encouraged to use in collection of basic participant information. Additional measures from the International Spinal Cord Society (ISCoS) were also used. Data collection domains include demographic, clinical, radiologic, and functional outcome measures. All data collected from these CDEs were housed in a Research Electronic Data Capture (REDCap)57,58 database and include more than 21,000 data fields including additional institutional variables, calculated fields, repeated measures, date/time stamping of measures, and completion status log. Upon admission to the inpatient service, another 19,148 data fields regarding trauma characteristics, injury severity, blood pressure management, operating room procedures, interventions, hospital outcomes, high-frequency operating room vital signs, as well as motor-sensory exams and pain questionnaires are obtained from both paper and electronic medical records as well as participant interview. REDCap is in full compliance with Health Insurance Portability and Accountability Act (HIPAA) security standards for protection of personal health information (PHI). The following CDE categories comprised the demographic and clinical data domain: (1) demographics, (2) health history, (3) injury-related events, (4) assessments and examinations. A total of 229 variables concerning patient demographics, medical history, and consent/contact information were collected through abstraction from electronic medical record systems and participant interviews.
The International Standard for Neurological Classification of Spinal Cord Injury (ISNCSCI)59 was used to assess motor and sensory function, and group patients by injury severity based on the ASIA impairment scale (AIS) which ranges from A (most severe – complete) to E (not impaired). ISNCSCI exams were conducted by trained personnel who completed the ASIA International Standards Training E Program (InSTEP) and in-person training. ISNCSCI exams were performed on all patients during the initial admission, either as part of clinical care if the treating provider completed InSTEP training, or separately for the research study if the ISNCSCI was not performed for clinical purposes. Occasionally, an ISNCSCI was not performed or not completed during the admission, usually because the patient was excessively sedated and could not participate in the exam. In the case of incomplete ISNCSCI exams, the assessor gave an estimated AIS grade based on the collected data and the overall clinical picture of the patient.
If possible, patients completed examinations at regular intervals including admission (day 0 = 0-23 hours from injury), every 24 hours until post-injury day 7, discharge, 6-month follow-up (+/− 2 weeks), and 12-month follow-up (+/− 2 weeks). All ISNCSCI exams results were included in the REDCap database.
Biospecimen collection
Two blood samples were collected, one, 4ml, for total RNA extraction from white blood cells (WBCs) and the other, 6ml, for serum isolation. Samples were aliquoted and frozen at −80°C within 1 hour of collection. To preserve WBC concentrations, 7.2mg K2-EDTA vacutainer tubes were used for WBCs collection and subsequent RNA extraction, instead of K3-EDTA vacutainer tubes. The second blood sample was collected in 6 ml Z serum clot activator vacuette tubes for serum extraction. To prevent reduction in sample volume, externally threaded cryovials were used for serum storage. Serum was divided into multiple 500 µl aliquots for storage. An inventory system was developed to track time of collection, processing and storage for all biospecimens.
WBC isolation and RNA extraction
Blood was centrifuged at 1,500 rpm for 15 minutes at room temperature within 0-15 minutes from blood draw. Then, the interface layer (buffy coat) was carefully aspirated with a pipette and placed in 10mL of 1X solution of Red Blood Cell Lysis Buffer (BioLegend) for 15 minutes in the dark. After the 15-minute incubation, the solution was centrifuged at 1,500 rpm for 10 minutes. The supernatant was discarded, and the cell pellet resuspended in 1mL of TRIZOL (Ambion) and either stored at −80°C or immediately processed for RNA extraction. Total RNA from WBCs was extracted using the TRIZOL method. The RNA yield was between 15 and 25μg per 5ml peripheral blood. 1μg of the total RNA was then used for generating the Illumina cDNA library, which was used for the downstream RNAseq.
RNA sequencing
1μg of total RNA was used for the library synthesis. cDNA libraries were synthesized using Illumina’s TruSeq Stranded Total RNA with Ribo-Zero Globin kit. The kit depletes ribosomal RNA which makes up more than 90% of total RNA, and globin mRNA that is present in very high levels in blood total RNA. The libraries were quantified using a Thermo Scientific Nanodrop 2000c spectrophotometer, and their quality and average fragment size was assessed using Agilent’s DNA 1000 kit and Agilent’s 2100 Bioanalyzer. After quantification, equal amounts of 10 libraries, each one with different barcoded adapters, were pooled together to be sequenced in one lane of the Illumina’s HiSeq4000 sequencer. Based on the specifications of the HiSeq4000 and our sample pooling per lane, we aimed to get about 40 million reads per sample which has been shown to be sufficient to reveal the vast majority of the differentially expressed genes of a well-annotated genome.60 The sequencing output of our samples can be seen in Table S2.
Bioinformatic analysis
Data analysis was performed in R61,62 using the statistical packages that are specifically mentioned below as well as the packages dplyr63, ggplot264, cowplot65, table166, rgl67, PCAtools68, magick69, EnhancedVolcano70, VennDetail71, Rtsne72, and ggdendro.73 The raw reads of the fastq files were tested for quality control using the FastQC software74 and were then aligned to the human reference genome (hg38 from UCSC) using the software TopHat2.75 After the alignment, we used the featureCounts76 program to summarize the gene counts, and then the programs edgeR77 and limma78 for differential gene expression analysis through linear modeling. Gene ontologies enrichment analysis was performed with the GOrilla79 tool and the visualization of the enriched GO terms with the tool ReviGO.80
SCI-specific differentially expressed genes and neurological outcome
Preliminary examination of the data using network methods81 revealed several samples as outliers as well as library and sequencing batch effects. Using the ComBat82 function of the sva83 package in R to target the library batch effect we were able to remove both batch effects and as a result no sample appeared as an outlier anymore. We performed differential gene expression analysis among the three groups (HC, TC, and SCI) and after the intersection of the three comparisons (HC vs TC, HC vs SCI, and TC vs SCI; fold change > 2 and adjusted p-value < 0.05) we selected only the genes that significantly and specifically changed their expression after SCI (n = 2,096). In order to examine the relationship between the gene expression changes and injury severity, we grouped the SCI patients based on the assigned AIS grade given after a neurological examination between days 3 and 10 at the hospital. From the 38 SCI patients whose transcriptome was sequenced, 5 did not have an AIS grade during that time window and hence were removed from this part of the analysis. We then averaged the expression levels of each gene per group and queried for genes which exhibit a stepwise increased or decreased expression as the SCI severity increased. That query resulted in 197 genes shown in Fig 1C. The complete gene list can also be found in the supplemental files.
Gene co-expression network analysis and identification of SCI-specific gene modules
The normalized expression matrix that was generated for the differential gene expression analysis with edgeR was used as a template for the unsupervised gene co-expression network analysis.46,47 We built a series of gene co-expression networks and identified one that included the gene module with the highest Spearman correlation to AIS grade. That network contained 57 gene modules. Using one-way ANOVA with Tukey’s multiple comparison correction we identified 17 gene modules that were highly specific for SCI (significantly different from both HC and TC; adjusted p-value < 0.05). One of these modules contained genes annotated to be involved in rRNA processes, which likely represented an artifact, and was eliminated from the subsequent analysis. The 16 remaining SCI-specific modules (M1 – M16; Fig 2A) were used to create a predictive model of SCI severity using multinomial logistic regression.
Multinomial logistic regression with regularization
We generated a predictive model of SCI severity using the eigengenes (first principal components)84 of the 16 SCI-specific gene modules as predictors. AIS at discharge from the hospital was used as the target outcome variable in a multinomial logistic regression model. In order to deal with the high number of predictors (16 modules), LASSO regularization was applied, using leave-one-out cross-validation to determine the regularization parameter (λ). The final model was chosen with λ producing a model with a misclassification error at one standard deviation from the minimal misclassification error. The model was specified using the glmnet48 and the glmnetUtils85 packages in R. The model was assessed by confusion matrix metrics (Tables 3 & S1) of internal prediction obtained using the caret R package.86 Overall accuracy (percentage of correct classification) was 72.7% with a 95%CI of 54.5-86.7%, resulting in significant accuracy (p-value<0.0001) against random classification (No Information Rate of 36.4%). The uniform weighted overall accuracy (accounting for class unbalance) was 62.3%, with accuracy for AIS ‘A’ = 83.3%, AIS ‘B’ = 25%, AIS ‘C’ = 50%, and AIS ‘D’ = 90.9%. Receiver operating characteristic (ROC) curves for each AIS class were obtained by binarizing the problem (e.g. for AIS ‘A’, A = 1; B, C, D = 0) and re-running the model as a binary classification. The curves were obtained using the roc() function of the pROC R package87 and smoothing transformation was applied to each ROC curve using the smooth() function of the pROC R package.
CIBERSORTx and module enrichment analysis
CIBERSORTx50 is a machine learning algorithm that uses deconvolution methods to infer the proportions of cell types from gene expression patterns in bulk tissues. It is called digital cytometry because it performs an analogous function to regular flow cytometry without the need to physically isolate cells. We used the CIBERSORTx tool (https://cibersortx.stanford.edu/index.php) on all 58 of our samples and cross-referenced it with the LM22 signature49 using 100 permutations. LM22 is a validated leukocyte gene signature matrix containing 547 genes that distinguish 22 human hematopoietic cell types. The output of the algorithm is the relative abundance of each of the 22 subtypes for all samples (Fig S5A). For enrichment analysis, modules were defined as all unique genes with positive kME values (Pearson correlation coefficients of module eigengenes)84 that were significant after applying a Bonferroni correction for multiple comparisons (p < 0.05 / (# genes × # modules)). If a gene was significantly correlated with more than one module eigengene, it was assigned to the module for which it had the highest kME value. Enrichment analysis was performed for each gene set of interest with published human RNA-seq datasets,49,51 using a one-sided Fisher’s exact test as implemented by the fisher.test R function.
Acknowledgements
The current work was supported by grants from the Department of Defense (W81XWH-13-1-0297 and W81XWH-16-1-0497) and Craig H. Neilsen Foundation (UCSF SCI Center of Excellence special project award) to M.S.B. and an individual grant from Wings for Life (WFL-US-07/18) to N.K. The authors would also like to thank the Center for Advanced Technology (CAT) at UCSF for providing consultation and critical equipment for the completion of this study.
REFERENCES
- 1.↵
- 2.↵
- 3.↵
- 4.↵
- 5.↵
- 6.↵
- 7.↵
- 8.
- 9.↵
- 10.↵
- 11.
- 12.↵
- 13.↵
- 14.↵
- 15.
- 16.
- 17.
- 18.
- 19.↵
- 20.↵
- 21.↵
- 22.
- 23.↵
- 24.↵
- 25.↵
- 26.↵
- 27.↵
- 28.↵
- 29.
- 30.↵
- 31.↵
- 32.
- 33.
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.↵
- 41.↵
- 42.
- 43.
- 44.↵
- 45.↵
- 46.↵
- 47.↵
- 48.↵
- 49.↵
- 50.↵
- 51.↵
- 52.↵
- 53.↵
- 54.↵
- 55.↵
- 56.↵
- 57.↵
- 58.↵
- 59.↵
- 60.↵
- 61.↵
- 62.↵
- 63.↵
- 64.↵
- 65.↵
- 66.↵
- 67.↵
- 68.↵
- 69.↵
- 70.↵
- 71.↵
- 72.↵
- 73.↵
- 74.↵
- 75.↵
- 76.↵
- 77.↵
- 78.↵
- 79.↵
- 80.↵
- 81.↵
- 82.↵
- 83.↵
- 84.↵
- 85.↵
- 86.↵
- 87.↵