Abstract
Single-cell and single-nucleus RNA sequencing have been widely adopted in studies of heterogeneous tissues to estimate their cellular composition and obtain transcriptional profiles of individual cells. However, the current fragmentary understanding of artefacts introduced by sample preparation protocols impedes the selection of optimal workflows and compromises data interpretation. To bridge this gap, we compared performance of several workflows applied to adult mouse kidneys. Our study encompasses two tissue dissociation protocols, two cell preservation methods, bulk tissue RNA sequencing, single-cell and three single-nucleus RNA sequencing workflows for the 10x Genomics Chromium platform. These experiments enable a systematic comparison of recovered cell types and their transcriptional profiles across the workflows and highlight protocol-specific biases important for the experimental design and data interpretation.
Background
Single-cell RNA sequencing (scRNA-seq) is an increasingly powerful technology which enables analysis of gene expression in individual cells. ScRNA-seq has been recently used to study organism development [1–3], normal tissues [4–6], cancer [7–10] and other diseases [11, 12]. These studies have shed light on tissue heterogeneity and provided previously inaccessible insights into tissue functioning.
Advances in high-throughput droplet-based microfluidics technologies have facilitated analysis of thousands of cells in parallel [13–15], and Chromium from 10x Genomics has become a widely used commercial platform [15]. Multiple tissue preparation protocols are compatible with Chromium, but the protocol of choice should ideally maintain RNA integrity and cell composition of the original tissue.
Solid tissues need to be dissociated to release individual cells suitable for 10x Genomics Chromium scRNA-seq. However, optimal dissociation must strike a balance between releasing difficult-to-dissociate cell types and avoiding damage to those that are fragile. Tissue dissociation is most commonly conducted using enzymes, which require incubation at 37°C for variable times depending on tissue type. At this temperature, the cell transcriptional machinery is active; hence, gene expression can be altered in response to the dissociation and other environmental stresses [16, 17]. A recent alternative approach minimising this artefact uses cold-active protease to conduct tissue dissociation on ice [18]. Alternatively, single-nucleus RNA sequencing (snRNA-seq) protocols use much harsher conditions to release nuclei from tissue and can be applied to snap-frozen samples, thus avoiding many of the dissociation-related artefacts [19, 20]. Single-nucleus methods should also permit profiling of nuclei from large cells (>40 µm) that do not fit through the microfluidics.
Additional restrictions and challenges are faced by complex experimental designs where specimens cannot be processed immediately. In this case, samples need to be preserved either as an intact tissue or in a dissociated form as a single-cell suspension. However, current preservation protocols may introduce additional biases.
Each of the approaches mentioned above introduces specific biases and artefacts, which can manifest as altered transcriptional profiles or altered representation of cell types. These biases need to be considered when designing a single-cell experiment and analysing its data; however, they are still incompletely understood.
Some of the artefacts have been investigated in recent studies comparing single-cell profiles of methanol-fixed and live cells [21, 22], cryopreserved and live cells [22, 23], single-cell and single-nucleus protocols [24–26], or tissue dissociation using cold-active protease and traditional digestion at 37°C [18]. However, these assessments were performed in different tissues under different conditions and lack extensive comparison to bulk data.
Here, we performed a comprehensive study in healthy adult mouse kidneys using 10x Genomics Chromium workflows for scRNA-seq and snRNA-seq, along with bulk RNA-seq of undissociated and dissociated tissue. We compare and contrast two tissue dissociation protocols (digestion at 37°C, further referred to as warm dissociation, or with cold-active protease, further referred to as cold dissociation), two single-cell suspension preservation methods (methanol fixation and cryopreservation) and three single nuclei isolation protocols (Fig. 1). A total of 77,656 single-cell, 98,303 single-nucleus and 9 bulk RNA-seq profiles were generated and made publicly available [in process]. Our dissection of artefacts associated with each of the approaches will serve as a valuable resource to aid interpretation of single-cell and single-nucleus gene expression data and help to guide the choice of experimental workflows.
Results
Comparison of tissue dissociation protocols
In the first series of experiments, we set out to compare two tissue dissociation protocols using kidneys from adult male C57BL/6J mice. Kidneys were dissociated at 37°C using a commercial Miltenyi Multi Tissue Dissociation Kit 2 or on ice using a cold-active protease from Bacillus licheniformis (Methods, Fig. 1A, B). Aliquots of single cell suspensions were profiled using 10x Genomics Chromium scRNA-seq and a bulk RNA-seq protocol (Methods, Fig. 1A, B). All experiments were performed in triplicate and data were processed as described in Methods.
Warm tissue dissociation induces stress response
Bulk RNA-seq profiling of single-cell suspensions revealed induction of stress response genes in warm-dissociated samples. Differential expression (DE) analysis of warm- and cold-dissociated suspensions reported 71 DE genes (DEGs) with higher expression in warm-dissociated kidneys and 5 DEGs with higher expression in cold-dissociated kidneys (logFC > 2, FDR < 0.05, edgeR exact test [27]; gene lists are available in Supplementary Table 1). Functional analysis with ToppGene [28] reported “regulation of cell death” as the top significantly enriched Gene Ontology Biological Process (GO BP) for the genes more highly expressed in warm-dissociated kidneys (an overlap of 22 genes, FDR = 1.666E-7, see Supplementary Table 1 for DEG lists and functional analysis results). Genes with the highest logFC values (>4) included the immediate-early genes Fosb, Fos, Jun, Junb, Atf3 and Egr1 and the heat shock protein genes Hspa1a and Hspa1b (Fig. 2A). These findings confirm the original observations of Adam et al. [18] that warm tissue dissociation induces substantial stress-response-related changes.
Single-cell sequencing reveals heterogeneous stress response across cell populations
We next characterised differences between the two tissue dissociation protocols by scRNA-seq profiling of freshly dissociated cells (Fig. 1A, B). This dataset comprised 23,108 cells, including 11,851 cells from cold- and 11,257 cells from warm-dissociated kidneys (see Methods). These cells were then annotated using scMatch [29] by comparing their expression to reference gene expression profiles extracted from three previously published single-cell or single-nucleus RNA-seq studies of adult mouse kidneys [26, 30, 31]. The inferred identities were then used to calculate cell-type-specific gene expression signatures and further refine the annotations (see Methods and Supplementary Fig. 1). Using this approach, we inferred the cell type of 23,093 cells (15 cell types), leaving 306 cells unannotated (see Supplementary Fig. 2 for tSNE plots coloured by tissue dissociation protocol, library or cell type, Supplementary Fig. 3 for expression of selected cell type marker genes, and Supplementary Table 2 for cell annotations).
Differential expression analysis was performed in each cell type separately using the Wilcoxon test implemented in Seurat [32] (thresholds of logFC = 0.5, minimum detection rate 0.5, FDR < 0.05). Podocytes and transitional cells (CD_Trans) were excluded from the analysis due to low cell numbers. A total of 64 genes were more highly expressed in warm-dissociated samples in at least one cell type (Fig. 2B, Supplementary Table 3). Functional analysis of these genes with ToppGene [28] again reported “regulation of cell death” as one of the top significantly enriched GO BP (an overlap of 23 genes, FDR = 3.89E-7, Supplementary Table 3). The numbers of differentially expressed genes varied among cell types (Fig. 2B), suggesting that different cell types respond differently to warm tissue dissociation.
To quantify these differences, we defined a set of stress-response-related genes (17 genes: Fosb, Fos, Jun, Junb, Jund, Atf3, Egr1, Hspa1a, Hspa1b, Hsp90ab1, Hspa8, Hspb1, Ier3, Ier2, Btg1, Btg2, Dusp1) and calculated their expression score and significance across cell types and dissociation protocols. Briefly, the average expression level of these genes was calculated in each cell, and the average expression of randomly selected control gene sets was subtracted from it. These values were then averaged within each cell type for each tissue dissociation protocol, and significance was calculated in a Monte-Carlo procedure with 1000 randomly selected gene sets of the same size (see Methods). Fig. 2C shows that significantly high expression scores were detected only in warm-dissociated samples, in eight out of 14 cell types, while the remaining six cell types did not over-express this stress-response-related gene signature in warm-dissociated kidneys.
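The scoring idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in Methods: the function names, the number of control sets, the uniform sampling of control genes and the toy expression values are all hypothetical.

```python
import random
from statistics import mean


def stress_score(cell, signature, n_control_sets=100, seed=0):
    """Mean signature expression in one cell minus the mean of random control sets.

    `cell` maps gene name -> expression. Control sets are drawn uniformly from
    non-signature genes (an assumption; Methods may use a different scheme).
    """
    rng = random.Random(seed)
    signature_set = set(signature)
    background = sorted(g for g in cell if g not in signature_set)
    signature_mean = mean(cell[g] for g in signature)
    control_means = [
        mean(cell[g] for g in rng.sample(background, len(signature)))
        for _ in range(n_control_sets)
    ]
    return signature_mean - mean(control_means)


def monte_carlo_p_value(cells, signature, n_random_sets=1000, seed=0):
    """Empirical one-sided p-value for a group of cells.

    Counts how often a random gene set of the same size reaches an average
    expression at least as high as the signature across the group.
    """
    rng = random.Random(seed)
    all_genes = sorted(cells[0])

    def group_average(gene_set):
        return mean(mean(cell[g] for g in gene_set) for cell in cells)

    observed = group_average(signature)
    hits = sum(
        group_average(rng.sample(all_genes, len(signature))) >= observed
        for _ in range(n_random_sets)
    )
    return (hits + 1) / (n_random_sets + 1)


# Toy data (hypothetical): 50 genes, 3 of which form an over-expressed signature.
genes = [f"G{i}" for i in range(50)]
toy_signature = ["G0", "G1", "G2"]
toy_cells = [{g: (5.0 if g in toy_signature else 1.0) for g in genes} for _ in range(20)]

toy_score = stress_score(toy_cells[0], toy_signature)
toy_p = monte_carlo_p_value(toy_cells, toy_signature, n_random_sets=200)
```

In this toy setting the signature stands well above the background, so the score is positive and the empirical p-value is small. With real data, the same two functions would be applied per cell type and per dissociation protocol.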
DEGs most commonly over-expressed in warm-dissociated cell populations are shown in Fig. 2D and include immediate-early response genes such as Junb and Jund (both DE in seven cell types) and Jun and Fos (both DE in five cell types). Taken together, these results highlight that certain cell types, such as immune and endothelial cells, are particularly sensitive to warm tissue dissociation.
In contrast to the 64 genes with higher expression in the warm dissociation, only 20 genes had higher expression in the cold-dissociated cell populations (Fig. 2B, Supplementary Table 4), and only five of them (Hbb-bs, Hba-a1, Hba-a2, mt-Co1 and Malat1) were identified in at least two cell types. We note that the levels of haemoglobin transcripts suggest that contamination from erythrocytes is higher in the samples dissociated on ice.
Cell composition differs between two tissue dissociation protocols
Many cell populations were less abundant in warm-dissociated samples in comparison to cold-dissociated ones (Fig. 2E left). These included podocytes, mesangial cells and principal cells of collecting duct (CD_PC), as well as more common cell types such as endothelial cells, T and B cells, macrophages and fibroblasts (chi-square test p-value < 0.001). Each of these cell populations (except for podocytes which were not tested) also showed significantly high expression of the stress-response-related gene set described above (Fig. 2C). Only three podocytes were detected in warm-dissociated samples (0.03% of the total cell count), compared to 330 (2.78%) in the cold-dissociated samples. These findings suggest that these populations are likely to be depleted in warm-dissociated tissues due to their sensitivity to the conditions of the dissociation protocol. Cells that were more abundant in warm-dissociated samples included cells of the ascending loop of Henle (aLOH, 4.99% vs. 2.52% in cold-dissociated) and proximal tubule (PT, 71.36% vs. 63.34% in cold-dissociated), potentially indicating less efficient dissociation of these cell types by cold-active protease (Fig. 2E right).
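The podocyte figures quoted above can be checked with a short calculation. The sketch below assumes the per-protocol totals stated earlier (11,257 warm- and 11,851 cold-dissociated cells) and uses a plain Pearson chi-square test on a 2×2 table (podocyte vs. all other cells, warm vs. cold) without continuity correction. The exact test set-up used in the study is not specified in this section, so this layout is an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]] (1 degree of freedom).

    Returns (statistic, p_value). For 1 df the survival function of the
    chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


warm_total, cold_total = 11257, 11851   # cells per protocol, as stated in the text
warm_podocytes, cold_podocytes = 3, 330

pct_warm = 100 * warm_podocytes / warm_total
pct_cold = 100 * cold_podocytes / cold_total

statistic, p_value = chi_square_2x2(
    warm_podocytes, warm_total - warm_podocytes,
    cold_podocytes, cold_total - cold_podocytes,
)
print(f"warm: {pct_warm:.2f}%  cold: {pct_cold:.2f}%  p < 0.001: {p_value < 0.001}")
```

The percentages reproduce the 0.03% and 2.78% reported above, and the difference is far beyond the p < 0.001 threshold under this set-up.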
Microfluidics may further alter cell type composition
To determine whether microfluidic partitioning could affect cell composition, we used bulk RNA-seq data generated on the same dissociated kidney samples (Fig. 1A, B) and BSEQ-sc [33] to predict the proportions of each cell type present in the samples before they were loaded on the microfluidics. Most notably, BSEQ-sc predicted cells of the ascending loop of Henle (aLOH) to be present at 18.6% and 14.3% (averages of 3 biological replicates for warm and cold dissociation, respectively, Supplementary Fig. 4), while they were only present at 4.99% and 2.52% in the scRNA-seq libraries (warm- and cold-dissociated, respectively). Given that cells of aLOH are thought to be the second most populous cell type in kidney (estimated at 23.71% of kidney epithelial cells [34]), the analysis suggests that the BSEQ-sc estimates are more likely to be correct and that aLOH cells may be lost in the microfluidic partitioning.
As mentioned in the previous section, podocytes are highly depleted in the warm dissociated samples (0.03% in warm vs 2.78% in cold); however, BSEQ-sc predicted podocytes to be at similar abundance in the warm- and cold-dissociated samples (0.86% and 1.36%, respectively). In addition, known podocyte markers Nphs1 and Nphs2 were not significantly differentially expressed between bulk RNA-seq profiles of warm- and cold-dissociated kidneys (logFC = 0.397, FDR = 0.039 and logFC = 0.390, FDR = 0.032, respectively, edgeR exact test [27]). Together, this suggests that microfluidic partitioning likely contributes to the depletion of podocytes specifically in warm-dissociated kidneys.
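To make the DE call for the podocyte markers explicit, the sketch below applies the thresholds quoted earlier for the bulk comparison (logFC > 2, FDR < 0.05) to the reported values. That the threshold is applied to the absolute logFC, and that the same thresholds govern this statement, are assumptions; the function name is hypothetical.

```python
def passes_de_thresholds(log_fc, fdr, min_abs_log_fc=2.0, max_fdr=0.05):
    """Return True only if both the effect-size and the FDR criteria are met."""
    return abs(log_fc) > min_abs_log_fc and fdr < max_fdr


# (logFC, FDR) for warm vs. cold bulk RNA-seq, as reported in the text.
podocyte_markers = {"Nphs1": (0.397, 0.039), "Nphs2": (0.390, 0.032)}

calls = {
    gene: passes_de_thresholds(log_fc, fdr)
    for gene, (log_fc, fdr) in podocyte_markers.items()
}
print(calls)
```

Under these assumptions, neither marker is called differentially expressed: the fold changes are well below the effect-size cut-off even though the FDR values are under 0.05.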
Comparison of cell preservation protocols
We next set out to evaluate whether cryopreservation and methanol fixation maintain cell composition and transcriptional profiles of kidneys. Aliquots of single-cell suspensions of cold- and warm-dissociated kidneys were cryopreserved (50% FBS, 40% RPMI-1640, 10% DMSO) and stored for 6 weeks or methanol-fixed and stored for 3 months. These stored samples were then profiled with 10x Genomics Chromium scRNA-seq (Fig. 1A, B) as described in Methods. The resulting datasets consisted of 11,864 and 11,298 freshly profiled cells, 11,627 and 5,545 methanol-fixed cells, and 3,519 and 3,483 cryopreserved cells derived from cold- and warm-dissociated kidneys respectively. Notably, only ∼32% of cells were recovered after cryostorage and the average viability estimated by the Countess was 75%. Despite loading similar numbers of cells, the number of high-quality cells obtained from the cryopreserved samples after quality control and filtering (see Methods) was substantially lower (∼30%) than that of the fresh and methanol fixed samples.
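The relative yield of high-quality cryopreserved cells follows directly from the cell numbers given above. This sketch simply divides the cryopreserved counts by the fresh counts for each dissociation protocol. The variable names are illustrative, and this ratio is distinct from the ∼32% recovery after cryostorage, which refers to cells counted after thawing.

```python
# High-quality cells after filtering, as reported in the text.
fresh = {"cold": 11864, "warm": 11298}
cryopreserved = {"cold": 3519, "warm": 3483}

# Cryopreserved yield relative to the matching fresh sample.
relative_yield = {
    protocol: cryopreserved[protocol] / fresh[protocol] for protocol in fresh
}

for protocol, ratio in relative_yield.items():
    print(f"{protocol}: {100 * ratio:.1f}% of the fresh yield")
```

Both ratios come out close to 30%, in line with the figure quoted above.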
Cryopreservation depletes epithelial cell types
The most prominent difference in recovery rates pertained to cells of the proximal tubule (PT), the most populous cell type in kidney [34]. In freshly profiled suspensions, PT composed 63.12% and 70.86% of all cells in cold- and warm-dissociated samples, respectively. In contrast, PT were scarcely detected in cryopreserved samples, at 0.31% and 0.57%, respectively (see Fig. 3A for cold-dissociated samples, Supplementary Fig. 5 for warm-dissociated samples, Supplementary Fig. 6 for plots showing biological replicates). We next compared recovery rates of other cell populations in freshly profiled and cryopreserved samples relative to all non-PT cells (Supplementary Fig. 7). This comparison revealed significant underrepresentation (chi-square test p-value < 0.001) of five kidney cell types in cryopreserved samples prepared with the cold dissociation protocol (intercalated cells of collecting duct (CD_IC), ascending loop of Henle (aLOH), distal convoluted tubule (DCT), connecting tubule (CNT), podocytes), three of which were also underrepresented in the cryopreserved warm dissociation samples (aLOH, DCT, CNT). Together with the loss of PT cells, this indicates that the cryopreservation and subsequent thawing protocol failed to efficiently recover kidney epithelial cell populations.
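Comparing recovery rates relative to all non-PT cells amounts to renormalising the composition after excluding the PT population. The sketch below shows that step. The cell counts are hypothetical toy values chosen for easy arithmetic, not data from this study, and the function name is illustrative.

```python
def fractions_excluding(counts, excluded):
    """Cell-type fractions recomputed after dropping the excluded cell types."""
    kept = {cell_type: n for cell_type, n in counts.items() if cell_type not in excluded}
    total = sum(kept.values())
    return {cell_type: n / total for cell_type, n in kept.items()}


# Hypothetical counts for illustration only.
toy_counts = {"PT": 700, "aLOH": 30, "DCT": 20, "Endo": 150, "Macro": 100}

non_pt_fractions = fractions_excluding(toy_counts, {"PT"})
print(non_pt_fractions)
```

Renormalising in this way prevents the very large change in PT abundance from dominating the comparison of the remaining cell types between fresh and cryopreserved samples.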
Given that others have reported that cryopreservation generated comparable data to that of fresh cells [22, 23], we repeated the experiment comparing cryopreserved and freshly profiled cold-dissociated single-cell suspension aliquots using different mice (Balb/c female), 10x chemistry (v3 as opposed to v2), storage length (2 weeks as opposed to 6 weeks) and centrifugation speed for thawing and resuspension (1200g as opposed to 400g). Again, there was a significant depletion of PT cells, which made up 55.55% of the freshly profiled cells but only 7.65% of the cryopreserved cells (Supplementary Fig. 8). From this, we conclude that, at least in the case of mouse kidneys, cryopreservation of dissociated cells (using 50% FBS, 40% RPMI, 10% DMSO) can induce substantial deleterious changes in cell composition.
In contrast to a previous report assessing storage of cell lines and immune cells [22], in the case of dissociated mouse kidneys, methanol fixation better preserved cell type composition than cryopreservation (Fig. 3A, see Supplementary Fig. 5 for warm-dissociated samples, Supplementary Fig. 6 for biological replicates). Nevertheless, certain cell types were moderately under-represented in the methanol fixed samples in comparison to freshly profiled samples, with macrophages showing the largest reduction from 5.36% to 3.2% in cold-dissociated samples and from 4.28% to 2.54% in warm-dissociated samples.
Cryopreservation induces stress response
To gain further insights into preservation-related artefacts, we compared gene expression between preserved and freshly profiled samples in each cell type separately (Wilcoxon test in Seurat [32]; logFC = 1, min detection rate 0.5, FDR < 0.05 as thresholds). A total of 31 and 27 DEGs were over-expressed in cold-dissociated samples in at least one cell type for cryopreserved or methanol-fixed cells, respectively, when compared to freshly profiled suspensions (Fig. 3B, see Supplementary Fig. 5 for warm-dissociated samples). Differential expression analysis revealed induction of stress response-related genes, including multiple immediate-early response genes and heat shock proteins, in cryopreserved samples (see Fig. 3C for cold-dissociated samples, Supplementary Table 5 for the full lists of DEGs). We also observed significantly lower expression of PT marker genes, such as Kap, Gpx3, Fxyd2 [34], in the cryopreserved libraries across most cell types. For instance, Kap and Gpx3 were differentially expressed in 12 or more cell types in both dissociation protocols (Supplementary Table 5, Supplementary Table 6). This finding suggests that the depletion of PT cells we observed in cryopreservation is accompanied by a lower contamination of other cells with highly expressed PT-specific genes.
Genes over-expressed in methanol-fixed cells were distinct from those reported for the cryopreservation. Specifically, we observed an increased contamination with genes expressed in cells of the tubule and haemoglobin genes (Fig. 3D, see Supplementary Table 7 and Supplementary Table 8 for the full lists of DEGs). The contamination affected multiple cell types and might indicate higher level of damage of tubular cells by methanol fixation.
Comparison of single-cell and single-nucleus sequencing protocols
Having identified cold-active protease as a less damaging tissue dissociation approach, we next set out to compare it to single-nuclei isolation protocols, which may be better able to dissect cell types. We performed a series of experiments using kidneys from Balb/c male mice (v2 10x chemistry) or female mice (v3 10x chemistry), cold tissue dissociation for scRNA-seq and three single-nuclei isolation protocols for snRNA-seq (Fig. 4A, Methods, Fig. 1B-E). Briefly, the first and second single-nuclei isolation protocols made use of fluorescence activated nuclei sorting (FANS). The first protocol washed the nuclei three times and used a centrifugation speed of 500g (further referred to as SN_FANS_3×500g, Fig. 1C). In the second protocol nuclei were washed only once and a centrifugation speed of 2000g was used (further referred to as SN_FANS_1×2000g, Fig. 1D). Finally, in the third protocol, nuclei were initially washed using a 500g spin and then cleaned using a sucrose cushion avoiding the requirement to sort isolated nuclei (further referred to as SN_sucrose, Fig. 1E). In addition, we performed bulk RNA-seq of intact, flash-frozen whole kidneys (Fig. 1F, Fig. 4A) and again used BSEQ-sc [33] to estimate cell type abundances (see Methods).
Detection rates of non-epithelial kidney cell types were markedly different between scRNA-seq and snRNA-seq libraries (Fig. 4B, Supplementary Table 9). Immune cells were detected at lower rates in snRNA-seq (average of 0.73%) than in scRNA-seq (average of 5.66%) across all experiments performed (Fig. 4B). Using the bulk RNA-seq from intact kidneys (Fig. 1F), we estimated approximately 4.84% should correspond to immune cells. This suggests an underrepresentation of immune cells in the snRNA-seq data. Furthermore, macrophages were the only type of immune cells recovered in snRNA-seq libraries, whereas in scRNA-seq libraries, we also detected T cells (1.3% on average), B cells (0.72%) and NK cells (0.58%). Similarly, podocytes composed only 0.7% in snRNA-seq libraries as opposed to 3.35% in scRNA-seq (Fig. 4B).
Cell types more abundant in snRNA-seq libraries included loop of Henle, endothelial and mesangial cells (Fig. 4B). The proportions of loop of Henle and mesangial cells seen in the snRNA-seq libraries were similar to those estimated from bulk RNA-seq deconvolution (Fig. 4B), and for the loop of Henle cells confirmed their rank as the second most populous cell type in kidney after PT cells [34].
We next compared the observed cell composition to estimates of epithelial cell type contribution based on quantitative renal anatomy, as reported by Clark et al. recently [34]. Fig. 4C shows that for some cell types, such as podocytes, scRNA-seq yields proportions most similar to the quantitative renal anatomy estimates, whereas for other cell types, such as loop of Henle cells, snRNA-seq better captures cell composition. Lastly, bulk RNA-seq-based proportions largely contradicted the anatomical estimates, which might reflect inaccurate deconvolution of the sample.
Differential expression analysis comparing individual cell types profiled by snRNA-seq and scRNA-seq suggested higher expression of long noncoding RNAs in snRNA-seq libraries and higher expression of mitochondrial and ribosomal genes in scRNA-seq, in agreement with previous reports [24, 26] (Supplementary Table 10).
Finally, as mitotic cells lack a nuclear membrane and in principle should not be observed in the snRNA-seq data, we inferred cell cycle phases for cells and nuclei using Seurat (see Methods) [32]. Notably Seurat predicted a higher fraction of G1 phase cells and lower fraction of S phase cells in scRNA-seq libraries when compared to snRNA-seq libraries for virtually all cell types (Fig. 4D and Supplementary Fig. 9). This suggests that there are indeed underlying biases in cell cycle phase distributions in snRNA-seq data in comparison to scRNA-seq data; however, to fully dissect this, a classifier that can discriminate mitotic cells from early G1 and late G2 is needed.
Discussion
Interrogating complex tissues at the level of individual cells is essential to understand organ development, homeostasis and pathological changes. Despite the rapid advancement and widespread adoption of scRNA-seq and snRNA-seq technologies, the associated biases remain incompletely understood. To characterise some of the biases, we performed a systematic comparison of recovered cell types and their transcriptional profiles across two tissue dissociation protocols, two single-cell suspension preservation methods and three single-nuclei isolation protocols.
Previous studies have reported on artefactual gene expression changes induced by proteolytic tissue digestion at 37°C in sensitive cell populations [16, 18]. Our findings corroborate this bias and show stronger induction of heat shock proteins and immediate-early response genes in warm-dissociated libraries in comparison to cold-dissociated libraries. In the case of cold tissue dissociation, low temperature minimises new transcription [18], hence, the cold-dissociated libraries can serve as a baseline to highlight artefactual changes induced in the warm-dissociated cell populations. Our results further indicate that cell populations prone to these transcriptional changes are also depleted from the samples, with podocytes being the extreme example of a cell type practically lost in warm-dissociated libraries. Over-expression of stress-response-related genes was also detected by bulk RNA-seq analysis of dissociated tissues, confirming that this artefact stems from the dissociation protocol rather than from microfluidic separation, single cell sequencing or data processing. These findings have important implications and suggest that data from samples digested at 37°C needs to be interpreted in light of this bias.
One possible drawback of cold tissue dissociation is lower efficiency of releasing hard-to-dissociate cell types in comparison to warm tissue dissociation. In our study, this may have affected cells of loop of Henle, which were detected at 2.52% in cold- and at 4.99% in warm-dissociated samples. However, both protocols dramatically under-estimated abundance of this second most populous kidney cell type. While one possible explanation could be incomplete tissue dissociation in both cases, deconvolution of bulk RNA-seq profiles of single-cell suspensions indicated that cells might be lost during cell encapsulation on the microfluidic device.
Two recent studies have shown that cryopreservation generated comparable data to that of fresh cells for cell lines and immune cells, and also for complex tissues cryopreserved prior to single-cell separation [22, 23]. Here, however, we report that cryopreservation of single-cell suspensions of dissociated mouse kidneys resulted in depletion of epithelial cell types. This artefact was reproducible across two mouse strains, both sexes and two 10x chemistry versions. However, we observed a higher fraction of recovered PT cells (7.65% vs. 0.57%) in the repeated experiment, which might be explained by either sex or strain differences, or higher sensitivity of 10x v3 chemistry. Together with the depletion of PT cells, we observed reduced contamination with their highly expressed transcripts, which indicates that the cells might be lost in the thawing and resuspension. A possible explanation for the differences from previous reports is the proportion of serum used in the freezing media. 10x Genomics recommends 40% FBS (10x Genomics, CG00039, Rev D), whereas the other studies used either 90% FBS (peripheral blood, minced tissues, cell lines and immune cells) or 10% FBS (cell lines) [22, 23]. Notably, despite loading similar numbers of fresh, methanol-fixed and cryopreserved cells, the number of usable cells observed in the cryopreserved samples was only ∼30% of the others. This raises the possibility that the missing PT cells may be present but are failing to make it into the microfluidics, failing to lyse, or are so badly damaged that there is insufficient RNA remaining to generate a usable library. In contrast to cryopreservation, cell composition of methanol-fixed suspensions resembled that of freshly profiled libraries. Similarly to previous studies, we observed contamination with highly abundant transcripts, suggesting cell damage by methanol fixation [22]. However, this contamination did not impede cell type identification in our experiment.
Studies comparing scRNA-seq and snRNA-seq reported that, although the two technologies profile different RNA fractions, both detect sufficient genes and allow adequate representation of cell populations [24–26]. In this work, one of the most notable differences between single-cell and single-nuclei experiments was the low detection rate of immune cells, in particular the failure to detect T, B, or NK cells in any of the snRNA-seq libraries. This depletion of leukocytes is also observed in the Wu et al. [26] dataset (commented upon by O’Sullivan et al. [35]). Notably Slyper et al. [36] also observe much lower fractions of T, B, and NK cells in matched snRNA-seq - scRNA-seq datasets from adjacent pieces of a metastatic breast cancer and a neuroblastoma.
As Wu et al. have suggested, although these differences might indicate under-estimation of immune cells by snRNA-seq, another plausible explanation is that immune cell content is inflated in single-cell experiments as other cell types may be underrepresented due to incomplete dissociation. A major hurdle to determining which explanation is correct is the lack of a “ground truth” for cell composition of mouse kidneys. Clark et al. recently reported cell frequency estimates based on quantitative renal anatomy. However, these were restricted to renal epithelial cells [34]. Based on these estimates, some cell types, such as podocytes, appear to be better represented in scRNA-seq, whereas others, such as loop of Henle cells, were captured more effectively by snRNA-seq. We also attempted to use computational deconvolution of bulk RNA-seq of intact kidneys to infer its cell composition. However, the approach is sensitive to the input marker gene list used and may overlook rare and novel cell types. In addition, cell abundance estimates from bulk data would be influenced by both cell number and relative mRNA content of each cell. We will continue to search for approaches to better define the “ground truth” for cell composition.
Conclusions
Our comparison of two tissue dissociation protocols revealed better performance of the cold dissociation protocol, while traditional digestion at 37°C introduced artefactual changes in sensitive cell populations, affecting both representation of cell types and their transcriptional profiles. We also found that profiling of fresh single-cell suspensions is preferred; however, if immediate sample processing is challenging, methanol fixation gives satisfactory results while introducing moderate cell damage. In contrast, cryopreservation of dissociated cells induces a stress response and results in loss of the main epithelial cell type in kidney. Finally, we highlight differences in cell type composition between scRNA-seq and snRNA-seq libraries. Both approaches appear to have specific biases; thus, when possible, studies would benefit from applying both to the same tissue.
Methods
Mice
Acknowledging the principles of 3Rs (Replacement, Reduction and Refinement) all kidneys used in this study were from mice that were euthanised by cervical dislocation as parts of other ongoing ethically approved experiments. In the first series of experiments, comparing cold and warm tissue dissociation and two preservation protocols, male AFAPIL.1DEL C57BL/6J mice from the same litter were used. These mice were 19 weeks old when euthanised and had no exposure to any experimental procedures. For the subsequent experiments, comparing cold-dissociated scRNA-seq to single-nuclei isolation protocols, we used untreated 18 week old male Cnga3.GFP Balb/c mice from the same litter or untreated 15 week old female wild type Balb/c mice that were previously used as breeders, as specified in Fig. 4A.
Kidney harvesting
Mice were euthanised and their kidneys were dissected and placed into a 1.5mL tube containing 1mL of ice cold PBS. The capsules were then removed on ice and the samples processed as detailed below.
Warm tissue dissociation
Kidneys were dissociated using the Multi Tissue Dissociation Kit 2 from Miltenyi Biotec [130-110-203] as per the manufacturer’s instructions, with minor variations. Once the weight of the kidney was determined, the kidney was quartered and placed into a gentleMACS C-tube [Miltenyi Biotec; 130-096-334] containing the enzyme mix described in the kit’s protocol. The tube was centrifuged briefly, then placed onto the gentleMACS octo dissociator (Miltenyi Biotec), and the 37C_Multi_E program was run after attaching the heating elements. Following completion of the program, the tube was briefly centrifuged.
The homogenate was filtered through a 70µm cell strainer [Greiner; 54207] into a 50mL centrifuge tube [Greiner; 227270], and the strainer was then rinsed with 15mL of PBS. The cell suspension was centrifuged at 400g for 10 minutes; once complete, the supernatant was removed and the pellet was resuspended in 5mL of PBS+0.04% BSA [Sigma; A7638]. The cell suspension was then filtered through a 40µm strainer [Greiner; 542040], which was subsequently rinsed with 2mL of PBS+0.04% BSA. The cells were again centrifuged at 400g for 10 minutes. The supernatant was then removed and the pellet was resuspended in 5mL of PBS+0.04% BSA. Cell count and viability were estimated using the Countess II FL (ThermoFisher) and the ReadyProbes Blue/Red kit [Invitrogen; R37610]. The cells were then diluted to 700cells/µL and were immediately loaded onto a 10x chip A and processed on the 10x Chromium controller. The remaining cells were then either methanol fixed or cryopreserved.
Cold tissue dissociation
Kidneys were dissociated using a modified version of the published protocol described in [18]. For each kidney, a protease solution (5mM CaCl2 [Invitrogen; AM9530G], 10mg/mL B. licheniformis protease [Sigma; P5380], 125U/mL DNase I [Sigma; D5025], 1xDPBS) was prepared in a pre-cooled Miltenyi C-tube, with the volume based on the weight of the kidney.
The kidneys were then minced on ice into a smooth paste using a scalpel. The minced kidney was transferred into 4-6mL of the protease solution (dependent on weight) and triturated using a 1mL pipette for 15 seconds every two minutes for a total of eight minutes.
Following trituration, the C-tubes were placed onto a Miltenyi gentleMACS octo dissociator in a cool room (4°C), and the m_brain_03 program was run twice in succession. Once complete, the samples were triturated for 15 seconds every two minutes on ice for an additional 16 minutes using a 1mL pipette. 10µL of each sample was then loaded into a haemocytometer to assess whether tissue dissociation was complete. The dissociated cells were transferred to a 15mL centrifuge tube and 3mL of ice-cold PBS+10%FBS [Gibco; A3160401] was added.
The cell suspension was centrifuged at 1200g for five minutes at 4°C. The supernatant was removed and the pellet was resuspended in 2mL of PBS+10%FBS. The cells were then filtered through a 70µm cell strainer, which was subsequently rinsed with 2mL of PBS+0.01% BSA. The cells were then centrifuged again at 1200g for five minutes at 4°C followed by removal of the supernatant and resuspension of the pellet in 5mL of PBS+0.01%BSA. The cells were then filtered through a 40µm cell strainer, which was subsequently rinsed with 2mL of PBS+0.01% BSA. The cells were again centrifuged at 1200g for five minutes at 4°C followed by removal of the supernatant and resuspension of the cells in 5mL of PBS+0.04%BSA. The cells were counted and checked for viability using the ReadyProbes Blue/Red Kit on the Countess II FL. The cells were further diluted to a concentration of 700cells/µL with PBS/0.04%BSA and loaded directly onto a 10x chip (A/B depending on experiment) and isolated using the 10x Chromium controller. The remaining cells were either methanol fixed or cryopreserved.
Methanol fixation
Fixing
The methanol-fixation protocol was based on [37]. After tissue dissociation, the cells were concentrated to approximately 5×10^6 cells/mL by centrifuging at 1000g for 10 minutes. 200µL of the cell suspensions were aliquoted into 2mL cryovials resting on ice. 800µL of 100% methanol [Sigma; 494437] (chilled at −20°C) was then added dropwise to each sample while gently stirring the cells to prevent clumping. The cryovials were stored at −20°C for 30 minutes, then directly transferred to −80°C (no gradient cooling).
Rehydrating
Cryovials of methanol fixed cells were removed from −80°C and placed on ice to equilibrate to 4°C (approximately 10 minutes). The cells were then transferred to a 1.5mL centrifuge tube and centrifuged at 1000g for five minutes at 4°C. The supernatant was discarded and the pellet was resuspended in a small volume of SSC cocktail (3xSSC [Sigma; S0902], 0.04% BSA, 40mM DTT [Sigma; 43816], 0.5U/mL RNasin plus [Promega; N2615]) to reach a concentration of approximately 2000cells/µL. The cells were then filtered through a pre-wetted (with 1mL of nuclease-free water) 40µm pluristrainer mini filter [PluriSelect; 43-10040]. The cells were counted using the ReadyProbes Blue/Red Kit on the Countess II, then adjusted to 2000cells/µL based on the count. The cells were loaded onto a 10x chip (A/B depending on version used) at a volume that dilutes the SSC to 0.125x to prevent reverse transcription inhibition.
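The SSC constraint can be expressed as simple arithmetic: diluting 3x SSC to 0.125x is a 24-fold dilution, so the rehydrated cell suspension can make up at most 1/24 of the final reaction volume. The sketch below illustrates this; the 100µL reaction volume is a hypothetical placeholder, not a value from this protocol or from the 10x documentation.

```python
def max_sample_volume_ul(ssc_stock_x, ssc_final_x, reaction_volume_ul):
    """Largest volume of SSC-containing sample that keeps SSC at or below ssc_final_x."""
    fold_dilution = ssc_stock_x / ssc_final_x
    return reaction_volume_ul / fold_dilution


fold = 3 / 0.125  # 24-fold dilution from 3x SSC to 0.125x SSC

# Hypothetical 100 uL reaction volume; cells rehydrated at 2000 cells/uL as in the text.
sample_ul = max_sample_volume_ul(3, 0.125, 100)
cells_loaded = 2000 * sample_ul
print(f"{fold:.0f}-fold dilution, {sample_ul:.2f} uL of sample, about {cells_loaded:.0f} cells loaded")
```

Because the sample volume is capped by the SSC dilution, the number of cells that can be loaded is capped as well, which is why the suspension is kept at a high concentration.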
Cryopreservation
Freezing
After tissue dissociation, the cells were centrifuged at 400g for 10 minutes (1200g for five minutes at 4°C for the repeated experiment), then resuspended in freezing media (50% FBS, 40% RPMI-1640 [Gibco; 11875093], 10% DMSO [Sigma; D4540]) to achieve a concentration of 1×10^6 cells/mL. 1mL of the cell suspension was aliquoted into 2mL cryovials, then placed into an isopropanol freezing container (Mr. Frosty) and stored at −80°C overnight. The following day, the cells were transferred to liquid nitrogen storage.
Thawing
The samples were removed from −80°C and immediately placed into a 37°C waterbath for 2-3 minutes to rapidly thaw. The cells were then mixed using a 1mL pipette with a wide-bore tip and the entire volume transferred to a 15mL centrifuge tube [Greiner; 188261]. The cryovial was then rinsed twice with RPMI+10%FBS (rinse media); each time the 1mL of media was added to the 15mL centrifuge tube in a dropwise manner while gently shaking the tube. 7mL of rinse media was added to the centrifuge tube using a serological pipette – the first 4mL was added dropwise while gently shaking the tube, and the following 3mL added down the side of the tube over two seconds. The tube was then inverted to mix.
The cells were centrifuged at 300g for five minutes. Once completed, the supernatant was removed (leaving 1mL), placed into another 15mL centrifuge tube and centrifuged at 400g for five minutes. The supernatant was discarded (leaving 1mL). The pellet from the supernatant was then resuspended, combined with the pellet in the initial centrifuge tube and mixed. 2mL of PBS+0.04% BSA was added to the centrifuge tube and shaken gently to mix. The cells were then centrifuged again at 400g for five minutes. The supernatant was discarded leaving 0.5mL behind. 0.5mL of PBS+0.04% BSA was added to the cells and gently pipette-mixed 10-15 times to fully resuspend. The cells were then filtered through a pre-wetted (with 1mL of PBS+0.04% BSA) 40µm pluristrainer mini filter. A 20µL aliquot of the cells was used to obtain an estimate of cell count and viability using the ReadyProbes Blue/Red Kit on the Countess II FL. Based on the count, the cells were diluted to a concentration of 700cells/µL. The cells were then loaded onto a 10x chip (A/B depending on version) and immediately processed on the 10x Chromium controller.
For the repeated experiment, the above method was altered: Rather than a 300g spin followed by two 400g spins, two 1200g spins were performed, omitting the second centrifugation step.
Flash freezing of whole kidney
Following the removal of the renal capsule, the kidney was placed into an isopentane [Sigma; 320404] bath resting on dry ice for five minutes. The temperature of the bath was maintained between −30°C and −40°C. Once frozen, the kidney was placed into a pre-cooled (on dry ice) cryovial and then buried in dry ice. The process was repeated for all designated kidneys. The flash-frozen kidneys were then transferred to a −80°C freezer for storage.
Single nuclei isolation
SN_FANS_3×500g
This method is an adaptation of the Frankenstein protocol [38] and the 10x demonstrated protocol [39].
The kidneys were removed from −80°C and immediately placed on ice. Each kidney was then transferred to a 1.5mL tube containing 300µL of chilled lysis buffer (10mM Tris-HCl [Invitrogen; AM9856], 3mM MgCl2 [Invitrogen; AM9530G], 10mM NaCl [Sigma; 71386], 0.005% Nonidet P40 substitute [Roche; 11754599001], 0.2U/mL RNasin plus) and incubated on ice for two minutes. The tissue was then completely homogenised using a pellet pestle [Fisherbrand; FSB12-141-364] using up and down strokes without twisting. 1.2mL of chilled lysis buffer was added to the tube and pipette-mixed (wide-bore). The full volume was then transferred to a pre-cooled 2mL tube. The homogenate was incubated on ice for five minutes and mixed with a wide-bore tip every 1.5 minutes.
Following the incubation, 500µL of the lysis buffer was added to the homogenate, which was subsequently pipette-mixed and split equally into four 2mL tubes. 1mL of chilled lysis buffer was added to each tube and pipette-mixed using a wide-bore tip. The four tubes were incubated for a further five minutes on ice, mixing with a wide-bore tip every 1.5 minutes. The homogenate from the four tubes was then filtered through a 40µm strainer into a pre-cooled 50mL centrifuge tube. Following this, the sample was split again into four 2mL tubes resting on ice.
The samples were centrifuged at 500g for five minutes at 4°C. The supernatant was removed leaving 50µL in the tube. 1.5mL of lysis buffer was then added to two of the tubes and the pellet resuspended by mixing with a pipette. This resulted in two tubes containing 1.5mL resuspended nuclei in lysis buffer, and two tubes containing a nuclei pellet in 50µL of lysis buffer. The resuspended nuclei in one tube was then combined with the nuclei pellet of another, resulting in two tubes containing resuspended nuclei in lysis buffer.
The nuclei were centrifuged again at 500g for five minutes at 4°C. The supernatant was removed completely and discarded. 500µL of nuclei wash buffer (1xDPBS, 1% BSA, 0.2U/mL RNasin plus) was added to the tube containing the pellet and left to incubate without resuspending for five minutes. Following incubation, an additional 1mL of nuclei wash buffer was added, and the nuclei were resuspended by gently mixing with a pipette. The nuclei were again centrifuged at 500g for five minutes at 4°C, followed by discarding of the supernatant. The pellets were resuspended in 1.4mL of nuclei wash buffer, then transferred into a pre-cooled 1.5mL tube. Another 500g centrifugation step for five minutes at 4°C was performed. The supernatant was then discarded, and the nuclei pellet was resuspended in 1mL of nuclei wash buffer.
The nuclei were then filtered through a 40µm pluristrainer mini filter. 200µL of the filtered nuclei suspension was transferred into a 0.5mL tube and set aside to be used as the unstained control for sorting. To the remaining 800µL, 8µL of DAPI (10µg/mL) [ThermoScientific; 62248] was added, and the nuclei were mixed with a pipette. A quality control step was performed by viewing the nuclei under a fluorescence microscope on a haemocytometer to check nuclei shape and count.
A BD Influx Cell Sorter was then used to sort 100,000 DAPI-positive events using a 70µm nozzle and a pressure of 22 psi (as per gating strategy, Supplementary Fig. 10). The post-sort nuclei concentration and quality were then checked using a fluorescence microscope and haemocytometer. Nuclei were then loaded onto a 10x chip (A/B depending on version used) and processed immediately on the 10x Chromium controller.
SN_FANS_1×2000g
The flash frozen kidneys were removed from −80°C and transferred to a 1.5mL tube containing 500µL of pre-chilled lysis buffer (same recipe as previous protocol) and allowed to rest on ice for two minutes. Each kidney was then homogenised with a pellet pestle with 40 up and down strokes without twisting the pestle. The resulting homogenate was mixed with a pipette and transferred to a pre-cooled 15mL centrifuge tube containing 2mL of lysis buffer. The homogenate was incubated for 12 minutes on ice with mixing every two minutes using a glass fire-polished silanised Pasteur pipette [Kimble; 63A54]. Once incubation was complete, 2.5mL of nuclei wash buffer (same recipe as previous protocol) was added to the homogenate. The remaining tissue fragments were completely dissociated by repeated trituration of the homogenate using the glass Pasteur pipette.
The homogenate was then filtered through a 30µm MACS Smart Strainer [Miltenyi Biotec; 130-098-458] into a new 15mL centrifuge tube. The nuclei were centrifuged at 2000g for five minutes at 4°C. The supernatant was removed and the nuclei pellet was resuspended in 1mL of nuclei wash buffer. 200µL was aliquoted into a 0.5mL tube to be used as an unstained control for sorting. 8µL of DAPI (10µg/mL) was added to the remaining 800µL of nuclei. Quality and quantity of the nuclei were checked using a fluorescence microscope prior to sorting. Sorting and post-sorting QC were performed in the same manner as for the SN_FANS_3×500g protocol. Nuclei were then loaded onto a 10x chip B and processed immediately on the 10x Chromium controller.
SN_sucrose
Kidneys were removed from −80°C and transferred to a 1.5mL tube containing 500µL of pre-chilled lysis buffer II (same recipe as previous protocols, with 125U/mL of DNase I added) and allowed to rest on ice for two minutes. Each kidney was then homogenised using a pellet pestle with 40 up and down strokes without twisting. The homogenate was transferred to a 15mL centrifuge tube containing 2mL of lysis buffer II and incubated for 12 minutes on ice with mixing every two minutes using a glass fire-polished silanised Pasteur pipette. Following the incubation, 2.5mL of nuclei wash buffer II (1xDPBS+2%BSA) was added to the homogenate. Remaining tissue clumps were dissociated by repeated trituration of the homogenate using the glass Pasteur pipette.
The homogenate was then filtered through a 30µm MACS Smart Strainer into a new 15mL centrifuge tube. Subsequently, the homogenate was centrifuged at 2000g for five minutes at 4°C. The supernatant was removed, and the pellet was resuspended in 510µL of nuclei wash buffer II. 10µL of the suspension was transferred to a 1.5mL tube and placed on ice for use in nuclei recovery calculations. 900µL of 1.8M sucrose solution [Sigma; NUC201] was added to the remaining 500µL of nuclei suspension and homogenised by mixing with a pipette. 3.6mL of 1.3M sucrose solution [Sigma; NUC201] was added to a 5mL tube. The nuclei/sucrose homogenate was then gently layered on top of the 1.3M sucrose solution.
The 5mL tube containing the sucrose solutions and nuclei was then centrifuged at 3000g for 10 minutes at 4°C. Once centrifugation was complete, the sucrose phase containing debris was soaked up using a Kimwipe wrapped around a pellet pestle. The remaining supernatant was removed and discarded using a pipette. The nuclei pellet was then resuspended in 5mL of wash buffer II, of which 10µL was transferred to a 1.5mL tube to assess nuclei recovery.
To the 10µL of nuclei suspension removed prior to the sucrose gradient, 980µL of wash buffer II and 10µL of DAPI (10µg/mL) were added. To the 10µL of nuclei suspension removed after the sucrose gradient, 89µL of wash buffer II and 1µL of DAPI (10µg/mL) were added. The yields from the pre- and post-sucrose aliquots were compared to assess nuclei recovery after filtration through the gradient. The post-sucrose count was used to dilute the nuclei to a concentration of 700nuclei/µL, which was immediately loaded onto a 10x Chip B and processed with the 10x Chromium controller.
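The recovery comparison amounts to back-calculating total nuclei from each diluted aliquot. The pre-sucrose aliquot is diluted 100-fold (10µL in a total of 1000µL) and the post-sucrose aliquot 10-fold (10µL in a total of 100µL). The sketch below assumes that 500µL of suspension was loaded onto the gradient and that the pellet was resuspended in 5mL, as described above; the counted concentrations are hypothetical.

```python
def total_nuclei(counted_per_ul, aliquot_ul, buffer_ul, dapi_ul, suspension_ul):
    """Back-calculate total nuclei in a suspension from a count on a diluted aliquot."""
    dilution_factor = (aliquot_ul + buffer_ul + dapi_ul) / aliquot_ul
    return counted_per_ul * dilution_factor * suspension_ul


# Hypothetical haemocytometer readings (nuclei per uL of the diluted, stained aliquot).
pre = total_nuclei(counted_per_ul=50, aliquot_ul=10, buffer_ul=980, dapi_ul=10, suspension_ul=500)
post = total_nuclei(counted_per_ul=30, aliquot_ul=10, buffer_ul=89, dapi_ul=1, suspension_ul=5000)

recovery = post / pre
print(f"pre: {pre:.0f} nuclei, post: {post:.0f} nuclei, recovery: {recovery:.0%}")
```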
Single cell RNA-seq library preparation
All single cell libraries were constructed in biological triplicate using the 10x Chromium 3’ workflow as per the manufacturer’s directions. In the first series of experiments, comparing cold and warm tissue dissociation and two preservation protocols, version two chemistry was used. For single-cell versus single-nuclei comparisons, versions two and three were used as indicated in Fig. 4A. All experiments and conditions aimed for a capture of approximately 9000 cells, except for methanol fixed samples. Because 3x SSC inhibits reverse transcription, these samples had to be loaded at a concentration of 0.125x SSC, resulting in an approximate cell capture of 4000-5000 cells.
Bulk RNA-seq library preparation
For the undissociated samples, total RNA was extracted from flash-frozen kidneys using the Nucleospin RNA Midi kit [Macherey Nagel; 740962.20] as per the manufacturer’s directions. For the dissociated samples, total RNA was extracted from the remaining cells from each of the tissue dissociation protocols. RNA was assessed for quantity and quality using the TapeStation 4200 RNA ScreenTape kit [Agilent; 5067-5576], which showed all RNA used had a RIN of >8. Bulk RNA-seq was performed using the NEBNext Ultra II RNA Library Kit for Illumina [NEB; E7760] and NEBNext rRNA Depletion Kit (Human/Mouse/Rat) [NEB; E6310] as described in the manufacturer’s protocol, with 100ng of total RNA as input.
Sequencing
All libraries were quantified with qPCR using the NEBnext Library Quant Kit for Illumina and checked for fragment size using the TapeStation D1000 kit (Agilent). The libraries were pooled in equimolar concentration for a total pooled concentration of 2nM. 10x single cell libraries were sequenced using the Illumina NovaSeq 6000 and S2 flow cells (100 cycle kit) with a read one length of 26 cycles and a read two length of 92 cycles for version two chemistry. Version three chemistry had a read one length of 28 cycles and a read two length of 94 cycles. Bulk libraries were sequenced on the Illumina NovaSeq 6000 using SP flow cells (100 cycle kit), with read one and read two lengths of 150 for dissociated bulk and 50 for undissociated bulk.
Bulk RNA-seq data processing
BCL files were demultiplexed and converted into FASTQ using the bcl2fastq utility of Illumina BaseSpace Sequence Hub. FastQC was used for read quality control (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Adapters and low-quality bases were trimmed using Trim Galore with parameters --paired --quality 5 --stringency 5 --length 20 --max_n 10 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/). Reads matching the ribosomal DNA repeat sequence BK000964 (https://www.ncbi.nlm.nih.gov/nuccore/BK000964) and low complexity reads were removed with TagDust2 [40]. The remaining reads were mapped to the GRCm38.84 version of the mouse genome using STAR version 2.6.1a with default settings [41]. The Picard MarkDuplicates tool was employed to identify duplicates (https://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates). FeatureCounts was then used to derive the gene count matrix [42]. Counts were normalised to gene length and then to library size using the weighted trimmed mean of M-values (TMM) method in edgeR [27], to derive gene length corrected trimmed mean of M-values (GeTMM) as described in [43].
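The order of operations in the final normalisation step (gene length first, library size second) can be illustrated with a short sketch. This is not the edgeR implementation: the TMM scaling factor is taken here as a precomputed input (defaulting to 1.0), and the function names and example values are hypothetical.

```python
def reads_per_kilobase(counts, gene_lengths_bp):
    """Divide each gene count by its gene length in kilobases."""
    return [c / (length / 1000.0) for c, length in zip(counts, gene_lengths_bp)]


def getmm_like(counts, gene_lengths_bp, tmm_factor=1.0):
    """Length-corrected counts scaled to a TMM-adjusted library size, per million.

    tmm_factor stands in for the per-sample factor that edgeR would estimate;
    estimating it is outside the scope of this sketch.
    """
    rpk = reads_per_kilobase(counts, gene_lengths_bp)
    effective_library_size = sum(rpk) * tmm_factor
    return [value / effective_library_size * 1e6 for value in rpk]


# Hypothetical sample with three genes whose counts scale with gene length.
values = getmm_like(counts=[100, 200, 300], gene_lengths_bp=[1000, 2000, 3000])
print(values)
```

In this example all three genes receive the same value, because their raw counts differ only in proportion to gene length.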
scRNA-seq and snRNA-seq data processing
BCL files were demultiplexed and converted into FASTQ using the bcl2fastq utility of Illumina BaseSpace Sequence Hub. scRNA-seq and snRNA-seq libraries were processed using Cell Ranger 2.1.1 with the mm10-2.1.0 reference. Reads mapped to exons were used for scRNA-seq samples, whereas both intronic and exonic reads were counted for snRNA-seq. A custom pre-mRNA reference for snRNA-seq was built as described in https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/references#premrna. Raw gene-barcode matrices from the Cell Ranger output were used for downstream processing. Cells were distinguished from background noise using EmptyDrops [44]. Only genes detected in a minimum of 10 cells were retained; cells with 200-3000 detected genes and under 50% of mitochondrial reads were retained, following Park et al. [30]. Nuclei were additionally filtered to have at least 450 UMIs for v2 chemistry and 900 UMIs for v3 chemistry, and mitochondrial genes were removed. Outlier cells with a high ratio of detected UMIs to detected genes (>3 median absolute deviations from the median) were removed using Scater [45]. Seurat v2 was used for sample integration (canonical correlation analysis), normalisation (dividing by the total counts, multiplying by 10,000 and natural-log transforming), scaling, clustering and differential expression analysis (Wilcoxon test) [32].
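The filtering and normalisation rules above can be summarised in a few lines. The sketch below is illustrative only and does not reproduce the Scater or Seurat code: the function names are hypothetical, the 200-3000 gene range is assumed to be inclusive, the log transform is assumed to be log(1 + x), and the MAD is assumed to be scaled by the conventional constant 1.4826.

```python
import math
from statistics import median


def passes_qc(n_genes, mito_fraction, n_umis=None, min_umis=None):
    """Apply the gene-count and mitochondrial thresholds; UMI floor is optional (nuclei)."""
    if not 200 <= n_genes <= 3000:
        return False
    if mito_fraction >= 0.5:
        return False
    if min_umis is not None and n_umis < min_umis:
        return False
    return True


def high_outliers(values, n_mads=3.0):
    """Flag values more than n_mads scaled MADs above the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values]) * 1.4826
    return [v > med + n_mads * mad for v in values]


def log_normalise(counts, scale=10_000):
    """Divide by total counts, multiply by the scale factor, then take log(1 + x)."""
    total = sum(counts)
    return [math.log1p(c / total * scale) for c in counts]


# Hypothetical UMI-to-gene ratios for six cells; only the last one is extreme.
flags = high_outliers([2.0, 2.1, 1.9, 2.0, 2.2, 6.0])
print(flags)
print(passes_qc(n_genes=1500, mito_fraction=0.1, n_umis=400, min_umis=450))
```

For nuclei, `min_umis` would be set to 450 for v2 chemistry or 900 for v3 chemistry, as stated above.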
Inferring cell identity
To infer cell identity for freshly profiled samples in the first series of experiments, we performed a reference-based annotation using scMatch [29] and refined cell labels based on marker gene expression in a two-step procedure described below (Supplementary Fig. 1).
Reference dataset
To construct the reference dataset for scMatch [29], we obtained gene counts and cell types reported in three single-cell (or single-nuclei) adult mouse kidney studies [26, 30, 31]. Counts were normalised to cell library size and averaged within each cell type to derive reference vectors (Supplementary Fig. 1, Step 1). The reference vectors were clustered using Spearman correlation coefficient and five vectors were removed as outliers. The remaining 66 vectors composed a reference dataset, available as Supplementary Table 11. With this reference dataset, we ran scMatch [29] (Supplementary Fig. 1, Step 2) using options --testMethod s --keepZeros y to label each individual cell with the closest cell type identity from the reference dataset.
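As an illustration of how such reference vectors can be built and compared, the sketch below normalises each cell to its library size, averages within cell type, and computes a Spearman correlation between two vectors. It is a simplified stand-in rather than the scMatch code; the function names and toy data are hypothetical, and every cell is assumed to have a non-zero total count.

```python
from collections import defaultdict


def reference_vectors(cells, labels):
    """Average library-size-normalised profiles within each cell type."""
    sums, n_cells = {}, defaultdict(int)
    for counts, label in zip(cells, labels):
        total = sum(counts)
        normalised = [c / total for c in counts]
        previous = sums.get(label, [0.0] * len(counts))
        sums[label] = [s + x for s, x in zip(previous, normalised)]
        n_cells[label] += 1
    return {label: [s / n_cells[label] for s in vec] for label, vec in sums.items()}


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Toy data: three cells, two genes, two hypothetical cell types.
vectors = reference_vectors([[1, 3], [3, 1], [2, 2]], ["A", "A", "B"])
print(vectors)
print(spearman([1, 2, 3, 4], [10, 20, 40, 30]))
```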
Refining cell identities
We next refined scMatch-derived cell types based on gene expression. First, for each cell type, we calculated gene signatures as genes over-expressed in the given cell type when compared to all other cells (FindMarkers function of Seurat [32] with a minimum detection rate of 0.5, a logFC threshold of 1 and FDR < 0.05; only cell types with at least 10 cells were considered; Supplementary Fig. 1, Step 3). Second, cell type gene signature scores were calculated for each cell and for each gene signature (AddModuleScore function of Seurat [32]; genes attributed to signatures in more than two cell types were excluded; Supplementary Fig. 1, Step 4). Third, we used these scores to assign cell types to cells (Supplementary Fig. 1, Step 5). A cell type was assigned to a cell if the score for that cell type was the highest among all cell types, positive, and significant with FDR < 0.05. Significance was determined in a Monte-Carlo procedure with 1000 randomly selected gene sets of the same size [46], and correction for multiple hypothesis testing was performed using the Benjamini-Hochberg procedure [47]. Cells without cell type annotation were manually explored to identify whether the corresponding cell type might be a novel one, absent from the reference.
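The assignment rule combines an empirical p-value, a Benjamini-Hochberg adjustment and a three-part decision (highest, positive, significant). The sketch below shows one way this could be written. It simplifies the procedure: the null scores are plain mean expression over random gene sets rather than AddModuleScore values, the add-one correction in the p-value is an assumption, and all function names are hypothetical.

```python
import random


def monte_carlo_null(expression, set_size, n_sets=1000, seed=0):
    """Simplified null: mean expression of n_sets random gene sets of the same size."""
    rng = random.Random(seed)
    genes = list(expression)
    return [
        sum(expression[g] for g in rng.sample(genes, set_size)) / set_size
        for _ in range(n_sets)
    ]


def empirical_p(observed, null_scores):
    """One-sided empirical p-value with an add-one correction (an assumption)."""
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def assign_cell_type(scores, fdrs, alpha=0.05):
    """Assign the top-scoring cell type only if its score is positive and significant."""
    best = max(scores, key=scores.get)
    if scores[best] > 0 and fdrs[best] < alpha:
        return best
    return None


# Hypothetical scores and adjusted p-values for one cell.
print(assign_cell_type({"PT": 0.8, "DCT": 0.1}, {"PT": 0.01, "DCT": 0.40}))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A cell for which `assign_cell_type` returns `None` corresponds to a cell left without annotation at this step.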
Second iteration
Cell types inferred in our dataset were added to the reference dataset (Supplementary Fig. 1, Step 6), and annotation with scMatch and gene set signature scoring was repeated. Cells left unannotated at this stage were labelled as “unknown”. Cell type gene signatures are available in Supplementary Table 12.
This approach failed to identify cells of connecting tubule (CNT) and, instead, matched them to other similar cell types. To resolve this, annotation for cell types labelled as DCT, aLOH, CD_IC, CD_PC, CD_Trans was additionally refined as follows. These cells were extracted from the dataset and clustered separately. Candidate CNT cells were identified as a cluster over-expressing Calb1 and Klk1 genes [34, 48]. The cell type signature score procedure was then applied for this subset as described above.
Cell type labels assigned to each cell are available in Supplementary Table 2.
Preserved cells
Cells of preserved single-cell suspensions from the first series of experiments were annotated using the cell type gene signatures derived from the corresponding freshly profiled samples (Supplementary Table 12) and the gene set signature scoring procedure described above. Cell type labels assigned to each cell are available in Supplementary Table 2.
Subsequent experiments
In subsequent experiments, we used a combined reference dataset, which included the public data as well as data from freshly profiled cells generated in the first series of experiments (Supplementary Table 13, note that two cell types were excluded from the reference as outliers). Single-cell datasets were annotated using a single iteration of scMatch. For single-nucleus datasets we repeated the two-step annotation procedure described above. Cell type labels assigned to each cell or nucleus are available in Supplementary Table 2.
Stress response score
Stress response score was calculated for 17 genes (Fosb, Fos, Jun, Junb, Jund, Atf3, Egr1, Hspa1a, Hspa1b, Hsp90ab1, Hspa8, Hspb1, Ier3, Ier2, Btg1, Btg2, Dusp1) for each cell using the AddModuleScore function of Seurat version 2 [32]. The score represents the average expression level of these genes at the single-cell level, minus the aggregated expression of control gene sets. All analysed genes were binned based on averaged expression, and the control genes were randomly selected from each bin. Significance was determined in a Monte-Carlo procedure with 1000 randomly selected sets of 17 genes [46], and correction for multiple hypothesis testing was performed using the Benjamini-Hochberg procedure [47].
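The logic of a bin-matched module score can be sketched as follows. This is not the Seurat implementation: the number of bins, the number of control genes drawn per signature gene, and the equal-size binning scheme are hypothetical choices made for illustration.

```python
import random


def module_score(cell_expr, avg_expr, signature, n_bins=5, n_ctrl=2, seed=0):
    """Mean signature expression minus mean expression of bin-matched control genes.

    cell_expr: gene -> expression in one cell
    avg_expr:  gene -> average expression across all cells (used for binning)
    """
    rng = random.Random(seed)
    signature = set(signature)
    genes = sorted(avg_expr, key=avg_expr.get)
    bin_size = max(1, len(genes) // n_bins)
    bin_of = {g: min(i // bin_size, n_bins - 1) for i, g in enumerate(genes)}

    controls = []
    for gene in sorted(signature):
        pool = [c for c in genes if bin_of[c] == bin_of[gene] and c not in signature]
        controls.extend(rng.sample(pool, min(n_ctrl, len(pool))))

    signature_mean = sum(cell_expr[g] for g in signature) / len(signature)
    if not controls:
        return signature_mean  # no comparable control genes available
    return signature_mean - sum(cell_expr[g] for g in controls) / len(controls)


# Toy data: ten genes with increasing average expression; two form the signature.
avg = {f"g{i}": float(i) for i in range(10)}
cell = {g: 1.0 for g in avg}
cell["g8"] = cell["g9"] = 3.0
print(module_score(cell, avg, ["g8", "g9"], n_bins=2))
```

Matching controls by expression bin means that a positive score reflects signature genes being expressed above what is typical for genes of similar overall abundance.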
Cell cycle phase prediction
Cell cycle phases were inferred using CellCycleScoring function of Seurat version 2 [32] with the following genes: S-genes: Atad2, Blm, Brip1, Casp8ap2, Ccne2, Cdc45, Cdc6, Cdca7, Chaf1b, Clspn, Dscc1, Dtl, E2f8, Exo1, Fen1, Gins2, Gmnn, Hells, Mcm2, Mcm4, Mcm5, Mcm6, Msh2, Nasp, Pcna, Pcna-ps2, Pola1, Pold3, Prim1, Rad51ap1, Rfc2, Rpa2, Rrm1, Rrm2, Slbp, Tipin, Tyms, Ubr7, Uhrf1, Ung, Usp1, Wdr76; G2M-genes: Anln, Anp32e, Aurka, Aurkb, Birc5, Bub1, Cbx5, Ccnb2, Cdc20, Cdc25c, Cdca2, Cdca3, Cdca8, Cdk1, Cenpa, Cenpe, Cenpf, Ckap2, Ckap2l, Ckap5, Cks1brt, Cks2, Ctcf, Dlgap5, Ect2, G2e3, Gas2l3, Gtse1, Hjurp, Hmgb2, Hmmr, Kif11, Kif20b, Kif23, Kif2c, Lbr, Mki67, Ncapd2, Ndc80, Nek2, Nuf2, Nusap1, Psrc1, Rangap1, Smc4, Tacc3, Tmpo, Top2a, Tpx2, Ttk, Tubb4b, Ube2c. Note cells not annotated as S or G2M phase are by default labelled as G1 phase.
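The phase call derived from the S and G2M scores can be summarised by a simple rule, sketched below. The handling of exactly tied scores ("Undecided") is an assumption about the scoring function's behaviour rather than something stated in this study, and the function name is hypothetical.

```python
def assign_phase(s_score, g2m_score):
    """Label a cell G1 when neither score is positive-leaning, else take the higher score."""
    if s_score < 0 and g2m_score < 0:
        return "G1"
    if s_score == g2m_score:
        return "Undecided"
    return "S" if s_score > g2m_score else "G2M"


# Hypothetical scores for three cells.
for s, g2m in [(-0.2, -0.1), (0.4, 0.1), (0.05, 0.6)]:
    print(s, g2m, assign_phase(s, g2m))
```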
Bulk RNA-seq deconvolution
BSEQ-sc was used for bulk expression deconvolution [33]. In the first series of experiments, marker genes for the deconvolution were calculated from scRNA-seq data, using only cold-dissociated samples to avoid the influence of the identified warm dissociation-related biases. We also excluded cells labelled as “Unknown” and “CD_Trans” from the calculation. For each of the remaining cell types, marker genes were calculated using Seurat function FindMarkers with the following thresholds: logfc.threshold=1.5, min.pct = 0.5, only.pos = T. Genes identified in more than two cell types were removed and the remaining genes were used for the deconvolution. The same set of genes was used to deconvolve bulk RNA-seq data from intact kidneys.
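The removal of marker genes shared by more than two cell types can be written compactly. The sketch below is illustrative; the function name and the example marker lists are hypothetical and do not come from the study's marker tables.

```python
from collections import Counter


def filter_shared_markers(markers_by_type, max_types=2):
    """Drop genes that appear as markers in more than max_types cell types."""
    n_types = Counter(g for genes in markers_by_type.values() for g in set(genes))
    return {
        cell_type: [g for g in genes if n_types[g] <= max_types]
        for cell_type, genes in markers_by_type.items()
    }


# Hypothetical marker lists: GeneX is shared by three cell types and is removed.
markers = {
    "TypeA": ["GeneX", "GeneA"],
    "TypeB": ["GeneX", "GeneB", "GeneA"],
    "TypeC": ["GeneX", "GeneC"],
}
print(filter_shared_markers(markers))
```

Genes shared by exactly two cell types, such as GeneA in this example, are retained under the threshold described above.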
Funding
This work was carried out with the support of a collaborative cancer research grant provided by the Cancer Research Trust “Enabling advanced single-cell cancer genomics in Western Australia” and an enabling grant from the Cancer Council of Western Australia. AF is supported by an Australian National Health and Medical Research Council Fellowship APP1154524. TL is supported by a Fellowship from the Feilman Foundation. RL was supported by a Sylvia and Charles Viertel Senior Medical Research Fellowship and Howard Hughes Medical Institute International Research Scholarship. RH is supported by an Australian Government Research Training Program (RTP) Scholarship. AF was also supported by funds raised by the MACA Ride to Conquer Cancer, and a Senior Cancer Research Fellowship from the Cancer Research Trust. Analysis was made possible with computational resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia.
Author Contributions
Conception and design: AF. Analysis and interpretation of data: ED with help from RH and TL. Writing, review and revision of the manuscript: ED and AF with input from all authors. DP and BS developed the SN_FANS_1×2000g protocol, OC developed the SN_sucrose protocol. BS and DP performed FANS. BG adapted the dissociation and SN_FANS_3×500g protocols. MJ, BG and LdK generated the single cell/nuclei and bulk libraries. MJ and LdK adapted the SN_sucrose protocol. MJ performed the sequencing for the libraries. Study supervision: AF.
Supplementary Tables
Supplementary Table 1. Genes differentially expressed between bulk RNA-seq profiles of cold- and warm-dissociated kidney single-cell suspensions.
Supplementary Table 2. Cell type labels assigned to cells and nuclei in this study.
Supplementary Table 3. Differentially expressed genes with higher expression in cell populations of warm-dissociated kidneys.
Supplementary Table 4. Differentially expressed genes with higher expression in cell populations of cold-dissociated kidneys.
Supplementary Table 5. Genes differentially expressed between cryopreserved and freshly profiled cold-dissociated kidney single-cell suspensions.
Supplementary Table 6. Genes differentially expressed between cryopreserved and freshly profiled warm-dissociated kidney single-cell suspensions.
Supplementary Table 7. Genes differentially expressed between methanol-fixed and freshly profiled cold-dissociated kidney single-cell suspensions.
Supplementary Table 8. Genes differentially expressed between methanol-fixed and freshly profiled warm-dissociated kidney single-cell suspensions.
Supplementary Table 9. Number of cells in each cell population across single-cell and single-nuclei experiments.
Supplementary Table 10. Genes differentially expressed between single-cell and single-nuclei libraries in male mice profiled with v2 10x chemistry.
Supplementary Table 11. Public reference dataset used for the first scMatch run.
Supplementary Table 12. Cell type gene signatures for refined annotation.
Supplementary Table 13. Extended reference dataset used for scMatch annotation.