Abstract
Single nucleus RNA-Seq (sNuc-Seq) profiles RNA from tissues that are preserved or cannot be dissociated, but does not provide the throughput required to analyse many cells from complex tissues. Here, we develop DroNc-Seq, massively parallel sNuc-Seq with droplet technology. We profile 29,543 nuclei from mouse and human archived brain samples to demonstrate sensitive, efficient and unbiased classification of cell types, paving the way for charting systematic cell atlases.
Single cell RNA-seq has become an instrumental approach to interrogate cell types, dynamic states and functional processes in complex tissues1,2. However, current protocols require the preparation of a single cell suspension from fresh tissue, a major roadblock in many cases, including clinical deployment, handling archived materials and application in tissues that cannot be readily dissociated. In particular, in the adult brain, harsh enzymatic dissociation harms the integrity of neurons and their RNA, biases data in favour of recovery of easily dissociated cell types, and can only be used on samples from young organisms, precluding, for example, those obtained from deceased patients with neurodegenerative disorders. To address this challenge, we3 and others4 developed single nucleus RNA-seq (e.g., sNuc-Seq3 and Div-Seq3) for analysis of RNA in single nuclei from fresh, frozen or lightly fixed tissues. sNuc-Seq can handle even minute samples of complex tissues that cannot be successfully dissociated, and provide access to archived or banked samples, such as fresh-frozen or lightly fixed samples. However, it relies on sorting nuclei by FACS into plates (96 or 384 wells), and thus cannot easily be scaled to profiling tens of thousands of nuclei (as needed for human brain tissue) or large numbers of samples (such as tumor biopsies from a patient). Conversely, massively parallel single cell RNA-seq methods, such as Drop-Seq5, InDrop6 and related commercial tools7,8, can be readily applied at this scale9 in a cost-effective manner10, but require a single cell suspension as input.
To overcome this limitation, we developed DroNc-Seq (Fig. 1a), a massively parallel single nucleus RNA-seq method that combines the advantages of sNuc-Seq with the scale of droplet microfluidics to profile thousands of nuclei at very low cost and high throughput. DroNc-Seq was modified from Drop-Seq5 to accommodate the smaller size and relatively lower amount of RNA in nuclei compared to cells. Specifically, we modified the microfluidics design (Supplementary Fig. S1A, B) and flow parameters to generate smaller co-encapsulation droplets (75 μm diameter); we optimized the nuclei isolation protocol to reduce processing time and increase capture efficiency (Supplementary Fig. S1C); and we changed the downstream PCR conditions (Methods). We validated single nucleus purity using species-mixing experiments, in which we combined nuclei from human 293 cells and mouse 3T3 cells in one DroNc-Seq run, as previously performed for cells5 (Supplementary Fig. S1D). Notably, the DroNc-Seq device and workflow are compatible with current Drop-Seq platforms.
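The logic of a species-mixing readout can be illustrated with a minimal sketch. It assumes that each barcode has already been summarized as a pair of transcript counts aligned to the human and mouse genomes. The function names, the 90% purity threshold and the toy counts are hypothetical choices for illustration and are not taken from the Methods.

```python
from collections import Counter

def classify_barcode(human_umis, mouse_umis, purity=0.9, min_umis=1):
    """Label one barcode as 'human', 'mouse', 'mixed' or 'empty'.

    purity is an assumed threshold: the minimum fraction of transcripts
    that must come from one species for the barcode to count as pure.
    """
    total = human_umis + mouse_umis
    if total < min_umis:
        return "empty"
    if human_umis / total >= purity:
        return "human"
    if mouse_umis / total >= purity:
        return "mouse"
    return "mixed"

def summarize_mixing(counts, purity=0.9, min_umis=1):
    """Count labels over all barcodes and report the mixed fraction.

    counts maps barcode -> (human_umis, mouse_umis).
    The mixed fraction is computed over non-empty barcodes only.
    """
    labels = Counter(
        classify_barcode(h, m, purity, min_umis) for h, m in counts.values()
    )
    called = labels["human"] + labels["mouse"] + labels["mixed"]
    mixed_fraction = labels["mixed"] / called if called else 0.0
    return labels, mixed_fraction

# Toy data (hypothetical barcodes and counts, for illustration only).
toy_counts = {
    "AAAC": (950, 12),   # mostly human transcripts
    "CCGT": (8, 1200),   # mostly mouse transcripts
    "GGTA": (400, 380),  # roughly equal: likely two nuclei in one droplet
    "TTAG": (0, 0),      # no transcripts detected
}
labels, mixed_fraction = summarize_mixing(toy_counts)
print(dict(labels), round(mixed_fraction, 3))
```

A low mixed fraction in such an analysis indicates that most barcodes correspond to a single nucleus; the threshold trades off tolerance to ambient RNA against sensitivity to true doublets.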
DroNc-Seq robustly generated high quality expression profiles from nuclei isolated from a mouse cell line (3T3, 4,442 nuclei), adult mouse brain tissue (9,219 nuclei), and adult human post-mortem frozen archived tissue (20,324 nuclei). It detected, on average, 3,152 genes (6,614 transcripts) for 3T3 nuclei, 1,500 genes (2,614 transcripts) for nuclei from adult mouse brain, and 1,000 genes (1,337 transcripts) for nuclei from human post-mortem brain tissue (Methods, Fig. 1b).
To assess DroNc-Seq’s throughput and sensitivity, we profiled the same 3T3 cell culture at both the single cell (with Drop-Seq) and single nucleus (with DroNc-Seq) levels, each sequenced to an average depth of 120,000 reads per nucleus or cell. Both methods yielded high quality libraries, detecting, on average, 4,770 and 3,152 genes for cells and nuclei, respectively (Fig. 1c). DroNc-Seq had somewhat reduced throughput, with 2,982 of 300,000 input nuclei (~1%) passing filter per run, compared to 5,175 of 100,000 input cells (~5%). The average expression profile of single nuclei was well correlated with the average profile of single cells (Pearson r=0.87, Fig. 1d), although this correlation was lower than that between the average profiles of two replicates of Drop-Seq (r=0.99) or of DroNc-Seq (r=0.99). Genes with significantly higher expression in nuclei (e.g., the lncRNAs Malat1 and Meg3) or in cells (e.g., the mitochondrial genes Mt-nd1, Mt-nd2, Mt-nd4, Mt-cytb) (Fig. 1d) were consistent with their known enrichment in nuclear vs. non-nuclear compartments (Supplementary Table 1). Interestingly, while in both methods over 85% of reads align to coding loci, in cells 80% of these reads map to exons, whereas in nuclei 56% map to exons and 32% to introns (Fig. 1e), reflecting the enrichment of nascent, unprocessed transcripts in the nuclear compartment3,11-14.
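The two quantities compared above, the fraction of input passing filter and the correlation between average profiles, can be sketched as follows. This is an illustrative sketch only: the log(1 + mean) transformation and the toy count matrices are assumptions, and the exact normalization used for Fig. 1d is described in the Methods rather than here.

```python
import math

def passing_fraction(passing, loaded):
    """Fraction of loaded nuclei or cells that pass filter in one run."""
    return passing / loaded

def mean_profile(matrix):
    """Average expression per gene over all profiles, as log(1 + mean).

    matrix is a list of per-profile count lists (profiles x genes).
    The log transform is an assumed choice, not taken from the Methods.
    """
    n_profiles = len(matrix)
    n_genes = len(matrix[0])
    return [
        math.log1p(sum(row[g] for row in matrix) / n_profiles)
        for g in range(n_genes)
    ]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Throughput figures reported in the text.
nuclei_rate = passing_fraction(2982, 300000)  # about 1%
cells_rate = passing_fraction(5175, 100000)   # about 5%

# Toy count matrices (hypothetical values, 2 profiles x 4 genes each).
toy_cells = [[5, 0, 2, 9], [3, 1, 4, 7]]
toy_nuclei = [[2, 3, 1, 6], [4, 5, 1, 8]]
r = pearson(mean_profile(toy_cells), mean_profile(toy_nuclei))
print(f"nuclei: {nuclei_rate:.2%}, cells: {cells_rate:.2%}, r = {r:.2f}")
```

Comparing averaged profiles rather than individual profiles reduces the influence of sparse sampling in any single nucleus or cell, which is why replicate-level correlations can approach 1 even when individual libraries are shallow.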
Clustering9 of 5,592 nuclei profiled from frozen adult mouse hippocampus (3 samples) and prefrontal cortex (3 samples) (each with >20,000 reads per nucleus, Methods) revealed groups of nuclei corresponding to known cell types (e.g., GABAergic neurons) and anatomical distinctions between the brain regions and within the hippocampus (e.g., CA1, CA3, dentate gyrus; Fig. 1f). Neurons of the same class but from different brain regions (and different samples) grouped together, as was the case for GABAergic neurons, glia and endothelial cells (Fig. 1f-g). Among the non-neuronal cells, different glial cell types, including astrocytes, oligodendrocytes and oligodendrocyte precursor cells (OPC), readily partitioned into separate clusters, despite their relatively low RNA levels and correspondingly lower numbers of detected genes (Fig. 1f). Finally, DroNc-Seq of mouse hippocampus compared well to sNuc-Seq of the same region15, detecting the same cell types and correlated cell-type-specific signatures (Fig. 1h, Supplementary Table 2) with increased throughput, despite a lower number of genes detected per nucleus in the massively parallel setting.
To demonstrate the utility of DroNc-Seq on archived human tissue, we profiled adult (40-65 years old) human post-mortem frozen brain tissue archived by the GTEx project16. We analysed 10,368 nuclei (each with >20,000 reads per nucleus) from five frozen post-mortem archived samples of adult human hippocampus and prefrontal cortex, revealing distinct nuclei clusters corresponding to the known cell types in these regions (Fig. 2a). We readily annotated each cell type cluster post-hoc by its unique expression of known canonical marker genes (Fig. 2b), including rare types, such as adult neuronal stem cells specifically found in the hippocampus (Fig. 2a, cluster 9). Although the human archived samples vary in the quality of the input material, DroNc-Seq yielded high-quality libraries of both neurons and glial cells from each sample (Fig. 2c, bottom), and each cluster was supported by multiple samples (Fig. 2c, top), demonstrating the robustness and utility of DroNc-Seq for clinical applications.
Finally, we determined cell-type-specific gene signatures for each human cell type cluster (Fig. 2d), as well as a pan-neuronal signature, a pan-glial signature, and signatures for neuronal stem cells and endothelial cells (Supplementary Table 3). Signatures are enriched for key relevant pathways (FDR<0.01, Methods). For example, neuronal stem cell signatures are enriched for the expression of genes regulated by NF-kB in response to TNF signalling17, and endothelial cell signatures are enriched for the expression of immune pathways, such as MHC I genes and interferon signalling (Fig. 2e), consistent with the known role of interferon signalling in modulation of the blood brain barrier18. Moreover, we captured finer distinctions between closely related cells (Fig. 2f and Supplementary Fig. 2), such as distinct sub-types of GABAergic neurons (Fig. 2f), each robustly identified across biological replicates (Supplementary Fig. 3a), and often from both brain regions (Fig. 2g). Two of the GABAergic neuron sub-clusters are specific to the hippocampus (Supplementary Fig. 3a, Fig. 2f, clusters 1 and 4); these too are supported by multiple samples (Supplementary Fig. 3a). We associated each GABAergic neuron sub-cluster with a distinct combination of canonical markers (Fig. 2h), as previously reported in the mouse brain3,19,20.
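For readers unfamiliar with pathway enrichment at a fixed FDR, one common way to compute it is a hypergeometric test per pathway followed by Benjamini-Hochberg correction. The sketch below illustrates that general approach only. It is an assumption that this resembles the procedure in the Methods, and the pathway names, gene counts and universe size are hypothetical.

```python
from math import comb

def hypergeom_pvalue(overlap, signature_size, pathway_size, universe_size):
    """P(X >= overlap) for a hypergeometric draw.

    Models drawing signature_size genes from a universe of universe_size
    genes, pathway_size of which belong to the pathway.
    """
    upper = min(signature_size, pathway_size)
    total = comb(universe_size, signature_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(universe_size - pathway_size, signature_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical example: a 100-gene signature in a 15,000-gene universe,
# tested against three made-up pathways given as (pathway size, overlap).
pathways = {
    "pathway_A": (200, 15),
    "pathway_B": (150, 2),
    "pathway_C": (300, 12),
}
names = list(pathways)
pvals = [hypergeom_pvalue(pathways[n][1], 100, pathways[n][0], 15000) for n in names]
fdr = benjamini_hochberg(pvals)
enriched = [n for n, q in zip(names, fdr) if q < 0.01]
print(enriched)
```

The gene universe should contain only genes that could have been detected in the experiment; using the whole genome as background tends to inflate enrichment for broadly expressed pathways.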
In conclusion, DroNc-Seq is a massively parallel single nucleus RNA-seq method, which is robust, cost-effective, and easy to use. Our results show that DroNc-Seq profiling from both mouse and human frozen archived brain tissues successfully identified cell types and sub-types, rare cells, expression signatures and activated pathways, opening the way to systematic single nucleus analysis of complex tissues that are either inherently challenging to dissociate or already archived. This will help create vital atlases of human tissues and clinical samples.
Acknowledgements
We thank Karthik Shekhar, Christoph Muus and Eugene Drokhlyansky for helpful discussions, Timothy Tickle and Asma Bankapur for technical support, and Leslie Gaffney for help with graphics. Work was supported by the Klarman Cell Observatory, NIMH grant U01MH105960 and NCI grant 1R33CA202820-1 (to A.R.). Microfluidic devices were fabricated at the Center for Nanoscale Systems (CNS), Harvard University, member of the National Nanotechnology Coordinated Infrastructure Network (NNCI), and supported by the National Science Foundation under NSF award no. 1541959. A.R. is an Investigator of the Howard Hughes Medical Institute. A.R. is a member of the Scientific Advisory Board for Thermo Fisher Scientific, Syros Pharmaceuticals and Driver Genomics. F.Z. is supported by the NIH through NIMH (5DP1-MH100706 and 1R01-MH110049); NSF; the New York Stem Cell Foundation; the Howard Hughes Medical Institute; the Simons, Paul G. Allen Family, and Vallee Foundations; the Skoltech-MIT Next Generation Program; James and Patricia Poitras; Robert Metcalfe; and David Cheng. F.Z. is a New York Stem Cell Foundation-Robertson Investigator. D.A.W. thanks NSF DMR-1420570, NSF DMR-1310266 and NIH P01HL120839 grants for their support. N.H. is a Howard Hughes Medical Institute fellow of the Helen Hay Whitney Foundation. N.H., A.B., I.A.D., D.A.W., F.Z. and A.R. are inventors on international patent application PCT/US16/59239 filed by Broad Institute, Harvard and MIT, relating to the method of this manuscript. GTEx is supported by the Common Fund of the Office of the Director of the United States National Institutes of Health, through Contract HHSN268201000029C (to K.A LDACC, Broad Institute).