Main

The neocortex coordinates most flexible and learned behaviours1,2. In mammalian evolution, the cortex underwent greater expansion in the number of cells, layers and functional areas compared to the rest of the brain, coinciding with the acquisition of increasingly sophisticated cognitive functions3. On the basis of cytoarchitectonic, neurochemical, connectional and functional studies, up to 180 distinct cortical areas have been identified in humans4 and dozens in rodents5,6. Cortical areas have laminar structure (layers (L) 1–6), and are often categorized as sensory, motor or associational, on the basis of their connections with other brain areas. Different cortical areas show qualitatively different activity patterns. Primary visual (VISp) and other sensory cortical areas process sensory information with millisecond timescale dynamics7,8,9. Frontal areas, such as the anterior lateral motor cortex (ALM) in mice, show slower dynamics related to short-term memory, deliberation, decision-making and planning10,11,12. Categorizing cortical neurons into types, and studying the roles of different types in the function of the circuit, is an essential step towards understanding how different cortical circuits produce distinct computations13,14.

Previous studies have characterized various neuronal properties to define numerous types of glutamatergic (excitatory) and GABAergic (inhibitory) neurons in the rodent cortex15,16,17,18,19,20. Reconciling the morphological, neurophysiological and molecular properties into a consensus view of cortical types remains a major challenge. We leveraged the scalability of single-cell RNA sequencing (scRNA-seq) to define cell types in two distant cortical areas. We analysed 14,249 cells from the VISp and 9,573 cells from the ALM to define 133 transcriptomic types and establish correspondence between glutamatergic neuron projection patterns and their transcriptomic identities. In the accompanying paper21, we show that transcriptomic L5 types with different subcortical projections have distinct roles in movement planning and execution.

Overall cell type taxonomy

Building on our previous study20, we established a standardized pipeline for scRNA-seq (Extended Data Figs. 1–4). Individual cells were isolated by fluorescence-activated cell sorting (FACS) or manual picking, cDNA was generated and amplified by the SMART-Seq v4 kit, and cDNA libraries were tagmented by Nextera XT and sequenced on the Illumina HiSeq2500 platform, resulting in the detection of approximately 9,500 genes per cell (median; Extended Data Fig. 4).

We report 23,822 single-cell transcriptomes with cluster-assigned identity, validated by quality control measures (Extended Data Fig. 2b). The cells were isolated from the VISp and ALM of adult mice (96.3% at postnatal day (P) 53–59, Supplementary Table 1) of both sexes, in the congenic C57BL/6J background (Extended Data Fig. 1a). We obtained 10,752 cells from layer-enriching dissections of ALM and VISp of pan-neuronal, pan-glutamatergic or pan-GABAergic recombinase driver lines crossed to recombinase reporters (referred to as the PAN collection; Extended Data Fig. 1, Supplementary Table 2). To sample non-neuronal cells, compensate for cell survival biases, and collect rare types, we supplemented the PAN collection with 10,414 cells isolated from a variety of recombinase driver lines and reporter-negative cells, with or without layer-enriching dissections (Extended Data Fig. 1b, h, i). To investigate the correspondence between transcriptomic types and neuronal projection properties, we analysed 2,656 retrogradely labelled cells (retro-seq dataset, Fig. 1a), resulting in 2,204 cells in the annotated retro-seq dataset (Extended Data Fig. 2c).

Fig. 1: Cell type taxonomy in ALM and VISp cortical areas.

a, Transgenically or retrogradely labelled cells and unlabelled cells were collected by layer-enriching or all-layer microdissections from the ALM or VISp. b, After dissociation, single cells were isolated by FACS or manual picking, mRNA was reverse transcribed (RT), amplified (cDNA amp.), tagmented and sequenced (next-generation sequencing, NGS). c, Clustering revealed 61 GABAergic, 56 glutamatergic, and 16 non-neuronal types organized in a taxonomy on the basis of median cluster expression for 4,020 differentially expressed genes, n = 23,822 cells and branch confidence scores > 0.4 (Extended Data Figs. 1–3). Cell classes and subclasses are labelled at branch points of the dendrogram. Bar plots represent fractions of cells dissected from the ALM and VISp, and from different layer-enriching dissections. Astro, astrocyte; CR, Cajal–Retzius cell; endo, endothelial cell; oligo, oligodendrocyte; OPC, oligodendrocyte precursor cell; peri, pericyte; PVM, perivascular macrophage; SMC, smooth muscle cell; VLMC, vascular leptomeningeal cell; IT, intratelencephalic; PT, pyramidal tract; NP, near-projecting; CT, corticothalamic. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).

We defined 133 clusters by combining iterative, bootstrapped dimensionality reduction with clustering (Extended Data Fig. 2b). After clustering, we evaluated cluster membership to assign core versus intermediate identity to each cell: core cells (21,195 cells) are reliably classified into the original cluster (in more than 90 out of 100 trials); others are labelled intermediate20 (2,627 cells; Extended Data Fig. 2b).
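The core versus intermediate rule can be illustrated with a short sketch. This is not the pipeline used in the study: the function name `classify_cells`, the input layout (one predicted cluster label per trial for each cell) and the toy cells are assumptions made for illustration. Only the threshold, more than 90 out of 100 trials, comes from the text.

```python
def classify_cells(trial_labels, original_cluster, min_hits=90):
    """Label each cell 'core' or 'intermediate'.

    trial_labels:     dict mapping cell id -> list of cluster labels,
                      one per classification trial (100 trials assumed).
    original_cluster: dict mapping cell id -> cluster from the clustering step.
    min_hits:         a cell is 'core' only if it is returned to its original
                      cluster in MORE than this many trials.
    """
    identity = {}
    for cell, labels in trial_labels.items():
        hits = sum(1 for label in labels if label == original_cluster[cell])
        identity[cell] = "core" if hits > min_hits else "intermediate"
    return identity


# Toy data: hypothetical cells, 100 trials each.
trials = {
    "cell_a": ["Sst"] * 95 + ["Pvalb"] * 5,   # 95 hits: core
    "cell_b": ["Sst"] * 90 + ["Pvalb"] * 10,  # exactly 90 is not "more than 90"
    "cell_c": ["Vip"] * 60 + ["Sncg"] * 40,   # 60 hits: intermediate
}
original = {"cell_a": "Sst", "cell_b": "Sst", "cell_c": "Vip"}
identity = classify_cells(trials, original)
```

In this toy example only `cell_a` is core. The alternative labels of intermediate cells (for example, `Pvalb` for `cell_b`) are the kind of information that could be used to link clusters in a constellation diagram.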

By assigning identity to each cluster based on previously reported and newly discovered differentially expressed genes (Extended Data Fig. 5), we identified 56 glutamatergic, 61 GABAergic and 16 non-neuronal types (Fig. 1). These types correspond well to the 49 types from our previous study20, with better resolution provided in the current dataset (Extended Data Fig. 6). Sub-sampling analysis shows that for most clusters, we sampled many more cells than needed to define them (Extended Data Fig. 7). The use of many transgenic lines enabled focused access to select rare types, and allowed us to define cell types labelled by each line (Extended Data Fig. 8).

A clear hierarchy of transcriptomic cell types and their relationships emerged (Fig. 1). Consistent with previous reports19,20, the biggest differences are observed between non-neuronal (n = 1,383) and neuronal (n = 22,439) cells. We refer to major branches as classes (for example, glutamatergic class), and related groups of types as subclasses (for example, L6b subclass) (Fig. 1c). We do not assign subclass or class to isolated branches (for example, CR–Lhx5 cells). We detect all previously defined non-neuronal classes in the cortex (Extended Data Fig. 9).
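The cell counts reported so far partition the same 23,822 cells in several ways, and a short check confirms that each partition adds up. The numbers below are taken from the text; the dictionary layout and variable names are illustrative only. Note that the collection-based sum balances when all 2,656 retrogradely labelled cells are counted, which is an inference from the arithmetic rather than a statement in the text.

```python
TOTAL_CELLS = 23_822

# Each entry lists the parts of one way of splitting the dataset.
partitions = {
    "area (VISp + ALM)": (14_249, 9_573),
    "collection (PAN + supplementary + retro-seq)": (10_752, 10_414, 2_656),
    "membership (core + intermediate)": (21_195, 2_627),
    "class (neuronal + non-neuronal)": (22_439, 1_383),
}

# Collect any partition whose parts do not sum to the reported total.
mismatches = {
    name: sum(parts)
    for name, parts in partitions.items()
    if sum(parts) != TOTAL_CELLS
}

# Transcriptomic types by class.
type_counts = {"glutamatergic": 56, "GABAergic": 61, "non-neuronal": 16}
total_types = sum(type_counts.values())
```

Here `mismatches` is empty and `total_types` equals the 133 types reported.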

Most neurons fall into two major branches corresponding to glutamatergic and GABAergic classes (Fig. 1). There are two exceptions: CR–Lhx5 and Meis2–Adamts19, two distant branches preceding the major glutamatergic and GABAergic split. On the basis of marker expression and cell source, Meis2–Adamts19 corresponds to the Meis2-expressing GABAergic neuronal type largely confined to white matter that originates from the embryonic pallial–subpallial boundary22. Among GABAergic types, this is the only type that reliably expresses mRNA for the transcription factor Meis2, and it transcribes the smallest number of genes (median = 4,965, Extended Data Fig. 4b). CR–Lhx5 corresponds to Cajal–Retzius (CR) cells based on their location in L1 and expression of known Cajal–Retzius markers, such as Trp73, Lhx5 and Reln23,24 (Extended Data Fig. 5). Almost all GABAergic types contain cells from both ALM and VISp (Figs. 1c, 2a) with the exception of Sst–Tac1–Tacr3 and Pvalb–Reln–Itm2a types, which are VISp-specific. By contrast, the glutamatergic types are mostly segregated by area (Figs. 1c, 2a), with the exception of five shared types: one L6 CT type, three L6b types and the CR–Lhx5 type.

Fig. 2: Comparison of gene expression differences among types across cortical areas.

a, Two-dimensional t-distributed stochastic neighbour embedding (t-SNE) plots based on 4,020 differentially expressed genes for n = 23,822 cells, coloured by region, class and cluster. Most glutamatergic types are ALM- or VISp-specific. Most GABAergic types contain cells from both regions (salt-and-pepper clusters, left t-SNE). b, Number of differentially expressed (DE) genes (x axis) and mean difference in gene expression (y axis) for all 8,778 pairs of clusters. Left, comparisons between ALM and VISp portions of each GABAergic cluster (pink) and best-matched glutamatergic ALM and VISp clusters (blue). For comparison, centre and right panels show differences between: types within a subclass, types from different subclasses, non-neuronal types, types from different neuronal classes (GABA versus glutamate), and neuronal and non-neuronal types. Grey points represent all pairwise type comparisons; pink points are only in the left panel. c, Number of differentially expressed genes between best-matched ALM- and VISp-specific cell types (Extended Data Fig. 10c) or ALM and VISp portions for shared types. Cell types on the x axis are coloured as in Fig. 1; black horizontal line separates matched ALM and VISp types, but not the shared types. Black and grey bars denote the numbers of ALM- and VISp-enriched genes, respectively. d, ALM- or VISp-specific genes based on the proportion of cells in each region that express each gene, calculated separately for glutamatergic and GABAergic cells.

We performed differential gene expression tests between the best-matched ALM- and VISp-specific types (mostly glutamatergic; Extended Data Fig. 10c) and between ALM- and VISp-portions of shared types (mostly GABAergic and non-neuronal) (Fig. 2b). We find that the best-matched glutamatergic types have a median of 78 differentially expressed genes and average eightfold difference in expression (Fig. 2b, Supplementary Table 3). We find more ALM-enriched genes (Fig. 2c, d). We confirm the area-specific expression of several genes by RNA in situ hybridization (ISH) from the Allen Brain Atlas25 (Extended Data Fig. 10d, e). By contrast, the GABAergic neurons from the two areas belonging to the same cluster have a median of 2 (and at most 19) differentially expressed genes, with an average 5.2-fold difference in expression (Fig. 2b, left).

Glutamatergic taxonomy by scRNA-seq and projections

Most cortical glutamatergic neurons project outside of their resident area, and genetic markers have been correlated with projection properties15,26,27. To inform our transcriptomic taxonomy with neuronal projection properties, we analysed the transcriptomes of 2,204 cells labelled by retrograde injections (retro-seq dataset; Fig. 3a, Extended Data Fig. 2c). Projection targets (Fig. 3b, Extended Data Fig. 10) were selected based on the Allen Mouse Brain Connectivity Atlas28 and other anatomical data29. Retro-seq cells were processed through the same pipeline as all other cells, and were clustered together with them.

Fig. 3: Glutamatergic cell types by scRNA-seq and projections.

a, Retro-seq: after virus injections and brain sectioning, injection sites were imaged to determine injection specificity. Tissue was microdissected from the collection site (ALM or VISp) and processed as shown in Fig. 1b. b, Injection targets grouped into broad regions: cortex (CTX), striatum (STR), thalamus (TH), tectum (TEC), pons (P) or medulla (MY). c, Dendrogram of glutamatergic cell types in ALM followed by numbers of cells (represented by disc area) originating from retrograde labelling from regions on top. Shaded regions denote cells labelled unintentionally, directly or retrogradely through the needle (injection) tract. d, As in c, but for VISp. Only glutamatergic cells from the annotated retro-seq dataset were included: n = 1,138 out of 1,152 annotated cells in c, and 1,049 out of 1,052 annotated cells in d. See Extended Data Fig. 10a, b for further details. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).

We assigned identities to glutamatergic neuron types based on their projection patterns (Fig. 3c, d), dominant layer-of-dissection (Figs. 1c, 4b), and expression of marker genes (Fig. 4c, Extended Data Fig. 5). We represent the relationships between types by a constellation diagram and a dendrogram (Fig. 4a, b). VISp and ALM contain common subclasses of projection neurons (Fig. 3c, d): intratelencephalic (IT), pyramidal tract (PT), near-projecting (NP) and corticothalamic (CT). We validated the preferential residence layer for neuronal cell bodies of select types by RNA fluorescent in situ hybridization (FISH) and neuronal projections by anterograde labelling (Extended Data Fig. 11).

Fig. 4: Glutamatergic cell types and markers.

a, Constellation diagram of ALM and VISp types. Disc areas represent core cell numbers for each cluster (n = 10,729), edge weights represent intermediate cell numbers (n = 1,136). L6–CT–Nxph2–Sla, L6b–Col8a1–Rprm, L6b–Hsd17b2 and L6b–P2ry12 are found in both areas. Cajal–Retzius type was omitted. b, Dendrograms correspond to glutamatergic portion of Fig. 1c. Layer distribution for each type was inferred from layer-enriching dissections (n = 8,477 out of 11,871 cells in glutamatergic clusters): each dot represents a cell positioned at random within each layer. Distributions are approximate owing to sampling strategy (Methods). c, Marker gene expression distributions within each cluster are represented by violin plots. Rows are genes, black dots are medians. Values within each row are normalized between 0 and maximum detected, displayed on a log10 scale (n = 11,827 cells).

Projection properties dominate the dendrogram structure. The IT types constitute the largest branch in both the VISp and ALM glutamatergic taxonomies (Figs. 1c, 3c, d), and span most layers. IT constellations include many intermediate cells, which connect types within a layer, between equivalent layers (for example, L2–L3 in ALM and VISp) or from neighbouring layers (Fig. 4a). We define many new markers (Fig. 4c), including a new pan-IT-type marker (Slc30a3) and a new L6-IT-type marker (Osr1). We also define a distinct IT type, L6–IT–VISp–Car3, which expresses a unique combination of markers including Car3, Oprk1 and Nr2f2 (Fig. 4). Some of these genes have been previously detected in the claustrum30, and are detected in VISp L6 in the Allen Brain Atlas25. Anterograde labelling confirms these findings and refines our knowledge of cortico–cortical projections (Extended Data Fig. 11). For example, IT types preferentially target different laminae in the same target areas: upper layers for L2–L3 and L5 IT types, and lower ones for L6 IT types (Extended Data Fig. 11f–h).

Pyramidal tract neurons, the descending output neurons in L5, share a separate branch in the taxonomy (Fig. 1c). They project to subcortical targets (Fig. 3c, d) and express the previously known marker Bcl626 and a new pan-pyramidal tract neuronal marker Fam84b (Fig. 4b, c). The three pyramidal tract transcriptomic types in the ALM correspond to two projection classes21: two project to the thalamus, whereas the third projects to the medulla (Extended Data Fig. 10a). The thalamus- and medulla-projecting ALM pyramidal tract neurons have distinct functions in planning and executing voluntary movements, respectively21. Similarly, it seems that pyramidal tract types from the VISp display differential subcortical projections (Extended Data Fig. 10b).

Corticothalamic (CT) L6 types (Fig. 3c, d) share the transcription factor marker Foxp2 (Fig. 4b, c), and may have cell-type-specific preferences for different thalamic nuclei (Extended Data Fig. 10b).

L6b types share many markers, such as Cplx3, Ctgf and Nxph425,31,32, but display differential projections to the thalamus or anterior cingulate (Fig. 3d). The thalamus-projecting L6b–Col8a1–Rprm type is related to the L6–CT–VISp–Krt80–Sla type (Fig. 4a), and expresses shared markers (for example, Rprm and Crym; Fig. 4c). This relationship is captured in the constellation diagram (Fig. 4a), but not in the dendrogram (Fig. 4b). Three other L6b types in the VISp project to the anterior cingulate area (Extended Data Fig. 10b). For the remaining L6b types, we observed no long-distance projections. As recently reported33, anterograde tracing in Ctgf-2A-dgcre knock-in mice (see Methods) confirms sparse long-range projections from the anterior VISp to the anterior cingulate area. In addition, it shows that L6b neurons in the VISp and ALM project to L1 within resident and neighbouring cortical areas (Extended Data Fig. 11j).

We define four related types in L5–L6 that express distinct markers including Slc17a8, Trhr, Tshz2, Sla2 and Rapgef3 (Fig. 4c). On the basis of the retro-seq dataset, they do not project to any of the assayed areas (Fig. 3c, d). Anterograde tracing of neurons labelled by a new Cre line, Slc17a8-IRES2-cre, reveals only sparse projections to neighbouring areas (Extended Data Fig. 11k), earning this subclass the name ‘near projecting’. Some of these cells probably correspond to previously reported Slc17a8+ L5 cells26, as well as cells labelled by Efr3a-cre_NO10834.

GABAergic cell type taxonomy by scRNA-seq

We define six subclasses of GABAergic cells: Sst, Pvalb, Vip, Lamp5, Sncg and Serpinf1, and two distinct types: Sst–Chodl and Meis2–Adamts19 (Fig. 1c). We represent the taxonomy by constellation diagrams, dendrograms, layer-of-isolation, and the expression of select marker genes (Fig. 5a–f). The major division among GABAergic types largely corresponds to their developmental origin in the medial ganglionic eminence (Pvalb and Sst subclasses) or caudal ganglionic eminence (Lamp5, Sncg, Serpinf1 and Vip subclasses).

Fig. 5: GABAergic cell types by scRNA-seq.

a, b, Constellation diagrams for Sst and Pvalb (a) and Lamp5, Serpinf1, Sncg and Vip (b) types, as in Fig. 4a (n = 9,021 core cells; n = 1,457 intermediate cells). Edges connecting subclasses are pink. Meis2 type was omitted. c, d, Dendrograms are portions of Fig. 1c focused on the main GABAergic branch. Below the dendrograms, layer distribution for each type was inferred as in Fig. 4b; only cells from single-layer dissections were used: n = 4,675 out of 5,365 cells in c, and 3,908 out of 5,113 cells in d. Distributions are approximate owing to the sampling strategy (Methods). e, f, Marker gene expression distributions within each cluster are represented by violin plots as in Fig. 4c. n = 5,365 cells in e and 5,113 cells in f.

The Sst and Pvalb subclasses within the Sst and Pvalb constellation are connected by select upper and lower layer types (Fig. 5a, pink lines). The Lamp5, Vip, Serpinf1 and Sncg subclasses are represented by four interconnected neighbourhoods in the constellation diagram (Fig. 5b). These complicated landscapes are the result of many genes expressed in a combinatorial and graded fashion (Extended Data Fig. 5), resulting in high co-clustering frequencies (Extended Data Fig. 3a) and many intermediate cells (Fig. 5a, b).

Our GABAergic transcriptomic taxonomy agrees with previously reported interneuron types based on marker gene expression, transgenic lines, published Patch-seq (patch-pipette-extracted single-cell RNA sequencing) and other scRNA-seq data (Supplementary Table 4, Extended Data Figs. 8, 12). Sst–Chodl corresponds to Nos1+ long-range projecting interneurons based on marker expression, location, Cre-line labelling, and other RNA-seq data20,35,36 (Supplementary Table 4, Extended Data Figs. 8, 12). Sst–Calb2–Pdlim5 corresponds to Sst+ and Calb2+ L2/3 Martinotti cells16,35,36 (Fig. 5e, Extended Data Fig. 12a), whereas some of the deep-layer Sst types (for example, Sst–Chrna2–Glra3) express Chrna2, a gene detected in L5 Martinotti cells37.

For the Pvalb subclass, we confirm that the Pvalb–Vipr2 type (Pvalb–Cpne5 in our previous study20), corresponds to chandelier cells by mapping of the recently reported chandelier cell (CHC1) RNA-seq data36 to our Pvalb–Vipr2 type (Extended Data Fig. 12a). We used the new genetic marker Vipr2 to develop Vipr2-IRES2-cre to access chandelier cells (Extended Data Figs. 8, 13a–f). Several other Pvalb types (Pvalb–Gpr149–Islr, Pvalb–Tpbg and Pvalb–Reln–Tac1) correspond to basket cells36 (Extended Data Fig. 12a, b).

Within the Lamp5, Vip, Sncg and Serpinf1 subclasses, we find evidence for neurogliaform, bipolar, single bouquet and cholecystokinin (CCK) basket cell types (Supplementary Table 1). The Sncg subclass corresponds to the Vip+ and Cck+ multipolar or basket cells and is distinct from cells of the Vip subclass that are also Calb2+ and have bipolar morphologies16,35,36 (Fig. 5f, Extended Data Fig. 12a). We previously assigned neurogliaform cell identity to Ndnf types20, which correspond to several current Lamp5 types (Extended Data Fig. 6). We confirm this finding by mapping published Patch-seq data38 to our data (Extended Data Fig. 12d–f) and find correspondence of neurogliaform cells to Lamp5–Plch2–Dock5 and Lamp5–Lsp1 types. In addition, we find that single bouquet cells map mostly to Lamp5–Fam19a1–Tmem182, and find a possible transitional single bouquet–neurogliaform cell type, Lamp5–Ntn1–Npy2r (Extended Data Fig. 12d).

The Lamp5–Lhx6 type is unusual because it clusters with other Lamp5 types, which are derived from the caudal ganglionic eminence, but expresses Nkx2.1 (also known as Nkx2-1) and Lhx6, which encode transcription factors of the medial ganglionic eminence. This type is labelled by tamoxifen induction at embryonic day (E) 18 of Nkx2.1-creERT2 mice (Extended Data Fig. 8) and was isolated previously36 from the same Cre line (Extended Data Fig. 12a–c). We find that the RNA-seq data of chandelier type 2 cells (CHC2)36 map primarily to our Lamp5–Lhx6 type (Extended Data Fig. 12a, b), which is transcriptomically most related to Lamp5 neurogliaform types.

Continuous variation and cell states

Cell classes are easily identified because they are driven by large differences in gene expression (Fig. 2b) and agree well with previous literature19,20. Gene expression differences between subclasses and types are smaller and sometimes graded (Fig. 2b), making interpretation more complicated. Constellation diagrams capture differences in gene expression among types as a combination of continuity and discreteness. However, they do not capture heterogeneity within types, which may be substantial. To illustrate this, we focus on the L4–IT–VISp–Rspo1 type, which consists of 1,404 cells and displays heterogeneity along the first principal component (Extended Data Fig. 14a–c). The extent of the heterogeneity between the ends of this type is similar to heterogeneity between this type and a neighbouring type (L4–IT–VISp–Rspo1 and L5–IT–VISp–Hsd11b1–Endou, Extended Data Fig. 14d, e). However, in this dataset, we were unable to split this cluster into subclusters using our clustering criteria. This cluster maps to three clusters connected by many intermediate cells in our previous study20 (Extended Data Fig. 14b). Therefore, the description of L4 cell heterogeneity changed from discrete with many intermediate cells20 to continuous, possibly owing to more extensive cell sampling and better gene detection. To demonstrate how clustering criteria affect the taxonomy, we performed clustering for Sst types at different stringencies. As expected, less stringent statistical criteria yield more types, and vice versa (Extended Data Fig. 14f).
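The dependence of the number of types on clustering stringency can be illustrated with a toy example. This sketch is not the clustering procedure used in the study: it assumes a simple rule in which any two clusters separated by fewer than `min_de_genes` differentially expressed genes are merged (single-linkage style), and the cluster names `A` to `D` and their pairwise counts are invented for illustration.

```python
def merge_clusters(clusters, de_counts, min_de_genes):
    """Merge any two clusters separated by fewer than min_de_genes DE genes.

    clusters:  list of cluster names.
    de_counts: dict mapping frozenset({a, b}) -> number of DE genes.
    Returns a list of merged groups (sets of original cluster names).
    """
    parent = {c: c for c in clusters}

    def find(c):
        # Follow parent links to the group representative.
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path compression
            c = parent[c]
        return c

    for pair, n_de in de_counts.items():
        a, b = tuple(pair)
        if n_de < min_de_genes:
            parent[find(a)] = find(b)

    groups = {}
    for c in clusters:
        groups.setdefault(find(c), set()).add(c)
    return list(groups.values())


# Hypothetical clusters and pairwise DE gene counts.
clusters = ["A", "B", "C", "D"]
de_counts = {
    frozenset({"A", "B"}): 5,
    frozenset({"A", "C"}): 40,
    frozenset({"A", "D"}): 120,
    frozenset({"B", "C"}): 35,
    frozenset({"B", "D"}): 110,
    frozenset({"C", "D"}): 90,
}

# Number of types that survive at increasingly stringent thresholds.
n_types = {t: len(merge_clusters(clusters, de_counts, t)) for t in (1, 10, 50, 150)}
```

With these invented counts, the four clusters collapse to three, two and finally one as the threshold rises, mirroring the qualitative trend described for the Sst types.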

Transcriptomic profiles are also influenced by cell states, which can be defined as reversibly accessible locations a cell can occupy within a multidimensional gene expression space39. To determine whether we can detect activity-dependent changes that may be indicative of states in our cell types, we mapped our cells to VISp transcriptomic clusters from dark-reared animals, some of which were exposed to light before euthanasia40 (Extended Data Fig. 15). We find several glutamatergic and GABAergic types that display statistically significant enrichment or depletion of early- and/or late-response genes, showing that some of our types probably represent cell states. Therefore, our clustering criteria are appropriate to capture at least some cell states, whereas more stringent criteria may overlook them (Extended Data Fig. 14f; the Sst–Tac1–Tacr3 cluster merges with Sst–Tac1–Htr1d).

Discussion

We used single-cell transcriptomics to uncover the principles of cell type diversity in two functionally distinct areas of neocortex. We define 133 transcriptomic types, 101 types in the ALM and 111 in the VISp, 79 of which are shared between these areas. Most glutamatergic types are area-specific. By contrast, and as previously suggested19, non-neuronal and most GABAergic neuronal types are shared across cortical areas. Although we detect area-specific differences in gene expression within GABAergic types (Fig. 2, Extended Data Fig. 16), they are usually insufficient to define subtypes with our statistical criteria.
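The per-area type counts are related to the total by inclusion and exclusion, and the total also fixes the number of pairwise cluster comparisons shown in Fig. 2b. The short check below uses only numbers from the text; the variable names are illustrative.

```python
from math import comb

alm_types, visp_types, shared_types = 101, 111, 79

# Types found in either area: add both areas, subtract those counted twice.
total_types = alm_types + visp_types - shared_types

# Types found in one area only.
alm_only = alm_types - shared_types
visp_only = visp_types - shared_types

# Unordered pairs of clusters available for pairwise comparison.
cluster_pairs = comb(total_types, 2)
```

This gives 133 types in total (22 found only in the ALM, 32 only in the VISp and 79 in both) and 8,778 cluster pairs.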

This dichotomy correlates with neuronal connectivity patterns and developmental origins. Most glutamatergic types in VISp or ALM project to different cortical and subcortical targets (Fig. 3, Extended Data Fig. 10), whereas nearly all GABAergic interneurons form local connections. Most glutamatergic neurons are born locally within the ventricular–subventricular zone of the developing cortex41, which is pre-patterned with developmental gradients (an embryonic protomap42,43) and further segregated into areas through differential thalamic input in development44,45. By contrast, types that are shared across areas are derived from extracortical sources, and migrate into the developing cortex: most GABAergic interneurons are from the medial ganglionic eminence or caudal ganglionic eminence16; Meis2 interneurons are from the pallial–subpallial boundary22; and Cajal–Retzius cells of the hippocampus and cortex are from the cortical hem46. It remains to be investigated whether some of the shared L6b types may originate from the rostro-medial telencephalic wall, a known source for a subset of subplate neurons that are distinct from those generated within the local ventricular–subventricular zone47, or whether further sampling may segregate them into area-specific types. Although our taxonomy mostly agrees with the developmental origins of the cells, there are exceptions. For example, tamoxifen induction of Nkx2.1-creERT2 mice at E18 labels not only chandelier cells, but also a suggested second chandelier type, CHC236. Our taxonomy suggests that CHC2 may be a neurogliaform type (Lamp5–Lhx6) that arises from the medial ganglionic eminence, and that neurogliaform types could arise through different developmental pathways and embryonic sources in an example of developmental convergence.

We observe both discrete and continuous gene expression variation among and within types. To accommodate both kinds of variation, we used post-clustering classifiers to construct constellation diagrams, and were able to capture cell states. Alternative analyses of these landscapes lead to more cluster splits (more discreteness) or merges (more continuous variation) (Extended Data Fig. 14f). The detected and described (versus actual) discreteness in the definition of cell types depends on gene detection, cell sampling, and noise estimates or statistical criteria39 (Extended Data Fig. 14b, f). Future experimental datasets would benefit from multimodal data acquisition, more efficient mRNA detection, and sampling cells according to their abundance in situ48 and in different states40. Our dataset provides a foundation for understanding the diversity of cortical cell types and dissecting circuit function. As an example, in the accompanying paper21, we show that ALM L5 pyramidal tract neurons map to transcriptomic clusters with distinct projection patterns that have different roles in the preparation and execution of movement.

Methods

Mouse breeding and husbandry

All procedures were carried out in accordance with Institutional Animal Care and Use Committee protocols 1508, 1510 and 1511 at the Allen Institute for Brain Science and Janelia Research Campus. Animals were provided food and water ad libitum and were maintained on a regular 12-h day/night cycle at no more than five adult animals per cage. Animals were maintained on the C57BL/6J background, and newly received or generated transgenic lines were backcrossed to C57BL/6J. Experimental animals were heterozygous for the recombinase transgenes and the reporter transgenes. Transgenic lines used in this study are summarized in Supplementary Table 5. Standard tamoxifen treatment for CreER lines included a single dose of tamoxifen (40 μl of 50 mg ml−1) dissolved in corn oil and administered via oral gavage at P10–14. Tamoxifen treatment for Nkx2.1-creERT2;Ai14 was performed at E17 (oral gavage of the dam at 1 mg per 10 g of body weight), pups were delivered by caesarean section at E19 and then fostered. Cux2-creERT2;Ai14 mice received tamoxifen treatment daily, for five consecutive days, between P30 and P40. Trimethoprim was administered to animals containing Ctgf-2A-dgcre by oral gavage daily, for three consecutive days, between P35 and P45 (0.015 ml per g of body weight using 20 mg ml−1 trimethoprim solution). Ndnf-IRES2-dgcre animals did not receive trimethoprim induction, because the baseline dgCre activity (without trimethoprim) was sufficient to label the cells with the Ai14 reporter20. The transgenic component dgcre encodes a destabilized Cre protein: it contains a destabilizing domain ‘d’, which is stabilized by trimethoprim, and a non-fluorescent portion of eGFP ‘g’. We excluded any animals with anophthalmia or microphthalmia. We used 352 animals to collect the set of 24,411 cells for clustering (Supplementary Table 1). Animals were euthanized at P53–P59 (n = 339), P51 (n = 1), and P63–P91 (n = 12). No statistical methods were used to predetermine sample size.
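The dosing figures above reduce to simple arithmetic, sketched below. Volumes are handled in microlitres and masses in micrograms so that all values stay integers (1 mg ml−1 equals 1 µg µl−1). The function names and the 25 g body weight are assumptions chosen for illustration; the concentrations, volumes and animal counts come from the text.

```python
def dose_ug(volume_ul, conc_ug_per_ul):
    """Mass of compound (in micrograms) delivered in a given volume."""
    return volume_ul * conc_ug_per_ul


# Standard tamoxifen dose: 40 ul of a 50 mg/ml (= 50 ug/ul) solution.
tamoxifen_ug = dose_ug(40, 50)


def trimethoprim_gavage(body_weight_g, ul_per_g=15, conc_ug_per_ul=20):
    """Daily trimethoprim gavage: 0.015 ml (15 ul) per g of a 20 mg/ml solution."""
    volume_ul = body_weight_g * ul_per_g
    return volume_ul, dose_ug(volume_ul, conc_ug_per_ul)


# Hypothetical 25 g mouse.
volume_ul, trimethoprim_ug = trimethoprim_gavage(25)

# Age at euthanasia, as reported for the animals used for clustering.
ages = {"P53-P59": 339, "P51": 1, "P63-P91": 12}
n_animals = sum(ages.values())
pct_main_window = round(100 * ages["P53-P59"] / n_animals, 1)
```

The standard tamoxifen dose works out to 2 mg, and the trimethoprim regime to 0.3 mg per g of body weight per day (375 µl, or 7.5 mg, for the hypothetical 25 g animal). The age counts sum to the 352 animals reported, and the P53–P59 fraction matches the 96.3% quoted in the main text.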

Generation of transgenic mice (Penk-IRES2-cre-neo, Slc17a8-IRES2-cre and Vipr2-IRES2-cre)

Vectors containing gene-specific homology arms and IRES2-cre-bGHpoly(A)-PGK-gb2-neo-PGKpoly(A) components were generated using gene synthesis (GenScript) and standard molecular cloning techniques. Targeting of the transgene cassette into the endogenous gene locus immediately downstream of the stop codon was accomplished by CRISPR–Cas9-mediated genome editing using a circularized targeting vector in combination with a gene-specific guide vector (Addgene, plasmid 42230)49. The 129S6/B6 F1 embryonic stem (ES) cell line, G450, was used to generate all modified ES cells. Correctly targeted clones were identified using standard screening approaches (PCR, qPCR and Southern blots) and injected into blastocysts to obtain chimaeras and subsequent germline transmission. Resulting mice were crossed to the Rosa26-PhiC31o mice (JAX, 007743)51 to delete the PGK-neo selection cassette, and then backcrossed to C57BL/6J mice and maintained in the C57BL/6J background. The PGK-neo cassette could not be removed from Penk-IRES2-cre-neo by the PhiC31o integrase-mediated recombination.

Retrograde labelling

We injected rAAV2-retro-EF1a-Cre52, RV∆GL-Cre53, or CAV2-Cre (gift from M. Chillon Rodrigues)54 into brains of heterozygous or homozygous Ai14 mice as previously described20. For ALM experiments, we also injected rAAV2-retro-CAG-GFP or rAAV2-retro-CAG-tdTomato52 into wild-type mice. Stereotaxic coordinates were obtained from the Paxinos adult mouse brain atlas55 (Supplementary Table 6). For two VISp experiments, we injected into the superior colliculus sensory-related area by inserting the needle through the cerebellum at a 45° angle in the posterior to anterior direction. TdT+ or GFP+ single cells were isolated from VISp or ALM, depending on the injection area. Detailed information on the viruses used is available in Supplementary Table 7.

Anterograde labelling

For anterograde projection mapping, we injected AAV2/1-pCAG-FLEX-eGFP-WPRE-pA28 into VISp or ALM of 8–12-week-old mice. The stereotaxic injection procedure was the same as for retrograde labelling above. In Ctgf-2A-dgcre mice, one week after AAV injection, trimethoprim induction was conducted for 3 consecutive days as described previously20. Mice were euthanized and brains perfused 21 days (or 28 days in the case of Ctgf-2A-dgcre) after AAV injection, and brains were imaged using the TissueCyte 1000 system as described previously28. Experiments can be viewed interactively on the Allen Institute data portal at http://connectivity.brain-map.org/.

Single-cell isolation

We isolated single cells as previously described20,56,57 with the modifications described below. We usually used layer-enriching dissections, with focus on a single layer. Broader dissections (no layer enrichment or multiple layers combined) were used for lines that label small numbers of cells, to facilitate isolation of a sufficient number of cells. We updated our artificial cerebrospinal fluid (ACSF) formulation compared to our previous study20 to include N-methyl-d-glucamine (NMDG) to improve neuronal survival58. Our ACSF consisted of CaCl2 (0.5 mM), glucose (25 mM), HCl (96 mM), HEPES (20 mM), MgSO4 (10 mM), NaH2PO4 (1.25 mM), myo-inositol (3 mM), N-acetylcysteine (12 mM), NMDG (96 mM), KCl (2.5 mM), NaHCO3 (25 mM), sodium l-ascorbate (5 mM), sodium pyruvate (3 mM), taurine (0.01 mM), thiourea (2 mM), and was bubbled with carbogen gas (95% O2 and 5% CO2). For samples collected after 16 December 2016, the ACSF formulation also included trehalose (13.2 mM). Mice were anaesthetized with isoflurane and perfused with cold carbogen-bubbled ACSF. The brain was dissected, submerged in ACSF, embedded in 2% agarose, and sliced into 250-µm coronal sections on a compresstome (Precisionary). Enzymatic digestion, trituration into single cell suspension, and FACS analysis of single cells were carried out as previously described20, with an example sorting strategy shown in Extended Data Fig. 1e–g. Cells were sorted into 8-well strips containing lysis buffer from the SMART-Seq v4 kit (see below) with RNase inhibitor (0.17 U µl−1), immediately frozen on dry ice, and stored at −80 °C.

Note that the overall relative proportions of cell types in our dataset are not representative of those in the intact brain because of the targeted sampling approach using various Cre lines and possible cell type–specific differences in survival during the isolation procedure.

cDNA amplification and library construction

We used the SMART-Seq v4 Ultra Low Input RNA Kit for Sequencing (Takara, 634894) to reverse transcribe poly(A) RNA and amplify full-length cDNA according to the manufacturer’s instructions. We performed reverse transcription and cDNA amplification for 18 PCR cycles in 8-well strips, in sets of 12–24 strips at a time. A small set of non-neuronal cell samples was amplified by 21 PCR cycles instead of 18 (Supplementary Table 10). At least 1 control strip was used per amplification set, which contained 4 wells without cells and 4 wells with 10 pg control RNA. Control RNA was either Mouse Whole Brain Total RNA (Zyagen, MR-201) or control RNA provided in the SMART-Seq v4 kit. All samples proceeded through Nextera XT DNA Library Preparation (Illumina FC-131-1096) using Nextera XT Index Kit V2 Set A (FC-131-2001). Nextera XT DNA Library prep was performed according to manufacturer’s instructions except that the volumes of all reagents including cDNA input were decreased to 0.4× or 0.5× by volume. The replacement of Clontech’s SMARTer v.159, which we used in our previous study20, with SMART-Seq v.4 kit, which is based on Smart-seq260, increases the efficiency of gene detection. This allowed us to reduce the median sequencing depth from approximately 8.7 million to 2.5 million reads per cell while still detecting 9,500 genes per cell (median) compared to 7,800 previously (Extended Data Fig. 2b). Subsampling of the reads to a median of 0.5 million per cell results in similar gene detection per cell (>89% of genes detected, data not shown), showing that we detect most of the genes at 2.5 million reads per cell. Details are available in ‘Documentation’ on the Allen Institute data portal at: http://celltypes.brain-map.org/.

Sequencing data processing and quality control

Fifty-base-pair paired-end reads were aligned to GRCm38 (mm10) using a RefSeq annotation gff file retrieved from NCBI on 18 January 2016 (https://www.ncbi.nlm.nih.gov/genome/annotation_euk/all/). Sequence alignment was performed using STAR v2.5.361 in twopassMode. PCR duplicates were masked and removed using STAR option ‘bamRemoveDuplicates’. Only uniquely aligned reads were used for gene quantification. Gene counts were computed using the R GenomicAlignments package62 summarizeOverlaps function using ‘IntersectionNotEmpty’ mode for exonic and intronic regions separately. In this study, we only used exonic regions for gene quantification. Cells that met any one of the following criteria were removed: <100,000 total reads, <1,000 detected genes (counts per million > 0), <75% of reads aligned to genome, or CG dinucleotide odds ratio > 0.5. Doublets were removed by first classifying cells into broad classes of glutamatergic, GABAergic, and non-neuronal based on known markers. For each class, we selected a set of highly specific genes that are only present in this class compared to all other classes, and computed the eigengene (the first principal component based on the given gene set), normalized within the 0–1 range. Each cell was assigned to the class with the maximum eigengene. For each class, we computed the mean and standard deviation of the corresponding eigengene for cells outside this class. Any cell in which the eigengene was more than three standard deviations above the mean for the cells outside the class was assigned as a member of that class. On the basis of this criterion, cells that belong to more than one class were defined as doublets.
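The removal criteria and the doublet rule above can be illustrated with a minimal Python sketch. This is not the pipeline code, which was written in R. The function names, the input layout and the use of the population standard deviation are assumptions made only for this example.

```python
from statistics import mean, pstdev

def passes_qc(total_reads, genes_detected, frac_aligned, cg_odds_ratio):
    """Return True if a cell meets none of the four removal criteria."""
    return not (
        total_reads < 100_000
        or genes_detected < 1_000
        or frac_aligned < 0.75
        or cg_odds_ratio > 0.5
    )

def call_doublets(eigengenes, n_sd=3.0):
    """Flag cells that qualify as members of more than one broad class.

    eigengenes: dict mapping class name -> list of per-cell eigengene values
    (already scaled to 0-1), with all lists in the same cell order.
    Assumes every class has at least one cell outside it.
    """
    classes = list(eigengenes)
    n_cells = len(eigengenes[classes[0]])
    # Primary class: the class with the maximum eigengene for that cell.
    primary = [max(classes, key=lambda c: eigengenes[c][i]) for i in range(n_cells)]
    n_memberships = [0] * n_cells
    for c in classes:
        # Background distribution: this class's eigengene in cells outside the class.
        outside = [eigengenes[c][i] for i in range(n_cells) if primary[i] != c]
        cutoff = mean(outside) + n_sd * pstdev(outside)  # pstdev is an assumption
        for i in range(n_cells):
            if eigengenes[c][i] > cutoff:
                n_memberships[i] += 1
    return [k > 1 for k in n_memberships]
```

In this sketch a cell is a doublet when it exceeds the three-standard-deviation cutoff for two or more classes.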

Mapping reads to synthetic constructs

We mapped all non-genome-mapped reads to sequences in Supplementary Table 8. To avoid ambiguous counting due to stretches of sequence identity, we designated unique regions within these sequences to count mRNAs of interest. We counted only reads for which at least one of the paired ends had an overlap with the unique regions of at least 10 bp.

Clustering

Cells that passed quality control criteria were clustered using an in-house developed iterative clustering R package hicat available via GitHub (https://github.com/AllenInstitute/hicat). It was described partially in previous studies20,63, and was modified to improve robustness and adapt to large numbers of cells. In brief, all quality control-qualified cells were grouped into very broad categories using known markers, then clustered using high variance gene selection, dimensionality reduction, dimension filtering, and Jaccard–Louvain or hierarchical (Ward) clustering. This process was repeated within each resulting cluster until no more child clusters met differential gene expression or cluster size termination criteria. The entire clustering procedure was repeated 100 times using 80% of all cells sampled at random, and the frequency with which cells co-cluster was used to generate a final set of clusters, again subject to differential gene expression and cluster size termination criteria. A workflow diagram for this approach is presented in Extended Data Fig. 2. The key strength of this approach is its ability to provide high-resolution cell type categorization that withstands rigorous statistical tests to ensure reproducibility and biological relevance of the results. Below, we provide more details for the analysis carried out at each iteration of clustering:

1. Selection of high-variance genes. We first removed predicted gene models (gene names that start with Gm), genes from the mitochondrial chromosome, ribosomal genes, sex-specific genes, as well as genes that were detected in fewer than four cells. To choose high variance genes, we used gene counts from each cell to fit a Loess regression curve between average scaled gene counts and dispersion (variance divided by mean). The regression residuals were then fit to a normal distribution based on 25% and 75% quantiles to calculate P values and adjusted P values (using Holm’s method), representing the probability that each gene had higher than expected variance. Genes were ranked by adjusted P value.
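The last part of step 1, converting Loess residuals into Holm-adjusted P values, can be sketched in Python as follows. The Loess fit itself is omitted and the residuals are taken as given. How exactly the normal distribution is derived from the 25% and 75% quantiles is an assumption here (centre at the quartile midpoint, standard deviation from the interquartile range), and the function names are hypothetical.

```python
from statistics import NormalDist, quantiles

def holm_adjust(pvals):
    """Holm step-down adjustment; returns adjusted P values in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest P value is multiplied by m, the next by m - 1, and so on,
        # while enforcing that adjusted values never decrease along the ranking.
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

def high_variance_pvalues(residuals):
    """Adjusted P values for 'dispersion higher than expected', one per gene."""
    q25, _, q75 = quantiles(residuals, n=4)
    z75 = NormalDist().inv_cdf(0.75)
    centre = (q25 + q75) / 2            # assumed centre of the null distribution
    sd = (q75 - q25) / (2 * z75)        # IQR of a normal is 2 * z75 * sd; needs sd > 0
    null = NormalDist(centre, sd)
    raw = [1.0 - null.cdf(r) for r in residuals]  # upper-tail probability
    return holm_adjust(raw)
```

Fitting the null from the quartiles keeps a few highly variable genes from inflating the estimated spread.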

2. Dimensionality reduction. We implemented two methods: principal component analysis (PCA) and weighted gene co-expression network analysis (WGCNA). In the PCA mode, top high variance genes with adjusted P < 0.5 were used to compute principal components. The proportion of variance for all principal components was converted to z-scores, and principal components with z-scores >2 were selected for clustering. In the WGCNA mode, the 4,000 genes with the most significant P values were used as input for WGCNA to identify gene modules. Here, we used a more relaxed criterion than in the PCA mode to allow more genes to be included for gene module detection. To determine the discriminative power of each module, we used the genes in each module to divide the cells into two clusters using Jaccard–Louvain clustering64 (for more than 4,000 cells) or a combination of k-means and Ward’s hierarchical clustering (for <4,000 cells). After dividing the cells into two clusters, we computed differential gene expression between the two clusters (see ‘Defining differentially expressed genes’ section). We then computed the differential expression score (deScore), defined as the sum of −log10(adjusted P value) of all differentially expressed genes. For deScore calculations, the maximum value each gene was allowed to contribute was 20. Only modules with deScore greater than 150 were selected for use in downstream analysis, and module eigengenes were computed for selected modules as reduced dimensions. Up to 20 top reduced dimensions were selected for both methods. The two dimensionality reduction approaches are complementary: WGCNA detects rare clusters well, separates biological from technical variation well, and provides cleaner cluster boundaries; PCA is more scalable to large datasets and captures combinatorial marker expression patterns better than WGCNA.
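The selection rule for principal components in the PCA mode can be illustrated with a short Python sketch. The function name and the use of the sample standard deviation for the z-score are assumptions for this example and are not taken from the hicat code.

```python
from statistics import mean, stdev

def select_components(variance_explained, z_cutoff=2.0, max_dims=20):
    """Return indices of principal components whose share of variance stands out.

    variance_explained: proportion of variance for every component, in order.
    """
    mu = mean(variance_explained)
    sd = stdev(variance_explained)  # sample SD; an assumption
    if sd == 0:
        return []  # all components explain the same variance; nothing stands out
    keep = [i for i, v in enumerate(variance_explained) if (v - mu) / sd > z_cutoff]
    # At most 20 top reduced dimensions are retained, as stated in the text.
    return keep[:max_dims]
```

For example, with two components explaining 30% and 20% of the variance and a long tail of components at 0.5% each, only the first two are selected.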

3. Dimension filtering. We have identified systematic technical variation that affects expression of hundreds of genes that we believe is primarily driven by the quality of the single cell cDNA library. The first principal component of these genes is highly correlated with the log-transform of the number of genes detected in each cell, so we define the latter as the quality control eigen. We have also identified a list of genes that contribute to the batch effect for the first set of experiments for this study with subtle protocol differences. We computed batch eigen as the first principal component based on these batch specific genes. We removed any principal components or module eigengenes that have correlation greater than 0.7 with either the quality control eigen or the batch eigen.
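The filtering rule in step 3 can be sketched as follows in Python. The names are hypothetical, and the sketch applies the 0.7 threshold to the signed correlation exactly as worded above. Whether the original code uses the absolute correlation is not stated here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (both must vary)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def filter_dimensions(dims, qc_eigen, batch_eigen, max_cor=0.7):
    """Drop reduced dimensions that track library quality or batch.

    dims: dict mapping dimension name -> per-cell scores.
    qc_eigen: per-cell log-transformed number of detected genes.
    batch_eigen: per-cell first principal component of the batch-specific genes.
    """
    kept = {}
    for name, scores in dims.items():
        worst = max(pearson(scores, qc_eigen), pearson(scores, batch_eigen))
        if worst > max_cor:
            continue  # dominated by technical variation
        kept[name] = scores
    return kept
```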

4. Initial clustering. For clustering, we applied either the Jaccard–Louvain method64 using the Rphenograph package (for >4,000 cells), or Ward’s method (for ≤4,000 cells). Although the Louvain algorithm scales well with large datasets, it has been shown to have a resolution limit65, and small clusters tend to be missed. Therefore, as a complementary approach, we applied Ward’s minimum variance method for hierarchical clustering when fewer than 4,000 cells were to be clustered. The initial number of clusters was set at twice the number of reduced dimensions from step 3.

5. Cluster merging. To make sure the resulting clusters all have distinguishable transcriptomic signatures, we defined differentially expressed genes between every cluster and their two nearest neighbours in the reduced dimension space (using Euclidean distance if there were 1 or 2 dimensions, or 1 minus Pearson correlation for more dimensions). A pair of clusters was considered separable if the deScore (described in step 2) for all differentially expressed genes was greater than 150. If a cluster did not pass this criterion, it was merged with the nearest neighbour cluster, and differentially expressed gene scores were recomputed using the merged clusters. Clusters with fewer than four cells were also merged with their nearest neighbours. This iterative merging process was repeated until all remaining clusters were separable and contained at least 4 cells.

Steps 1–5 were repeated for each resulting cluster until no further partitions were found.

6. Defining consensus clusters. To determine the robustness of the clustering results, the entire clustering procedure was repeated 100 times using 80% of all cells sampled at random in both the PCA and WGCNA modes. We then generated the frequency matrix for co-clustering of every pair of cells in both modes. The final cell-cell co-clustering matrix was defined as the element-by-element minimum of these two matrices, which implies that if two cells belong to the same cluster by one method, but to different clusters by the other method, then their co-clustering probability is considered low and they should be separated into different clusters. We inferred the consensus clusters by iteratively splitting the co-clustering matrix. In any given step, we used the co-clustering matrix as the similarity matrix and performed clustering by either the Louvain (≥4,000 cells) or Ward’s algorithm (<4,000 cells). We defined Nk,l as the average probabilities of cells within cluster k to co-cluster with cells within cluster l. We merged clusters k, l if Nk,l > max(Nk,k, Nl,l) – 0.25. We merged remaining clusters based on differentially expressed genes as described in step 5 using a deScore threshold of 150.
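A minimal Python sketch of the quantities used in step 6 is given below. It is an illustration under stated assumptions. The co-clustering frequency of a pair is normalized by the number of runs in which both cells were sampled. Nk,k includes each cell paired with itself. All function names are hypothetical.

```python
def coclustering_frequency(runs, n_cells):
    """runs: list of dicts {cell index: cluster label}, one per subsampled run."""
    together = [[0] * n_cells for _ in range(n_cells)]
    sampled = [[0] * n_cells for _ in range(n_cells)]
    for labels in runs:
        cells = list(labels)
        for i in cells:
            for j in cells:
                sampled[i][j] += 1
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n_cells)]
        for i in range(n_cells)
    ]

def elementwise_min(a, b):
    """Combine the PCA-mode and WGCNA-mode matrices conservatively."""
    return [[min(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def mean_cocluster(matrix, cells_k, cells_l):
    """N_{k,l}: average co-clustering probability between two sets of cells."""
    values = [matrix[i][j] for i in cells_k for j in cells_l]
    return sum(values) / len(values)

def should_merge(matrix, cells_k, cells_l, margin=0.25):
    """Merge rule from the text: N_{k,l} > max(N_{k,k}, N_{l,l}) - 0.25."""
    n_kl = mean_cocluster(matrix, cells_k, cells_l)
    n_kk = mean_cocluster(matrix, cells_k, cells_k)
    n_ll = mean_cocluster(matrix, cells_l, cells_l)
    return n_kl > max(n_kk, n_ll) - margin
```

Taking the element-by-element minimum means that a pair of cells counts as reliably co-clustered only if both modes agree.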

7. Cluster refinement. For each cell i, we computed the average probability that it co-clustered with cells in each cluster k as Mi,k, and we reassigned every cell i to the cluster k with maximum Mi,k. We repeated this process until convergence.
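The refinement loop in step 7 can be sketched in Python as follows. The function names are hypothetical. Excluding a cell from its own cluster when averaging, and breaking ties by label order, are assumptions made for this example.

```python
def mean_affinity(matrix, cell, members):
    """M_{i,k}: mean co-clustering probability of `cell` with the members of a cluster."""
    others = [j for j in members if j != cell]
    if not others:
        return 0.0
    return sum(matrix[cell][j] for j in others) / len(others)

def refine_assignments(matrix, assignment, max_iter=100):
    """Reassign every cell to its highest-affinity cluster until nothing changes."""
    assignment = list(assignment)
    for _ in range(max_iter):
        clusters = {}
        for cell, label in enumerate(assignment):
            clusters.setdefault(label, []).append(cell)
        updated = [
            max(sorted(clusters), key=lambda k: mean_affinity(matrix, cell, clusters[k]))
            for cell in range(len(assignment))
        ]
        if updated == assignment:
            break  # converged
        assignment = updated
    return assignment
```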

8. Exclusion of outlier clusters. After defining consensus clusters, we examined our clustering results to identify outlier clusters that are likely to be due to technical artefacts. These clusters fall into three categories: clusters of doublets, clusters of low-quality cells, and clusters driven by batch effects. A cluster was defined as a doublet cluster if it had signatures from two distinctive cell subclasses, for example, smooth muscle cells and neurons. Low-quality clusters were defined as clusters with significantly lower gene counts compared to the nearest cluster in taxonomy, and with few or no significantly enriched genes. We also identified two clusters that contain only retrogradely labelled cells. These two clusters are very similar to two other distinctive clusters, but contain shared additional signatures that we suspect were due to technical variation in retrograde experiments, so they were annotated as outlier clusters.

Constructing the cell type taxonomy tree

To build the cell type tree, we computed up to the top 50 differentially expressed genes in both directions for every pair of clusters, and assembled unique entries into a marker list of 4,020 genes. We calculated median expression of these marker genes per cluster as cluster centroid, and applied hierarchical clustering with average linkage on the correlation matrix of cluster centroids to infer the cell type taxonomy tree. The confidence for each branch of the tree was estimated by the bootstrap resampling approach from the R package pvclust v.2.0. A comparison between the uncollapsed dendrogram and collapsing at >0.4 is presented in Extended Data Fig. 3. For display in figures, we collapsed the dendrogram to branches with a confidence score >0.4.

Assigning core and intermediate cells

In our previous study, post-clustering, we applied a random forest classifier to test our cluster assignments, and to define core and intermediate cells20. We found that random forest classification penalized small clusters, so we used a nearest-centroid classifier, which assigns a cell to the cluster whose centroid is the closest (with the highest correlation) to the cell. Here, the cluster centroid is defined as the median expression of 4,020 differentially expressed genes. To define core versus intermediate cells, we performed fivefold cross-validation 100 times: in each round, the cells were randomly partitioned into five groups, and cells in each group of 20% of the cells were classified by a nearest-centroid classifier trained using the other 80% of the cells. A cell classified to the same cluster more than 90 times was defined as a core cell; the others were designated intermediate cells. We define 21,195 core cells and 2,627 intermediate cells, which, in most cases, classify to only two clusters, one of which is the original cluster (2,492 out of 2,627; 94.9%).
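The core versus intermediate assignment can be illustrated with the following Python sketch of a nearest-centroid classifier under repeated fivefold cross-validation. It is a simplified illustration and not the code used for the analysis. The function names and the random seed are hypothetical. "Same cluster" is read here as the cell's original cluster.

```python
import random
from math import sqrt
from statistics import median

def pearson(x, y):
    """Pearson correlation; both profiles must have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def centroids(expr, labels, cells):
    """Median marker expression per cluster, using only the given training cells."""
    rows = {}
    for c in cells:
        rows.setdefault(labels[c], []).append(expr[c])
    return {k: [median(col) for col in zip(*r)] for k, r in rows.items()}

def classify(profile, cents):
    """Assign a cell to the cluster whose centroid correlates best with it."""
    return max(sorted(cents), key=lambda k: pearson(profile, cents[k]))

def core_cells(expr, labels, n_rounds=100, n_folds=5, min_hits=90, seed=0):
    """Return True for core cells, False for intermediate cells."""
    rng = random.Random(seed)
    n = len(expr)
    hits = [0] * n
    for _ in range(n_rounds):
        order = list(range(n))
        rng.shuffle(order)
        for f in range(n_folds):
            test = order[f::n_folds]          # roughly 20% of the cells
            held_out = set(test)
            train = [c for c in order if c not in held_out]
            cents = centroids(expr, labels, train)
            for c in test:
                if classify(expr[c], cents) == labels[c]:
                    hits[c] += 1
    # Each cell is tested once per round; core means more than 90 of 100 agreements.
    return [h > min_hits for h in hits]
```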

Assigning cluster names

The marker genes included in cluster names were selected to be unique either individually or as a combination within our universe of cell types. We considered differentially expressed genes (see ‘Defining differentially expressed genes’ section below) at different levels of taxonomy: globally specific, within-class specific, within-subclass specific, and specific compared to the nearest sibling cluster. We also evaluated marker genes for the completeness of expression within the cluster that would be named after that gene. From this list of markers, we visually inspected marker specificity by examining gene expression at the single-cell level in clusters of interest. Many genes satisfied criteria of good marker genes, and therefore many alternatives for cluster naming exist. We gave preferences to globally unique genes (for example, Chodl, included in the Sst–Chodl cluster name) and markers that are expressed in all or a large proportion of cells within the cluster. For example, Lamp5–Lhx6 could also be called Lamp5–Nkx2.1. We chose Lhx6 as it is expressed in every cell of this cluster whereas Nkx2.1 is not, although Nkx2.1 is expressed in a smaller number of cell types overall.

Defining differentially expressed genes

Differentially expressed genes were detected using the R package limma v.3.30.1366 using log2(CPM + 1) of expression values. We did not perform any tests of normality before performing differentially expressed gene tests. Differentially expressed genes were defined as genes with a more than twofold change and adjusted P < 0.01. We also required these genes to have a relatively bimodal expression pattern, expressed predominantly in one cluster relative to the other. To do that, we computed Pi,j as the fraction of cells in cluster j expressing gene i with CPM ≥ 1, and required upregulated genes i in cluster c1 relative to c2 to have Pi,c1 > q1.th (q1.th = 0.5), and (Pi,c1 – Pi,c2)/max(Pi,c1, Pi,c2) > q.diff.th (q.diff.th = 0.7). We define the deScore as the sum of the −log10(adjusted P value) of all differentially expressed genes. For deScore calculations, the maximum value each gene was allowed to contribute was 20. The deScores used for Extended Data Fig. 14f are: 80, low stringency; 150, standard; and 300, high stringency.
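The gene-level criteria and the deScore can be written out as a short Python sketch. The statistical test itself (limma) is not reproduced. The fold change and adjusted P value are taken as inputs, and the function names are hypothetical.

```python
from math import log10

def is_de_gene(log2_fc, adj_p, frac_c1, frac_c2, q1_th=0.5, q_diff_th=0.7):
    """Decide whether a gene counts as upregulated in cluster c1 relative to c2.

    log2_fc: log2 fold change (c1 over c2); adj_p: adjusted P value;
    frac_c1, frac_c2: fractions of cells with CPM >= 1 in each cluster.
    """
    if log2_fc <= 1.0 or adj_p >= 0.01:   # more than twofold, adjusted P < 0.01
        return False
    if frac_c1 <= q1_th:                  # must be expressed in most c1 cells
        return False
    # Relative difference in detection, which enforces a near-binary pattern.
    return (frac_c1 - frac_c2) / max(frac_c1, frac_c2) > q_diff_th

def de_score(adj_pvalues, cap=20.0):
    """Sum of -log10(adjusted P) over DE genes, each gene capped at 20."""
    return sum(cap if p <= 0 else min(cap, -log10(p)) for p in adj_pvalues)
```

With the standard threshold, two clusters are separable when `de_score(...)` for their differentially expressed genes exceeds 150, which requires at least eight genes because each gene contributes at most 20.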

Retro-seq quality control and analysis

All retrogradely labelled cells underwent the same experimental and data processing and quality control as all other cells, and were clustered together with all other quality control-qualified single-cell transcriptomes. Clustering was performed blinded to the experimental source of retrogradely labelled cells. After clustering, we performed an additional quality control step, in which we examined the dissection images and annotated the injection sites for specificity. To define the ‘annotated retro-seq dataset’ (Extended Data Fig. 2e), we excluded single-cell samples derived from incorrectly targeted injections or from injections that displayed significant labelling along the needle tract. Figure 3 and Extended Data Fig. 10 were generated based on this dataset.

Correspondence between VISp and ALM glutamatergic clusters

To establish correspondence in both directions, we classified VISp glutamatergic cells using ALM glutamatergic clusters as training data, and vice versa. In both cases, we trained the nearest centroid classifier based on a common set of glutamatergic markers (the pool of the top 50 differentially expressed genes in each direction between glutamatergic clusters within VISp or within ALM) shared by both regions, and calculated the fraction of cells in each VISp cluster that mapped to each of the ALM clusters, and vice versa. For each cell, we computed the correlation score of the best mapping cluster, and transformed the correlation scores into z-scores. If the average z-score of cells from one cluster mapped to another cluster in the other region was below −1.64 (roughly the lower 5% tail of a standard normal distribution), this cluster was considered to be unique to one region, with no corresponding cluster in the other region. For Fig. 2c, we used matched types as described above, or split each type into its ALM and VISp portions. Differentially expressed genes were calculated for all pairwise comparisons between type-specific and region-specific portions within glutamatergic samples and GABAergic samples. For each gene, two measures were calculated: a ratio of proportions ((proportion of cells in ALM − proportion in VISp) divided by whichever proportion is higher, x axis) and the proportion of cells in whichever region has a greater proportion of cells expressing each gene (y axis). Proportions were computed separately for glutamatergic and GABAergic cells.
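Two of the calculations above, the z-score rule for region-specific clusters and the two per-gene measures plotted in Fig. 2c, can be sketched in Python as follows. The function names and input layout are hypothetical. Computing the z-scores over all mapped cells with the population standard deviation is an assumption. The grouping by target cluster is simplified to grouping by source cluster.

```python
from statistics import mean, pstdev

def region_specific_clusters(best_cor, source_cluster, z_cutoff=-1.64):
    """Return source clusters whose cells map poorly to the other region.

    best_cor: per-cell correlation with the best-mapping cluster (must vary).
    source_cluster: per-cell label of the cluster the cell comes from.
    """
    mu, sd = mean(best_cor), pstdev(best_cor)
    by_cluster = {}
    for cor, label in zip(best_cor, source_cluster):
        by_cluster.setdefault(label, []).append((cor - mu) / sd)
    return sorted(k for k, zs in by_cluster.items() if mean(zs) < z_cutoff)

def proportion_contrast(p_alm, p_visp):
    """Per-gene (x, y) measures: signed ratio of proportions and the higher proportion."""
    top = max(p_alm, p_visp)
    x = (p_alm - p_visp) / top if top else 0.0
    return x, top
```

In this sketch, x is +1 for a gene detected only in ALM, −1 for a gene detected only in VISp, and 0 when the two proportions are equal.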

Assessing correspondence to the Paul et al. (2017)36 dataset

We mapped cells from Gene Expression Omnibus (GEO) accession GSE9252236 to our GABAergic clusters using the nearest centroid classifier based on a set of shared GABAergic markers that were detected in both datasets (expression >0). To estimate the robustness of mapping, we repeated classification 100 times, each time using 80% of randomly sampled markers, and computed the probabilities for every cell to map to every reference cluster.

Assessing correspondence to Cadwell et al. (2016)38 Patch-seq dataset

We mapped cells from the ArrayExpress accession E-MTAB-4092 dataset38 to our clusters (using only VISp cells) using the nearest centroid classifier with 100 sub-sampling rounds as described in the paragraph above. Cells mapped to clusters with probabilities <80% were mapped to the parent nodes of the mapped clusters within the cell type hierarchy, until aggregated confidence at the parent node was >80%.
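The step of moving a low-confidence cell up the cell type hierarchy can be sketched in Python as follows. The tree encoding, the function name and the cluster names in the usage example are hypothetical. The sketch accepts a node at exactly 80%, which is an assumption about how the boundary is handled.

```python
def resolve_mapping(cluster_probs, parent, min_conf=0.8):
    """Map one cell to the most specific tree node with enough aggregated confidence.

    cluster_probs: dict leaf cluster -> mapping probability for this cell.
    parent: dict node -> parent node (the root has no entry).
    Returns (node, aggregated probability at that node).
    """
    def lineage(node):
        chain = [node]
        while node in parent:
            node = parent[node]
            chain.append(node)
        return chain

    best_leaf = max(sorted(cluster_probs), key=lambda k: cluster_probs[k])
    conf = 0.0
    for node in lineage(best_leaf):
        # Aggregate the probabilities of all leaves that sit under this node.
        conf = sum(p for leaf, p in cluster_probs.items() if node in lineage(leaf))
        if conf >= min_conf:
            return node, conf
    return lineage(best_leaf)[-1], conf  # fall back to the root
```

For example, a cell split 0.55 to 0.40 between two sibling clusters is reported at their shared parent node rather than at either leaf.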

Assessing correspondence to Hrvatin et al. (2018)40 dataset

We mapped VISp cells from our dataset to GEO accession GSE10282740 using the same strategy described above. We chose the Hrvatin et al.40 dataset as reference because the cells profiled by inDrop have lower gene detection, and cannot be mapped to our high-resolution clusters confidently, whereas our cells can be mapped to clusters from the previous dataset40 with high confidence. To define early-response genes (ERGs) and late-response genes (LRGs) within each cluster in the previously published dataset40, differentially expressed genes were computed between samples with 1 h or 4 h after exposure to light versus no exposure. We used the approach described above, with the following criteria: > twofold change, adjusted P < 0.01, q1.th = 0.05, q.diff.th = 0.5. We computed average ERGs and LRGs for all our VISp cells mapped to this cluster, and plotted their distribution based on our cluster annotation. We then used a two-sided t-test to compute the significance for enrichment/depletion of average ERG and LRG expression for each of our cell types against the other types mapped to the same Hrvatin cluster, and defined significant values as having a P < 0.01, after correction for multiple hypotheses using the Holm method, and average fold change greater than 2.

Measures of heterogeneity within L4–IT–VISp–Rspo1 and between L4–IT–VISp–Rspo1 and related clusters

To explore the heterogeneity of the L4–IT–VISp–Rspo1 cluster, which corresponds to three separate cell types in our previous study20 (Extended Data Fig. 5), we first removed the quality control-dependent gene expression signatures by regressing the expression of each gene against the quality control index, defined as the ratio between the fraction of the reads mapped to intronic regions over the reads mapped to exonic regions. Compared to other cell types, L4 cells have a high fraction of intronic reads, likely indicating high nuclear content. There is also considerable variation of this quality control index among L4 cells, which confounds other transcriptomic signatures. After normalization, we performed WGCNA to find co-expressed gene modules within cells from L4–IT–VISp–Rspo1. We found that the eigengene for the top gene module within L4–IT–VISp–Rspo1 corresponds to the gradient that drove separation of L4 subtypes previously20. We then took the 50 cells at both ends of the eigengene-defined gradient, trained a random forest classifier using the genes from the WGCNA gene module, and tested it on the remaining cells to assign them to the ends of the gradient. The classification probabilities by random forest strongly correlated with the gradient eigengene (Extended Data Fig. 14d). We repeated the same analysis between L4–IT–VISp–Rspo1 and the neighbouring L5–IT–VISp–Hsd11b1–Endou cluster, and between L4–IT–VISp–Rspo1 and more distant L5–IT–VISp–Batf3 cluster. The eigengenes for these comparisons were defined as the first principal component of the top 50 differentially expressed genes in both directions. In both cases, the classifier was trained on 50 sampled cells from each cluster based on the selected differentially expressed genes, and tested on the remaining cells. We applied Kolmogorov–Smirnov tests to determine whether the distribution of classification probabilities is uniform for each of the three cases above.
To account for the differences in sample size, we sampled 400 tested L4–IT–VISp–Rspo1 cells for the first case, and up to 200 cells from each cluster for the latter two cases. The Kolmogorov–Smirnov test gave P = 2.64 × 10−5 within the L4–IT–VISp–Rspo1 gradient. Between the neighbouring clusters L4–IT–VISp–Rspo1 and L5–IT–VISp–Hsd11b1–Endou, the random forest classification probabilities deviated from uniform distribution more significantly (Kolmogorov–Smirnov test P = 4.37 × 10−13). When cells in the L4–IT–VISp–Rspo1 cluster were compared with the more distant L5–IT–VISp–Batf3 cluster, the separation was clear (Kolmogorov–Smirnov test P = 0): classification probabilities have a bimodal distribution and cluster separation is discrete. Finally, we split the L4–IT–VISp–Rspo1 cells into five bins based on random forest classification probabilities and computed the differentially expressed genes between the two bins at both ends of the gradient and the bin at the middle of the gradient (Extended Data Fig. 14d).

RNA FISH

We performed RNA FISH using RNAscope Multiplex Fluorescent v1 and v2 kits (Advanced Cell Diagnostics) according to the manufacturer’s protocols. We used fresh frozen sections, which we prepared by dissecting fresh brains, embedding the brains in optimum cutting temperature compound (OCT; Tissue-Tek), and storing OCT blocks at −80 °C. Ten-micrometre coronal sections were cut using a cryostat and collected on SuperFrost slides (ThermoFisher Scientific). Sections were allowed to dry for 30 min at −20 °C in a cryostat chamber, placed into pre-chilled plastic slide boxes, wrapped in a zipped plastic bag, and stored at −80 °C. Slides were used within one week. Nuclei were labelled by DAPI and nuclear signal was used to segment cells in images. We imaged mounted sections at 40× on a confocal microscope (Leica SP8). Maximum projections of z-stacks (1-µm intervals) were processed using CellProfiler (http://www.cellprofiler.org)67 to identify nuclei, quantify the number of fluorescent spots, and assign fluorescent spots to each cell/nucleus.

Immunohistochemistry

Mice were perfused with 4% paraformaldehyde (PFA). Brains were dissected and post-fixed with 4% PFA at room temperature for 3–6 h followed by overnight at 4 °C. Brains were rinsed with PBS and cryoprotected in 10% sucrose (w/v) in PBS with 0.1% sodium azide overnight at 4 °C. One-hundred-micrometre coronal slices were sectioned on a microtome (Leica, SM2010R), washed with PBS, blocked with 5% normal donkey serum in PBS and 0.3% Triton X-100 (PBST) for 1 h, and stained with rabbit anti-dsRed (1:1,000, Clontech, 632496) and goat anti-PVALB (1:1,000, Swant, PVG-213) overnight at room temperature. Sections were washed three times in PBST and incubated with anti-rabbit Alexa 594 (1:500, Jackson ImmunoResearch, 711-585-152) and anti-goat Alexa 488 (1:500, Jackson ImmunoResearch, 705-605-147) for 4 h at room temperature. Sections were washed three times with PBST and stained with 5 µM DAPI in PBS for 20 min. After washing in PBST, sections were mounted onto slides, allowed to dry, rehydrated in PBS, dipped in water and coverslips were added with Fluoromount G (SouthernBiotech, 0100-01) mounting medium.

Data analysis and visualization software

Analysis and visualization of transcriptomic data were performed using R v.3.3.0 and greater68, assisted by the Rstudio IDE (Integrated Development Environment for R v.1.1.442; https://www.rstudio.com/) as well as the following R packages: cowplot v.0.9.2 (https://rdrr.io/cran/cowplot/), dendextend v.1.5.269, dplyr v.0.7.4 (https://dplyr.tidyverse.org/), feather v0.3.1 (https://rdrr.io/cran/feather/), FNN v.1.1 (https://cran.r-project.org/web/packages/FNN/index.html), ggbeeswarm v.0.6.0 (https://cran.r-project.org/web/packages/ggbeeswarm/index.html), ggExtra v.0.8 (https://rdrr.io/cran/ggExtra/), ggplot2 v.2.2.170, ggrepel v.0.7.0 (https://cran.r-project.org/web/packages/ggrepel/vignettes/ggrepel.html), googlesheets v.0.2.2 (https://cran.r-project.org/web/packages/googlesheets/vignettes/basic-usage.html), gridExtra v.2.3 (https://cran.r-project.org/web/packages/gridExtra/index.html), Hmisc v.4.1-1 (https://cran.r-project.org/web/packages/Hmisc/index.html), igraph v.1.2.1 (https://www.rdocumentation.org/packages/igraph/versions/1.2.1), limma v.3.30.1366,71, Matrix v.1.2-12 (https://rdrr.io/rforge/Matrix/), matrixStats v.0.53.1 (https://cran.rstudio.com/web/packages/matrixStats/index.html), pals v.1.5 (https://rdrr.io/cran/pals/), purrr v.0.2.4 (https://purrr.tidyverse.org/), pvclust v.2.0-0 (http://stat.sys.i.kyoto-u.ac.jp/prog/pvclust/), randomForest v.4.6-1472, reshape2 v.1.4.2 (https://www.statmethods.net/management/reshape.html), Rphenograph v.0.99.1 (https://rdrr.io/github/JinmiaoChenLab/Rphenograph/), Rtsne v.0.14. (https://cran.r-project.org/web/packages/Rtsne/citation.html), Seurat v.2.1.073, viridis v.0.5.0 (https://rdrr.io/cran/viridisLite/man/viridis.html), WGCNA v.1.6174, and xlsx v.0.5.7 (https://cran.r-project.org/web/packages/xlsx/index.html). Scripts for the R implementation of FIt-SNE75 were used for t-SNE analyses.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Code availability

Software code used for data analysis and visualization is available from GitHub at https://github.com/AllenInstitute/tasic2018analysis/. An R package for iterative clustering (hicat) is available on GitHub at https://github.com/AllenInstitute/scrattch.hicat. The dataset is available for download and browsing on the Allen Institute for Brain Science website: http://celltypes.brain-map.org/rnaseq.