Integrated single cell and unsupervised spatial transcriptomic analysis defines molecular anatomy of the human dorsolateral prefrontal cortex

Summary: Generation of a molecular neuroanatomical map of the human prefrontal cortex reveals novel spatial domains and cell-cell interactions relevant for psychiatric disease. The molecular organization of the human neocortex has been historically studied in the context of its histological layers. However, emerging spatial transcriptomic technologies have enabled unbiased identification of transcriptionally-defined spatial domains that move beyond classic cytoarchitecture. Here we used the Visium spatial gene expression platform to generate a data-driven molecular neuroanatomical atlas across the anterior-posterior axis of the human dorsolateral prefrontal cortex (DLPFC). Integration with paired single nucleus RNA-sequencing data revealed distinct cell type compositions and cell-cell interactions across spatial domains. Using PsychENCODE and publicly available data, we map the enrichment of cell types and genes associated with neuropsychiatric disorders to discrete spatial domains. Finally, we provide resources for the scientific community to explore these integrated spatial and single cell datasets at research.libd.org/spatialDLPFC/.

. Details of tissue acquisition, handling, processing, dissection, clinical characterization, diagnoses, neuropathological examinations, and quality control measures have been described previously (48). Tissue blocks from the anterior (Ant), middle (Mid) and posterior (Post) positions along the rostral-caudal axis of human DLPFC were microdissected from the post-mortem brains of 10 adult neurotypical control donors (n=10 Ant, n=10 Mid, n=10 Post DLPFC blocks).
Each tissue block was dissected from Brodmann Area 46 in a plane perpendicular to the pial surface to capture L1-6 and the WM.
Tissue processing and quality control Following microdissection, tissue blocks were embedded in cold OCT and snap frozen in isopentane on dry ice and stored at -80℃. Tissue blocks were equilibrated to -20℃ before cryosectioning at a thickness of 10uM. Prior to cutting sections for Visium experiments, sections of the tissue blocks were cut to complete two quality control steps: 1) H&E staining to assess gross neuroanatomical structure, and 2) fluorescence multiplex RNAscope single molecule fluorescence in situ hybridization (smFISH) to ensure presence of all 6 layers and white matter ( Fig. 1). H&E staining and RNAscope were performed according to manufacturer's instructions as previously described (49), and images were acquired using an Aperio CS2 slide scanner Visium H&E data generation One 10uM section from each of the 30 blocks was collected onto a 10x Visium Gene Expression slide, and processed according to manufacturer's instructions (10x Visium Gene Expression protocol) as previously described (6). Samples were processed in rounds by anatomical location (Ant, Mid, Post) with 4 different donors included on each slide. In brief, the workflow includes permeabilization of the tissue (18 minutes) to allow access to mRNA, followed by reverse transcription, collection of cDNA from the slide, and library construction. Libraries were quality controlled and sequenced on an Illumina NovaSeq 6000 at the Johns Hopkins Single Cell Transcriptomics core according to manufacturer's instructions at a minimum depth of 50,000 reads per Visium spot. Samples were sequenced to a median depth of 275,278,056 reads, corresponding to a median 67,211 of mean reads per spot, a median 2,257 of median unique molecular indices (UMIs) per spot, and median of 1,380 median genes per spot. After sequencing, we plotted established marker genes for neurons (SNAP25), white matter (MBP), and layer 5 (PCP4) to further validate the morphology of each tissue block (Fig S1, Fig S2,   Fig S3).
Visium raw data processing and quality control (QC) FASTQ and image data were pre-processed with the 10x SpaceRanger pipeline (version 1.3.0 for Visium H&E and 1.3.1 for Visium-SPG). Outputs of SpaceRanger, including the spotcounts matrix, were stored in a SpatialExperiment 1.6.0 (50) object (Bioconductor version 3.14) for downstream analysis. Data were filtered to remove genes that were not detected and spots with zero counts. We used scran 1.24.0 (Bioconductor version 3.14) (18) to calculate QC metrics, including mitochondrial expression, low library size, and low gene count. Only low library size spots were discarded: 4,866 (4.1%) spots, with 29 to 566 per sample with a median of 100 ( Fig S4). We performed dimensionality reduction using the scater 1.22.0 package (51) to calculate the top 50 PCs followed by using the Harmony package 0.1.0 (52) to perform batch-correction for the effect of donors (Fig S8).

Visium H&E
The high-resolution images obtained on the 10x Genomics Visium platform were processed using VistoSeg (53), a MATLAB-based pipeline that integrates gene expression data with histological data. First, the software splits the high-resolution image of the entire slide into individual capture areas (.tif file format), which are used as input to 1) Loupe Browser (10x Genomics) for fiducial frame alignment, and 2) SpaceRanger 1.3.0 to extract the spot metrics/coordinates. We then used the VNS() function from VistoSeg to obtain the initial nuclei segmentations. For sections lacking tissue artifacts, we used VNS() nuclei segmentation outputs, and for tissue sections with artifacts like tears and folds, we applied a secondary refining step called the refineVNS() function, which segments more obvious nuclei from background signals. We then applied a watershed function to separate nuclei that were in close proximity and a size threshold to only retain segmented objects in the nuclei size range (30 to 4000 pixels). These final segmentations were used to obtain the nuclei count per Visium spot.
We took two approaches for spot level counting with different stringencies: 1) we counted all segmented nuclei whose centroids were within a given Visium spot and 2) we calculated the proportion of pixels within a given Visium spot in which a segmented nucleus was present. A function in VistoSeg then uses the spot metrics and coordinates from SpaceRanger to integrate the segmented nuclei counts with gene expression data.

Visium-SPG
We leveraged the immunofluorescence (IF) images in the Visium-SPG assay to directly quantify cell type abundance and provide an imaging-based calculation of cell counts for comparison with outputs from spot deconvolution tools. Starting from the cyto model provided by Cellpose 2.0 (54), nuclear segmentations of Br2720_Ant_IF and Br6522_Ant_IF were iteratively improved in the Cellpose GUI, and a new model was trained until DAPI segmentations visually satisfied our accuracy standards. The final refined model was used to segment all four sections to produce masks. Segmentation masks were read into python 3.8.12, expanded to include a small region around each nucleus, and mean fluorescence intensity for the NeuN, OLIG2, and TMEM119-marked channels was quantified using skimage.measure.regionprops_table() from the scikit-image (55) v0.19.2 library. For each tissue section, an expert experimenter (co-author A.N.) used Samui Browser (30) to randomly select and annotate a spatially diverse set of 30 cells of each immunolabeled cell type -e.g. neuron, oligodendrocyte, astrocyte, microglia, or other (i.e. presence of DAPI, but absence of other immunofluorescent signals). The resulting dataset included 600 cells with a cell type label, and the corresponding mean fluorescence intensities from the respective channels.
To address concerns regarding possible biased selection of cells to annotate, a logistic-regression model was trained on the dataset using scikit-learn v1.1.1 (56), applied to predict un-annotated cells in all four sections. For each cell, the maximal probability of the model over any cell type was recorded as an indication of the model's prediction confidence. For each section and cell type, confidences were binned into four quartiles, and four cells from each such bin were randomly selected to form a list of 320 new cells to annotate. This process was designed to ensure a diverse set of cells, including some difficult-to-classify ones, were included in the full dataset, to improve model generalization. The original 600 cells were divided in an 80-20 training-test split, evenly stratified by section and cell type. The newer 320 cells were split 75-25, to ensure that 3 cells from each section, cell type, and quartile were included in the training set, with the remaining cell in the test set. A decision-tree classifier, built with tree.DecisionTreeClassifier(max_depth = 4)in scikit-learn (56), was trained, and an accuracy of 86.0% was achieved on the test data. The trained decision tree then predicted exact cell-type counts for each spot across the four Visium-IF sections, and this was compared against the results derived from Tangram, Cell2location, and SPOTlight.

Evaluating the impact of histology artifacts on Visium H&E data
The presence of artifacts, such as wrinkles and folds that result from tissue sectioning and sample preparation, can lead to misleading data. However, excluding spots that are believed to be artifacts diminishes sample size and the statistical power of downstream tests. To determine the impact of including spots from artifact regions in our analyses, an expert experimenter (K.R.M) manually annotated artifacts and histological layers for 3 samples: Br6522_Anterior, Br6522_Middle and Br6522_Posterior (Fig S5A). For each annotated sample, we used the one-sided Wilcoxon rank sum test to evaluate whether spots labeled as artifacts had greater library size, number of detected genes and estimated number of nuclei from VistoSeg, compared to spots that were not artifacts. Similarly, we evaluated whether spots in regions of artifacts were expected to have a lower percentage of mitochondrial gene expression. Since artifact spots were observed to have both a higher number of nuclei and higher library sizes (Fig S5B), we examined whether the changes in the density of nuclei had an effect on gene expression that was not accounted for by library size normalization. Specifically, we selected protein-coding genes and excluded genes expressed in fewer than 5% of all spots and fit the following regression model with a negative binomial (NB) error distribution and log-link function to estimate the effect of the presence of artifacts on the expression of each gene: where is the vector of UMI counts assigned to gene and is a binary vector which indicates whether a spot is an artifact or not. Using the NB parameterization with mean and variance µ , we estimate the intercept , slope and overdispersion parameter for each gene, µ + µ 2 θ β 0 β θ and regularize by aggregating information across genes with similar mean UMI counts using the sctransform v0.3.3 R package (57,58). Alternatively, we considered a negative binomial regression model with both library size and the presence of artifacts as independent effects: where is the library size of spot When we include library size as an independent = ∑ .
effect we observed that the regularized effect corresponding to artifacts was non-existent across all three samples (Fig S5C). Finally, we compared the estimate of the mean and variance of log-normalized gene expression computed across all spots to estimates obtained after excluding artifacts and found them to be similar. (Fig S5D).
To evaluate the impact of artifacts on spot-level downstream analyses, including clustering, we considered the complete dataset of 36,601 genes and all spots which were annotated to be found in a region with tissue. We first excluded genes with zero expression across all cells and spots which were low-quality due to inadequate library size as described previously. Since mitochondrial genes, ribosomal genes and MALAT1 are highly expressed and believed to be technical rather than associated with a biological process of interest, we excluded them. We then log-normalized that expression data and selected the top 10% of highly variable genes to compute 50 expression PCs and obtained a UMAP embedding of the spots. We did not observe distinct clustering of the artifact spots when the data was visualized using UMAP ( Fig S6A). Assuming that each PC (Principle Component) represents a distinct axis of variation unique to a subpopulation, we estimated the minimum number of PCs required to represent all subpopulations present, which corresponds to 10 PCs for Br6522_ant, Br6522_mid and 9PCs for Br8667_post and performed shared nearest neighbor clustering using the selected subset of PCs. For each cluster obtained, we characterized the fraction of spots which were artifacts and the relative layer proportions (Fig S6B). Further, we computed the variance explained by known sample characteristics including library size, percentage mitochondrial gene expression, layer of origin, and presence of artifacts using ANOVA ( Fig S6C). Finally, we performed differential gene  Table S5).
One possible manner in which artifacts can lead to misleading data occurs when a spot contains nuclei from two distinct biological regions of DLPFC (heterotypic spots) which would manifest either as a novel layer or lead to lack of separation between layers. In contrast, artifact spots formed due to wrinkling or folding of a single layer or transcriptionally similar layers (homotypic spots) are indistinguishable following normalization of gene expression. We artificially generated heterotypic spots across 28 potential layer combinations for Br6522_ant and Br6522_mid and 10 potential layer combinations for Br8667_post by randomly sampling weights and such that and . We then computed the samples, performed joint-data normalization and obtained a lower dimensional UMAP embedding (Fig S7A). For each spot we determined the 10 nearest neighbors using the shared nearest neighbor algorithm. Finally, we aggregated across all spots belonging to a specific artifact category and estimated the mean number of nearest neighbors that are non-artifacts, artifacts and simulated doublets to evaluate whether artifact regions had a greater fraction of neighbors that were either alternate artifact spots or simulated doublets (Fig S7B). We observed that artifact spots did not disproportionately co-cluster with simulated heterotypic doublets.
Alternatively, the presence of artifacts could lead to the observation of spots which have a cell-type composition that differs from non-artifact spots owing to overlapping biologically distinct layers. We used Tangram to assign cells of 7 distinct cell-types to both artifact and non-artifact spots in all three samples. For each spot, we summed across all cells belonging to a particular cell type to obtain the number of cells of each type present in the spot. We considered the Pearson correlation between a pair of cell-types within a layer to approximate how likely two cell types co-localized. Then we computed the cell-type correlation using all spots and compared it to the correlation obtained after excluding artifact spots to evaluate whether including artifact spots changed our beliefs about the cell type composition of spots using the st1()function from the corTest v1.0.7 R package. We then corrected the p-values obtained across all tests to obtain the FDR and observed no significant differences between to cell-type correlation computed from all spots and excluding artifact spots at FDR < 0.1 (Fig S7C, Table S6).

Unsupervised clustering of spatial data
To select an unsupervised clustering method for robust identification of spatial domains (SpDs) that best corresponded to cortical laminae in the Visium data, we evaluated three algorithms: shared nearest neighbors graph-based clustering (18), BayesSpace v1.5.1 (7), and SpaGCN v1.2.0 (9). These algorithms incorporated different features, including batch clustering (graph-based and BayesSpace), spatial awareness (BayesSpace and SpaGCN), and histology-weighted clustering (SpaGCN). Performance was benchmarked with our previously published DLPFC Visium data (12 sections from 3 donors; (6)) by comparing clustering accuracy against manual layer annotations using the Adjusted Rand Index (ARI), where a higher value represents greater similarity between predicted SpDs and manually annotated layers. Among the algorithms tested, BayesSpace most accurately identified SpDs consistent with histological cortical layers, and its performance improved with Harmony v0.1.0 batch correction (52) (Fig S8, Fig S9). We describe individual clustering methods below:

Graph-based
We performed graph-based clustering using buildSNNGraph() from scran and the Walktrap method implemented by igraph (17)  SpaGCN(), set_l() and train(), was repeated until seven SpDs were found, as even with random seeds set, the process was not deterministic and was not guaranteed to find seven SpDs.
Identification of data-driven k for unsupervised clustering To identify an optimal number of clusters for BayesSpace SpDs, we used a scalable discordance-based internal validity metric, fasthplus (59), to assess clustering performance for different values of k (Fig S12). Using fasthplus version 1.0, we calculated H + and then plotted 1-H + to identify the second inflection point of the plot as the point at which H + started to stabilize.
Using this data-driven approach, we identified k=16 as the optimal number of clusters.
We then identified differentially expressed genes using three models, as in our previous work (6) while adjusting for position in the anterior-posterior axis of the DLPFC, age, and sex.
First, the ANOVA model used registration_stats_anova() from spatialLIBD and powered by limma (62)  and 9,535 unique genes having at significant differences (FDR < 0.05) in at least one pair( Table S9).

Spatial registration of Spatial Domains
We validated and enhanced our understanding of the BayesSpace (7) SpDs by comparing them to the manual annotations of the pilot dataset (6) through a method of "spatial registration". We accessed the enrichment modeling results from Maynard et al. with spatialLIBD v1.11.4 fetchData(type = "modeling_results"). To compare the gene enrichments between the SpDs (Supplemental Methods: layer-level data processing and differential expression modeling) and the manual annotations, we calculated a Pearson's correlation between the enrichment t-statistics for each SpD and each manually annotated layer pair using only the union of the top 100 genes from each manual layer with spatialLIBD layer_stat_cor(model = "enrichment", top_n=100) (Fig. 1E, Fig. 2B).
Next to annotate SpDs with their best layer identity or hybrid-layer identity, we used annotate_registered_clusters(cutoff_merge_ratio = 0.1). This function first finds the top correlation value for each SpD, if it's greater than the confidence_threshold of 0.25, the SpD's annotation has a good confidence (if not it is assigned poor confidence).
Then the proportion of the difference between the top correlation and next highest correlation is compared to the cutoff_merge_ratio, if the proportion is lower the new layer is added to the list of associated layers. This process repeats until the ratio is higher than the cutoff as shown in this equation: (cor current -cor next ) / cor next < cutoff_merge_ratio.
This histological layer association is denoted for a given Sp k D d by adding "~L" (e.g. Sp 09 D 07~L 6). Note that this SpD may not be identical to a histological layer as cytoarchitecture was not used to define layer boundaries. Our goal here was to highlight the most strongly associated histological layer to aid in biological interpretation.
Anatomical-level data processing and differential expression modeling Using the same strategy and pseudo-bulked data as for the spatial domain differential expression analysis (Supplemental Methods: Layer-level data processing and differential expression modeling), we identified differentially expressed genes using three models while adjusting for Sp 9 Ds, age, and sex. First, the ANOVA model used registration_stats_anova() from spatialLIBD to test for differences in mean expression across positions in the anterior-posterior axis of the DLPFC and identified 1,220 (9.98%) DEGs (Table S10) snRNA-seq data collection and sequencing Eighteen of the 30 tissue blocks were selected for single nucleus RNA-sequencing (snRNA-seq); the two "morphologically best" blocks were selected for each donor.
"Morphologically best" was defined by: inclusion of all cortical layers, white matter, and few surface defects according to the layer definitions from the Visium data. Each selected block was sectioned using a Leica CM3050 at 100µm. Ten, 100µm sections were collected in lo-bind tubes for each respective sample, and these tissue sections were stored at -80°C until time of experiment. Nuclei preparations were conducted according to the 10x Genomics customer-developed "Frankenstein" nuclei isolation protocol as previously described in (63) computeDoubletDensity(). This metric was later assessed at the cluster level (Fig S19A).
Following quality control, a total of 77,604 (2,661 to 5,911 per sample) nuclei across 19 tissue blocks from 10 donors were included in the study (Fig S18, Table S11).
To perform dimension reduction, we first selected the top 2000 deviant genes from the raw count values with scry 1.6.0 devianceFeatureSelection() (67). Next, reduced dimensions were calculated with generalized linear models principal component analysis (GLM-PCA) (67), with scry nullResiduals() and scater 1.22.0 runPCA() (51). Initial reduced dimension plots showed evidence for batch effect (Fig S20A, Fig S21A). To correct for batch effects, we applied Harmony 0.1.0 RunHarmony() (52) to the GLM PC's with "Sample'' as the group variable as it is a subset of both sequencing batch and individual. This returned corrected Harmony PCs, which upon visual inspection seemed to successfully correct for differences across different variables (Fig S20B, Fig S21B, Fig S22).
Next, we followed the hierarchical clustering procedure outlined in (63) in the batch corrected PC space, we performed graph-based clustering with scran v1.22.0 buildSNNGraph() with k=20 and igraph 1.3.1 cluster_walktrap() to identify 296 preliminary nuclei clusters. After pseudo-bulking over the preliminary clusters, hierarchical clustering (hc) revealed 29 clusters. The clusters were annotated for broad cell type identity using established marker genes (63,68). When multiple cell types appeared to belong to the same broad cell type category, cell types were appended with a number (e.g. Excit_01) (Fig S23). One large cluster of cells did not have a clear cell type identity, and may have been driven by high mitochondrial rates and low UMIs (Fig S19B). Examining the marker expression of this hc cluster at the preliminary cluster level showed that two out of six preliminary clusters had high expression for Oligo marker genes, and were therefore re-classified as Oligos. The other four preliminary clusters could not be assigned a clear cell type, so they were classified as a 30th "Ambiguous" cluster and excluded (Fig. 3A) resulting in a set of 56,447 nuclei. Doublets did not drive clustering in this dataset as no clusters had particularly high median doublet scores (Fig S19A).

snRNA-seq spatial registration
To add anatomical context to our cell populations, we performed spatial registration on the 29 hc cell type clusters with several functions from spatialLIBD v1.9.19. The enrichment statistics were calculated with registration_wrapper() with position, age, and sex used as covariates.
These statistics were correlated against the manually annotated histological layers from Maynard et al., as well as the Sp 9 D and Sp 16 D BayesSpace clusters with layer_stat_cor(), then annotated with annotate_registered_clusters()to assign the most strongly correlated histological layer or SpD. For Sp 9 D and Sp 16 D BayesSpace clusters, a stricter cutoff_merge_ratio = 0.1 was used to identify more specific SpDs annotations (Fig. 3B, see details in Supplemental Methods: Spatial registration of Spatial Domains). The histological layer assignments based on (6) were used to create a "layer level" cell type annotation for excitatory neuron clusters (e.g. Excit_L3). Thirteen, excitatory neuron clusters were grouped by their layer level assignments to seven layer resolution populations, three excitatory neurons clusters had poor annotation confidence and were classified as spatially ambiguous and excluded from the downstream analysis. Other non-excitatory cell types were grouped at broad resolution (Fig. 3C, Fig S24, Table S2). The spatially annotated set of 54,394 (921 to 5043 per sample) nuclei across 19 tissue blocks from 10 donors were used in downstream analysis.

snRNA-seq Azimuth validation
To support community efforts in using a common cell type nomenclature (69), we also took an alternative clustering approach and used the reference-based mapping tool Azimuth . We showed that a majority of our hierarchical cluster-and layer-resolution snRNA-seq clusters strongly correlate with those from Azimuth, with the exception of L4-associated excitatory neuron clusters (Fig S25), which is consistent with the absence of L4 in the agranular motor cortex reference dataset used by Azimuth.

Spot deconvolution benchmarking
Given that individual Visium spots often contain multiple cell types, we performed spot deconvolution to better understand the cellular composition of spots mapping to unsupervised spatial domains. While several algorithms have been developed to predict cell type proportions within individual Visium spots using single cell reference data (8,22,23,70,71), they have not yet been comprehensively benchmarked in brain tissues due to lack of a suitable reference dataset containing 1) paired Visium and snRNA-seq data from the same donors, 2) known cell type abundances in each Visium spot, 3) manually annotated Visium regions enriched in specific cell types. Therefore, we generated the first gold standard spot deconvolution dataset for 4 broad cell types in postmortem human DLPFC using the Visium Spatial Proteogenomics assay (hereafter referred to as Visium-SPG), which replaces H&E histology with immunofluorescence staining to label proteins of interest with fluorescent dyes (Fig. 4A). We selected 4 out of the 30 tissue blocks from independent donors and performed immunofluorescent staining for established cell type-specific proteins, including NEUN (marking neurons), OLIG2 (marking oligodendrocytes), GFAP (marking astrocytes), and TMEM119 (marking microglia). Following multispectral fluorescence imaging (Fig S26), we proceeded with the standard Visium protocol to generate corresponding gene expression libraries from the same tissue section (Fig. 4B, Supplemental Methods: Visium Spatial Proteogenomics (SPG) data generation).
Next, we benchmarked 3 recently published spot deconvolution algorithms: SPOTlight, Tangram, and Cell2location (8,22,23) using our Visium-SPG and snRNA-seq data. First, we identified a set of 25 robustly expressed marker genes for snRNA-seq clusters at both broad and layer-level resolution. For both the broad and layer-level cell-type resolutions, the get_mean_ratio2() function from the DeconvoBuddies v0.99.0 R package was applied to rank each gene in the snRNA-seq data in terms of its suitability as a marker for each possible target cell type. This method, which we call "mean ratio", determines the ratio of expression between a target cell type and the next-highest-expressing cell type. This is contrasted with a more conventional one-versus-all marker-finding strategy comparing the logarithm of the expression for a target cell type with the logarithm of the mean taken across all other cell types, an approach we denote "std.logFC" (Fig S27). Mitochondrial genes were excluded from the ranking process. The top-25-ranked markers by mean ratio were taken for each cell type, and it was verified the mean ratio exceeded 1 for each gene, to ensure greater expression in the target cell type than all others. This resulted in a total of 150 broad-resolution markers and 300 layer-level markers, used downstream for spot deconvolution (Fig. 3D, Table S12). We verified that Visium-SPG gene expression data reproduced expected laminar gene expression patterns (Fig S28). To confirm the utility of the snRNA-seq-derived marker genes for spot deconvolution, we evaluated their proportion of nonzero expression in Visium data (Fig. 4B, Fig S29) and confirmed that 25 genes per cell type were sufficient to identify laminar patterns (Fig S30).
Using L5 excitatory neurons as an example, we found consistent spatial localization between L5 marker gene PCP4 and the top 25 Excit_L5 marker genes identified in snRNA-seq data ( Fig. 4B, Fig S29).

Applying spot deconvolution softwares
Having verified the selection of robust marker genes for each cell type, we then ran SPOTlight were generated without any subsetting. This is consistent with the recommendations of the authors to use more than 100 cells for data with finer cell types, as was the case for our layer-level resolution data. Tangram was provided total cell counts per spot derived through segmentation with Cellpose v2.0 (54,72) (Supplemental Methods: Image segmentation and processing), while Cell2location was provided an average count per spot, taken across all four sections. Importantly, these algorithms are not constrained to exactly match these input counts, and distort their values to differing extents, which is reflected in their outputs of cell type abundances (Fig S31). While Tangram and Cell2location predict cell-type abundances, SPOTlight predicts cell-type proportions, and is not subject to this distortion. Therefore, to more directly compare SPOTlight to Tangram and Cell2location, we multiplied the cell-type proportions outputted by SPOTlight by the Cellpose-derived counts to obtain abundances.

Evaluating performance of spot deconvolution methods
To quantify algorithm performance, we took 2 complementary approaches: 1) evaluating the localization of laminar cell types to their expected cortical layer and 2) comparing predicted cell type counts to actual counts obtained from immunofluorescent images. First, spots were manually assigned to L1-L6 or the WM using a combination of marker gene expression and cytotectonic architecture (Fig. 4C, Fig S32). Then for each cell type, we calculated the average predicted counts across manually annotated cortical layers. As expected, both Tangram and Cell2location show the highest predicted counts for L5 excitatory neurons in spots manually annotated to L5 (Fig. 4C). Across all cell types, Tangram and Cell2location showed similar layer-mapping performance; however, Tangram made more conservative predictions compared to Cell2location and performed more consistently across broad and layer level resolutions (Fig S32, Fig S33).
Next, using the immunofluorescent images, we calculated the actual number of neurons, astrocytes, microglia, and oligodendrocytes per spot by segmenting individual nuclei (72) and implementing a classification and regression tree (CART) approach (73) in scikit-learn (56) to categorize nuclei into the 4 immunolabeled cell types (Fig S34) (Supplemental Methods: Image segmentation and processing). We then compared the predicted cell type counts per spot from these softwares to the CART-calculated number of immunolabeled cells for both broad and layer level resolutions (Fig. 4D, Fig S35, Fig S36, Table S13). Counts for each software tool and CART-quantifiable cell type were summed across each Visium-SPG section. Totals for each software method (gene expression) were compared against CART-predicted (immunofluorescence) totals using Pearson correlation and root mean squared error (RMSE) (Fig S37A). Next, counts for each particular software tool and predicted by the CART were normalized to add to 1 across the four cell types. This allowed the calculation of the Kulback-Lieber divergence (KL divergence) from each software tool's predictions to the CART predictions. This treats the CART-predicted cell-type composition as a ground-truth probability distribution that each software tool is attempting to estimate. Given the challenge of deconvolving transcriptionally fine cell types, all softwares generally performed better (higher correlation, lower RMSE, and lower KL-divergence) at broad resolution compared to layer-level resolution. However, Tangram showed the most consistent performance at both broad and layer level resolution across all cell types and samples (Fig. 4E). This can at least partially be explained by the fact that Tangram spatially maps individual cells to perform deconvolution, while Cell2location and SPOTlight compute gene expression profiles for each cell type, which are then used for deconvolution. Since the same single-cell reference was used for deconvolution across all samples, Tangram was constrained to fix the proportions of cell types across sections, while Cell2location was able to more accurately capture dynamic changes in cell type proportions (Fig S39).  (14). For genes with a reported FDR < 0.05, we performed an enrichment analysis using BayesSpace Sp 9 Ds with spatialLIBD v1.11.4 gene_set_enrichment() (Fig. 6C). We performed the same enrichment analysis on differentially expressed genes for post traumatic stress disorder (PTSD) and major depressive disorder (MDD) on bulk RNA-seq reported through personal communication (PEC study 6, Fig. 6).

SCZ risk ligand-receptor occurrence
A SCZ risk gene list was obtained from the OpenTargets platform (https://www.opentargets.org/) under keyword "Schizophrenia." (24) The top resulting list was filtered to only include genes with genetic association score > 0.1. To compare the occurrence of ligands and receptors in SCZ risk genes, we obtained a reference list of brain-expressed genes from GTEx data (https://storage.googleapis.com/gtex_analysis_v7/rna_seq_data/GTEx_Analysis_2016-01-15_v 7_RNASeQCv1.1.8_gene_median_tpm.gct.gz). To create the brain-expressed gene list, tissues were filtered to those starting with 'Brain' and genes were filtered to have expression greater than zero. ENSEMBL genes were converted to symbol genes using the mygene library.
Reference data for ligand or receptor classification assignment was obtained from the Omnipath database of interactions (25), importing only genes with categories 'ligand' or 'receptor'. A randomized set of genes equal in size to the number of SCZ risk genes obtained from Open Targets was sampled from the brain-expressed reference gene list. Ligand and receptor incidence was determined using the ligand/receptor reference data. This process was performed iteratively 10,000 times, with % incidence of ligands and receptors visualized as histograms. p-values were calculated by dividing the number of bootstrapped values greater than the inquiry value by the total number of statistics.

Data-driven cell-cell communication analysis of snRNA-seq data
Independently, we performed cell-cell communication analysis using all available normalized and annotated (at layer level resolution) snRNA-seq data via LIANA v0.1.8 (12).
Ligand-receptor (LR) analysis was performed using liana_wrap(). We used liana_aggregate() to find consensus ranks of different methods. Interesting LR pairs were selected with the threshold aggregated_rank<=0.01. Complex heatmap and chord diagram (74,75) were used to display communication frequencies between cell type pairs (Fig. 5A) and cell communication pattern between EFNA5 and EPHA5 (Fig. 5B) respectively.
Cell specificity assessment of targets Gene expression of EFNA5, EPHA5 and FYN was visualized using the scanpy implementation for plotting dotplot (Fig. 5C). Cell specificity statistics in the preprocessed and annotated snRNA-seq data were calculated with the tspex python library. Two statistics were calculated: the tau statistic for determining whether a gene was specifically expressed, and the SPM statistic to identify the cell types gene expression was likely to be specific to. These results were visualized as clustermaps (Fig S40B-C).

snRNA-seq intracellular co-expression
For interactions predicted to occur intracellularly, we estimated the percent of nuclei co-expressing (raw counts > 1) both interactors of interest per cell type. These values were then visualized in the form of a clustermap (Fig. 5D).

Spatial co-expression
Preprocessed and annotated snRNA-seq and Visium data were used to verify co-expression of intracellular interactors of interest. Any nucleus (snRNA-seq) or spot (Visium) with raw counts > 0 was considered to be an expressor of a target of interest. When both the ligand and receptor had raw counts > 0, they were considered to be co-expressed in that nucleus or spot.

Cellular neighborhood network analysis
Spatial colocalization network analysis was performed using the deconvoluted spatial data for both Tangram and Cell2location results. In this analysis, the top cell types (Fig. 5G) with most likely contributions to a given spot were assumed to be co-localized in that spot. Co-localization relationships were scored in an adjacency matrix for spots where both the ligand and receptor were expressed. The adjacency matrix values were then divided by the absolute sum of the matrix. This was also performed for spots where neither the ligand nor the receptor were expressed. We took the ratio between both adjacency matrices and visualized this as a heatmap (Fig. 5I, Fig S40 D-E, H-I).
Spatial registration of PsychENCODE and other external snRNA-seq datasets We leveraged the uniformly processed snRNA-seq data from eight PsychENCODE consortium identical to using registration_wrapper() but was done piece wise to maximize efficiency as this analysis was completed with the full dataset (Fig S42) and a dataset filtered to include only neurotypical controls (Fig. 6A).
Enrichment statistics for the snRNA-seq dataset was completed with registration_wrapper(), after filtering to only neurotypical controls (14).
These statistics were correlated against the manually annotated histological layers from (6), as well as the Sp 9 D and Sp 16 D BayesSpace clusters with layer_stat_cor() and then annotated with annotate_registered_clusters(). For Sp 9 D and Sp 16 D BayesSpace clusters, a stricter cutoff_mere_ratio = 0.1 was used to identify more specific spatial domain annotations (Fig. 6A-B).

Statistics
Statistical tests were performed with R versions 4.1.1 to 4.2.2 with Bioconductor versions 3.14, 3.15, and 3.16.0, as well as python versions 3.8.12 to 3.10.8. In addition to software versions listed in the Supplemental Methods, R session information was recorded with sessioninfo::session_info() for many specific analyses on log files or scripts themselves and is available on GitHub and Zenodo (Data and materials availability).
Supplementary Table Legends   Table S1. Demographics of donors included in the study, data generation metrics, and SpaceRanger metrics for Visium (n=30) and Visium Spatial Proteogenomics (Visium-SPG, n = 4) slides. Table S2. snRNA-seq spatial registration correlation values and annotations. Table S3. Summary of ligand-receptor pairs detected in data-driven cell-cell communication analysis .  Table S4. Summary of ligand-receptor pairs associated with SCZ genetic risk association. Table S5. Summary statistics from t-tests on gene expression from artifact vs. non-artifact regions. Table S6. P-values of differences in cell-type correlation across artifact vs. non-artifact regions. Table S7. DEGs (FDR < 5%) for ANOVA model at broad and fine resolution for both Sp09 and Sp16 resolutions. Table S8. DEGs (FDR < 5%) for the enrichment model at broad and fine resolution for both Sp09 and Sp16 resolutions. Table S9. DEGs (FDR < 5%) for the pairwise model at broad and fine resolution for both Sp09 and Sp16 resolutions. Table S10. DEGs (FDR < 5%) across Ant, Mid, Post positions for the anova, enrichment, and pairwise statistical models at the Sp09 resolution. Table S11. snRNA-seq sample and quality control details. Table S12. snRNA-seq cell type marker gene statistics.