Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics

Currently there are no effective antifibrotic therapies for liver cirrhosis, a major killer worldwide. To obtain a cellular resolution of directly-relevant pathogenesis and to inform therapeutic design, we profile the transcriptomes of over 100,000 primary human single cells, yielding molecular definitions for the major non-parenchymal cell types present in healthy and cirrhotic human liver. We uncover a novel scar-associated TREM2+CD9+ macrophage subpopulation with a fibrogenic phenotype, that has a distinct differentiation trajectory from circulating monocytes. In the endothelial compartment, we show that newly-defined ACKR1+ and PLVAP+ endothelial cells expand in cirrhosis and are topographically located in the fibrotic septae. Multi-lineage ligand-receptor modelling of specific interactions between the novel scar-associated macrophages, endothelial cells and collagen-producing myofibroblasts in the fibrotic niche, reveals intra-scar activity of several major pathways which promote hepatic fibrosis. Our work dissects unanticipated aspects of the cellular and molecular basis of human organ fibrosis at a single-cell level, and provides the conceptual framework required to discover rational therapeutic targets in liver cirrhosis.

endothelial cells; and (4) key ligand-receptor interactions between novel scarassociated macrophages, endothelial subpopulations and collagen-producing myofibroblasts in the fibrotic niche. Thus, we have simultaneously identified a series of intra-scar fibrogenic pathways which represent hitherto unsuspected therapeutic targets for the treatment of liver fibrosis, whilst demonstrating the applicability of scRNA-seq to define pathogenic mechanisms for other human fibrotic disorders.

Single cell atlas of human liver non-parenchymal cells
Hepatic NPC were isolated from fresh healthy and cirrhotic human liver tissue spanning a range of aetiologies of cirrhosis (Fig. 1a, Extended Data Fig. 1a). Importantly, to minimise artefacts 11 , we developed a rapid tissue processing pipeline, obtaining fresh non-ischaemic liver tissue taken by wedge biopsy prior to the interruption of the hepatic vascular inflow during liver surgery or transplantation, and immediately processing this for FACS. This enabled a workflow time of under three hours from patient to singlecell droplet encapsulation (Methods).
We used an unbiased approach, FACS sorting viable single cells from liver tissue into broad leucocyte (CD45 + ) or other NPC (CD45 -) fractions (Extended Data Fig. 1b), prior to scRNA-seq. To facilitate discrimination between liver-resident and circulating leucocytes, we also performed scRNA-seq on CD45 + CD66bperipheral blood mononuclear cells (PBMCs) (Extended Data Fig. 1c, f). In total, we analysed 67,494 human cells from healthy (n=5) and cirrhotic (n=5) livers, 30,741 PBMCs from cirrhotic patients (n=4) and compared our data with a publicly-available reference dataset of 8,381 PBMCs from a healthy donor.
Tissue cells and PBMCs could be partitioned into 21 distinct clusters, which we visualized using t-distributed stochastic neighbourhood embedding (t-SNE) (Extended Data Fig. 1d). Clusters were annotated using signatures and integrating with known lineage markers (Extended Data Fig. 1e; signature gene lists available in Supplementary   Table 1). All PBMC datasets contained the major blood lineages, with excellent reproducibility between samples (Extended Data Fig. 1g, h). To generate an atlas of liver-resident cells, contaminating circulating cells were removed from the liver tissue datasets, by excluding individual cells from the tissue samples which mapped transcriptionally to blood-derived clusters 1 and 13 (Extended Data Fig. 1d).
Re-clustering the 66,135 liver-resident cells revealed 21 clusters (Fig. 1b), each containing cells from both healthy and cirrhotic livers (Fig. 1c). Gene signature analysis enabled annotation of each cluster by major cell lineage (Fig. 1d To further annotate the 11 T cell and ILC clusters (36,900 cells from 10 livers) we assessed expression of known markers (Extended Data Fig. 3c) and computationally identified differential marker genes (Extended Data Fig. 3d, Supplementary Table 4).
We also performed imputation of gene dropouts, which enhanced detection of discriminatory marker genes for each cluster but did not yield additional T cell or ILC subpopulations (Extended Data Fig. 3e, Supplementary Table 5). All T cell and ILC clusters expressed tissue residency markers CD69 and CXCR4. Clusters 1 and 2 were CD4 + T cells, with CD4 + T cell (2) expanding significantly in cirrhotic livers (Extended Data Fig. 3a, b, e) and expressing SELL and CCR7, indicating an expansion of memory CD4 + T cells in liver cirrhosis. Sparse expression of FOXP3, RORC, IL17A and IFNG in both CD4 + T cell subpopulations suggested the presence of Tregs, Th17 and Th1 cells in these clusters. Clusters 3, 4 and 5 were CD8 + T cells, with features of effector T cells expressing GZMA, GZMH and IFNG. Two resident CD56 bright IL7R -NK cell clusters were defined (NK cell (1) and NK cell (2)), as well as a distinct cytotoxic CD56 dim NK cell population (cNK), with specific expression of FCGR3A and GZMB.
No expansion of these populations was observed in cirrhotic livers.
We provide an interactive gene browser freely-available online ((http://www.livercellatlas.mvm.ed.ac.uk), to allow assessment of individual gene expression both in all human liver NPC and in specific lineages, comparing healthy versus cirrhotic livers.

Distinct macrophage subpopulations inhabit the fibrotic niche
Macrophages are critical to tissue homeostasis and wound-healing 12  Strikingly, clusters MP(4) and MP (5), named SAM(1) and SAM(2) respectively, were expanded in cirrhotic livers (Fig. 2b), a finding that was confirmed by quantification of the MP cell composition of each liver individually (Fig. 2c), and reproduced in all cirrhotic livers irrespective of liver disease aetiology.
To enable MP cell annotation, we initially assessed expression of known MP marker genes (Extended Data Fig. 4a), classifying clusters MP (8) and MP (9) as conventional dendritic cells, cDC2 and cDC1 respectively, based on CD1C and CLEC9A specificity.
Clusters MP (6) and MP (7)  Scar-associated clusters SAM(1) and SAM (2), expanded in diseased livers and expressed the unique markers TREM2 and CD9 (Fig. 2d, e). These newly-defined macrophages displayed a hybrid phenotype, with features of both tissue monocytes and KCs (Fig. 2d, Fig. 4k). These data highlight the potential role of local macrophage proliferation in driving the accumulation of SAMs in the fibrotic niche of human chronic liver disease.

Fibrogenic phenotype of scar-associated macrophages
To delineate the functional profile of SAMs we generated self-organising maps using the SCRAT package, visualising co-ordinately expressed gene groups across the MP subpopulations. This created a landscape of 3600 metagenes on a 60x60 grid and highlighted 44 metagene signatures overexpressed in the MP lineage (Fig. 3a).
Mapping the nine MP cell clusters to this landscape (Extended Data Fig. 5a Fig. 4g-i) and a distinct topographical distribution from KCs (Fig. 2j), suggesting they may represent monocyte-derived cells.
To computationally assess the origin of these scar-associated macrophages, we performed in silico trajectory analysis on a combined dataset of peripheral blood monocytes and liver-resident MPs. We visualised the transcriptional profile of these cells using a diffusion map, mapped them along a pseudotemporal trajectory (using the monocle R package) and interrogated their directionality via spliced and unspliced mRNA ratios (RNA velocity 33,34 ) (Fig. 3c). These analyses suggested a branching differentiation trajectory from peripheral blood monocytes into either scar-associated macrophages or cDCs (Fig. 3c). Additionally, applying RNA velocity indicated a lack of differentiation from KCs to scar-associated macrophages, and no progression from scar-associated macrophages to KCs (Fig. 3c).
To further investigate the pseudotemporal relationship between SAMs and KCs, we visualised the combined blood monocyte and liver-resident MP dataset using a UMAP, and performed additional RNA velocity analyses 33,34 (Fig. 3d) Table 9). These data highlight that SAMs acquire a specific fibrogenic phenotype during differentiation from circulating monocytes.
To identify potential transcriptional regulators of SAMs we used the SCENIC package to define sets of genes co-expressed with known transcription factors, termed regulons.
We assessed the cell activity score for differentially-expressed regulons along the tissue monocyte-macrophage pseudotemporal trajectory and in KCs, allowing visualisation of regulon activity across liver-resident macrophage subpopulations (Extended Data Fig. 5f, g, Supplementary  Fig.   5f, g), both of which have been associated with modulation of macrophage phenotype and tissue fibrosis [40][41][42][43] .
In summary, multimodal computational analysis suggests that TREM2 + CD9 + scarassociated macrophages derive from the recruitment and differentiation of circulating monocytes and display a fibrogenic phenotype.
Immunohistochemical quantification of ACKR1 and PLVAP confirmed expansion of these populations in cirrhotic livers, with clear localization within fibrotic septae (Fig.   4f). Scar-associated endothelial cells displayed enhanced expression of the ELK3 regulon (Fig. 4g), a transcription factor known to modulate angiogenesis 46 . Metagene signature analysis found that Endo(6) (SAEndo(1)) cells expressed fibrogenic genes including PDGFD, PDGFB, LOX, LOXL2 and several basement membrane components [47][48][49] ; associated significant ontology terms included extracellular matrix organization and wound healing (signature A; Extended Data Fig. 6c-e). Endo (7) (SAEndo(2)) cells displayed an immunomodulatory phenotype (signature B; Extended Data Fig. 6c-e). Furthermore, the most discriminatory marker for this cluster, ACKR1, is restricted to venules in mice 50 and has a role in regulating leucocyte recruitment via transcytosis of chemokines from the abluminal to luminal side of blood vessels 51 .
The arterial identity of this cluster was further indicated by SOX17 regulon expression 57

Resolving the multi-lineage interactome in the fibrotic niche
To investigate how the newly-defined scar-associated macrophage and endothelial cell subpopulations regulate fibrosis, we assessed interactions between these cells and the myofibroblasts, the key scar-producing cells within the fibrotic niche 59,60 .
To interrogate potential ligand-receptor interactions between these scar-associated macrophage, endothelial and mesenchymal cells, we used CellPhoneDB, a new repository of curated ligand-receptor interactions integrated with a statistical framework. We calculated statistically significant ligand-receptor pairs, based on expression of receptors by one lineage and ligands by another, using empirical shuffling 61 . We focussed further analysis on differentially expressed interactions between pairs of lineages spatially located in the fibrotic niche.
Numerous statistically significant potential interactions were detected between ligands and cognate receptors expressed by scar-associated macrophages, scar-associated endothelial cells and myofibroblasts within the fibrotic niche ( Fig. 5g, h, Extended Data  Table 15).
Myofibroblasts display a potent immunoregulatory profile, producing cytokines and chemokines such as CCL2, which regulates monocyte-macrophage recruitment and phenotype (Fig. 5i, m). We detected highly significant interactions between CXCL12 produced by myofibroblasts and CXCR4 expressed by both scar-associated macrophages and endothelial cells (Fig. 5i), confirmed at protein level within the fibrotic septae (Extended Data Fig. 7d). CXCR4 signaling in both macrophages and endothelia have been associated with promoting tissue fibrosis in mice 44 In summary, our unbiased dissection of the key ligand-receptor interactions between novel scar-associated macrophages, endothelial subpopulations and collagenproducing myofibroblasts in the fibrotic niche reveals several major pathways which promote hepatic fibrosis (Fig. 6). Therapeutic targeting of these intra-scar pathways, represents a rational approach for the discovery of novel anti-fibrotic treatments for patients with chronic liver disease.

Discussion
The fibrotic niche has not previously been defined in human liver. Here, using scRNAseq and spatial mapping, we resolve the fibrotic niche of human liver cirrhosis, identifying novel pathogenic subpopulations of TREM2 + CD9 + fibrogenic macrophages, ACKR1 + and PLVAP + endothelial cells and PDGFRα + collagenproducing myofibroblasts. We dissect a complex, pro-fibrotic interactome between multiple novel scar-associated cells, and identify highly relevant intra-scar pathways that are potentially druggable. This multi-lineage single cell dataset of human liver cirrhosis should serve as a useful resource for the scientific community, and is freely available for interactive browsing at http://www.livercellatlas.mvm.ed.ac.uk.
Despite significant progress in our understanding of the molecular pathways driving liver fibrosis in rodent models, a lack of corollary studies in diseased human liver tissue has hindered translation into effective therapies, with currently no FDA or EMAapproved anti-fibrotic treatments available. Our multi-lineage ligand-receptor analysis demonstrates the complexity of interactions within the fibrotic niche, highlighting why current approaches to treat human liver fibrosis have proven so intractable, and provides a conceptual framework for more rational studies of anti-fibrotic therapies in both preclinical animal models and translational systems such as human liver organoid cultures 5,70,71 . Further, this unbiased multi-lineage approach should inform the design of combination therapies which will very likely be necessary to achieve effective antifibrotic potency 5,6 .
Macrophages and endothelial cells are known to regulate liver fibrosis in rodent models 13,19,23,44,45 . However, little is known regarding the heterogeneity and precise molecular definitions of these cell types in human liver disease. Our data demonstrates both the accumulation of discrete monocyte-derived macrophage and endothelial cell populations in the fibrotic niche of cirrhotic livers, but also the persistence of spatially distinct, non-scar associated resident Kupffer cells and liver sinusoidal endothelial cells. This single-cell approach has important implications for therapy development; facilitating specific targeting of pathogenic cells without perturbing homeostatic function.
In this era of precision medicine, where molecular profiling guides the development of highly targeted therapies, we used scRNA-seq to resolve the key non-parenchymal cell subclasses inhabiting the fibrotic niche of human liver cirrhosis. Application of our novel scar-associated cell markers could potentially inform molecular pathology-based patient stratification, which is fundamental to the prosecution of successful anti-fibrotic clinical trials. Our work illustrates the power of single-cell transcriptomics to decode the cellular and molecular basis of human organ fibrosis, providing a conceptual framework for the discovery of relevant and rational therapeutic targets to treat patients with a broad range of fibrotic diseases.

Study subjects
Local approval for procuring human liver tissue and blood samples for scRNA-seq, flow cytometry and histological analysis was obtained from the NRS BioResource and

Tissue processing
For liver scRNA-seq and flow cytometry analysis, a wedge biopsy of non-ischaemic fresh liver tissue (2-3 grams) was obtained by the operating surgeon, prior to interruption of the hepatic vascular inflow. This was immediately placed in HBSS (Gibco) on ice. The tissue was then transported directly to the laboratory and dissociation routinely commenced within 20 minutes of the liver biopsy. To enable paired histological assessment, a segment of each liver specimen was also fixed in 4% neutral-buffered formalin for 24 hours followed by paraffin-embedding. Additional liver samples, obtained via the same method, were fixed in an identical manner and used for further histological analysis. For cell sorting to assess secreted mediator production, explanted diseased liver tissue (40 grams) was used from patients undergoing orthotopic liver transplantation.
Following heat-mediated antigen retrieval in pH6 sodium citrate (microwave; 15 minutes), slides were washed in PBS and incubated in 4% hydrogen peroxide for 10 minutes. Slides were then washed in PBS, blocked using protein block (GeneTex, Brightfield and fluorescently-stained sections were imaged using the slide scanner AxioScan.Z1 (Zeiss) at 20X magnification (40X magnification for smFISH). Images were processed and scale bars added using Zen Blue (Zeiss) and Fiji software 72 .

Cell counting and image analysis
Automated cell counting was performed using QuPath software 73

Preparation of single-cell suspensions
For liver scRNA-seq, human liver tissue was minced with scissors and digested in For both liver flow cytometry analysis and cell sorting to assess secreted mediator production, single-cell suspensions were prepared as previously described, with minor modifications 75  BSA; Sigma-Aldrich) and centrifugation was repeated. Samples were then blocked in 10% human serum (Sigma-Aldrich, H4522) in staining buffer on ice for 30 minutes.
Cells were then resuspended in staining buffer and passed through a 35μm filter prior to antibody staining.

Flow cytometry and cell sorting
Incubation with primary antibodies was performed for 20 minutes at 4°C. All antibodies, conjugates, lot numbers and dilutions used in this study are presented in Supplementary

Pre-processing scRNA-seq data
We aligned to the GRCh38 reference genome, and estimated cell-containing partitions and associated UMIs, using the Cell Ranger v2.1.0 Single-Cell Software Suite from 10X Genomics. Genes expressed in fewer than three cells in a sample were excluded, as were cells that expressed fewer than 300 genes or mitochondrial gene content >30% of the total UMI count. We normalised by dividing the UMI count per gene by the total UMI count in the corresponding cell and log-transforming. Variation in UMI counts between cells was regressed according to a negative binomial model, before the resulting value was scaled and centred by subtracting the mean expression of each gene and dividing by its standard deviation (En), then calculating ln(10 4 *En+1).

Dimensionality reduction, clustering, and DE analysis
We performed unsupervised clustering and differential gene expression analyses in the Seurat R package v2.3.0 76 . In particular we used SNN graph-based clustering, where the SNN graph was constructed using between 2 and 11 principal components as determined by dataset variability shown in principal components analysis (PCA); the resolution parameter to determine the resulting number of clusters was also tuned accordingly. To assess cluster similarity we used the BuildClusterTree function from Seurat.
In total, we present scRNA-seq data from ten human liver samples (named Healthy 1- Initially, we combined all scRNA-seq datasets (liver and blood) and performed clustering analysis with the aim of isolating a population of liver-resident cells, by identifying contaminating circulatory cells within datasets generated from liver digests and removing them from downstream analysis. Specifically, we removed from our liver datasets cells that fell into clusters 1 and 13 of the initial dataset in Extended Data Fig.   1d.
Using further clustering followed by signature analysis, we interrogated this postprocessed liver-resident cell dataset for robust cell lineages. These lineages were isolated into individual datasets, and the process was iterated to identify robust lineage subpopulations. At each stage of this process we removed clusters expressing more than one unique lineage signature in more than 25% of their cells from the dataset as probable doublets. Where the cell proliferation signature identified distinct cycling subpopulations, we re-clustered these again to ascertain the identity of their constituent cells.
All heatmaps, t-SNE and UMAP visualisations, violin plots, and dot plots were produced using Seurat functions in conjunction with the ggplot2, pheatmap, and grid R packages. t-SNE and UMAP visualisations were constructed using the same number of principal components as the associated clustering, with perplexity ranging from 30 to 300 according to the number of cells in the dataset or lineage. We conducted differential gene expression analysis in Seurat using the standard AUC classifier to assess significance. We retained only those genes with a log-fold change of at least 0.25 and expression in at least 25% of cells in the cluster under comparison.

Defining cell lineage signatures
For each cell we obtained a signature score across a curated list of known marker genes per cell lineage in the liver (Supplementary Table 1). This signature score was defined as the geometric mean of the expression of the associated signature genes in that cell.
Lineage signature scores were scaled from 0 to 1 across the dataset, and the score for each cell with signature less than a given threshold (the mean of said signature score across the entire dataset) was set as 0.

Batch effect and quality control
To investigate agreement between samples we extracted the average expression profile for a given cell lineage in each sample, and calculated the Pearson correlation coefficients between all possible pairwise comparisons of samples per lineage 77 .

Imputing dropout in T cell and ILC clusters
To impute dropout of low-abundance transcripts in our T cell and ILC clusters so that we might associate them with known subpopulations, we down-sampled to 7,380 cells from 36,900 and applied the scImpute R package v0.0.8 78 , using as input both our previous annotation labels and k-means spectral clustering (k=5), but otherwise default parameters.

Inferring injury dynamics and transcriptional regulation
To generate cellular trajectories (pseudotemporal dynamics) we used the monocle R package v2.6.1 80 . We ordered cells in a semi-supervised manner based on their Seurat clustering, scaled the resulting pseudotime values from 0 to 1, and mapped them onto either the t-SNE or UMAP visualisations generated by Seurat or diffusion maps as implemented in the scater R package v1.4.0 81 using the top 500 variable genes as input.
We removed mitochondrial and ribosomal genes from the geneset for the purposes of trajectory analysis. Differentially-expressed genes along this trajectory were identified using generalised linear models via the differentialGeneTest function in monocle.
When determining significance for differential gene expression along the trajectory, we set a q-value threshold of 1e-20. We clustered these genes using hierarchical clustering in pheatmap, cutting the tree at k=3 to obtain gene modules with correlated gene expression across pseudotime. Cubic smoothing spline curves were fitted to scaled gene expression along this trajectory using the smooth.spline command from the stats R package, and gene ontology enrichment analysis again performed using PANTHER 13.1.
We verified the trajectory and its directionality using the velocyto R package v0.6.0 33 , estimating cell velocities from their spliced and unspliced mRNA content. We For transcription factor analysis, we obtained a list of all genes identified as acting as transcription factors in humans from AnimalTFDB 82 . To further analyse transcription factor regulons, we adopted the SCENIC v0.1.7 workflow in R 83 , using default parameters and the normalised data matrices from Seurat as input. For visualisation, we mapped the regulon activity (AUC) scores thus generated to the pseudotemporal trajectories from monocle and the clustering subpopulations from Seurat. between subpopulations, 2) calculating the proportion of these means that were "as or more extreme" than the actual mean, thus obtaining a p-value for the likelihood of subpopulation specificity for a given ligand-receptor pair, 3) prioritising interactions that displayed specificity to subpopulations interacting within the fibrotic niche.

Statistical Analysis
To assess whether our identified subpopulations were significantly overexpressed in injury, we posited the proportion of injured cells in each cluster as a random count variable using a Poisson process, as previously described 77 . We modelled the rate of detection using the total number of cells in the lineage profiled in a given sample as an offset, with the condition of each sample (healthy vs cirrhotic) provided as a covariate factor. The model was fitted using the R command glm from the stats package. The pvalue for the significance of the proportion of injured cells was assessed using a Wald test on the regression coefficient. This methodology was also applied to assess significant changes in proportions of mononuclear phagocytes between healthy and cirrhotic liver tissue by flow cytometry.
Remaining statistical analyses were performed using GraphPad Prism (GraphPad Software, USA). Comparison of changes in histological cell counts or morphometric pixel analysis between healthy and cirrhotic livers were performed using a Mann-Whitney test (unpaired; two-tailed). Comparison of topographical localisation of counted cells (fibrotic septae vs parenchymal nodule) was performed using a Wilcoxon matched-pairs signed rank test (paired; two-tailed). P-values<0.05 were considered statistically significant.

Data and materials availability
Our expression data will be freely available for user-friendly interactive browsing online http://www.livercellatlas.mvm.ed.ac.uk. CellPhoneDB 61 is available at www.CellPhoneDB.org, along with lists of membrane proteins, ligands and receptors, and heteromeric complexes. All raw sequencing data is available in the Gene Expression Omnibus (GEO accession GSE136103).