Single-cell transcriptomics defines heterogeneity of epicardial cells and fibroblasts within the infarcted heart

In the adult heart, the epicardium becomes activated after injury, contributing to cardiac healing by secretion of paracrine factors. Here we analyzed by single-cell RNA sequencing combined with RNA in situ hybridization and lineage tracing of WT1+ cells the cellular composition, location, and hierarchy of epicardial stromal cells (EpiSC) in comparison to activated myocardial fibroblasts/stromal cells in infarcted mouse hearts. We identified 11 transcriptionally distinct EpiSC populations, that can be classified in three groups each containing a cluster of proliferating cells. Two groups expressed cardiac specification makers and sarcomeric proteins suggestive of cardiomyogenic potential. Transcripts of HIF-1α and HIF-responsive genes were enriched in EpiSC consistent with an epicardial hypoxic niche. Expression of paracrine factors was not limited to WT1+ cells but was a general feature of activated cardiac stromal cells. Our findings provide the cellular framework by which myocardial ischemia may trigger in EpiSC the formation of cardioprotective/regenerative responses.


Introduction 43
Myocardial infarction (MI), still the most frequent cause of death in western societies, is associated 44 with massive activation of cardiac fibroblasts, ultimately resulting in excessive accumulation of 45 extracellular matrix (ECM) components that finally impair cardiac function (1). During 46 development, the majority of cardiac fibroblasts are derived from the epicardium which forms the 47 thin outermost epithelial layer of all vertebrate hearts and exhibits extensive developmental 48 plasticity (2). A subset of epicardial cells undergoes epithelial to mesenchymal transition (EMT) 49 and those epicardial progenitor cells can give rise to various cardiac cell types. In addition to 50 cardiac fibroblasts, this includes vascular smooth muscle cells and pericytes which contribute to 51 the coronary vasculature (2). Embryonic epicardial heterogeneity was recently studied at the 52 single-cell level in the developing zebrafish heart und uncovered three epicardial populations, 53 functionally related to cell adhesion, migration and chemotaxis (3). 54 55 In the adult heart, the epicardium is a rather quiescent monolayer that becomes activated after MI 56 by upregulating embryonic epicardial genes (4,5). In the injured heart, epicardial cells form via 57 EMT a multi-cell layer of epicardial stromal cells (EpiSC) at the heart surface that can reach a 58 thickness of about 50-70 µm in mice (6). It is generally assumed that the activated epicardium 59 recapitulates the embryonic program in generating mesenchymal progenitor cells, although there 60 may be major molecular differences with respect to their embryonic counterpart (7). On the 61 functional side, adult EpiSC secrete paracrine factors that stimulate cardiomyocyte growth and 62 angiogenesis (8) and play a key role in post-MI adaptive immune regulation (9). When stimulated 63 with thymosin β4, Wilms tumor protein 1-positive (WT1 + ) adult EpiSC can form cardiomyocytes, 64 however, the rate of conversion is only small (10). Thus, the epicardium is a signaling center 65 regulating cardiac wound healing and may have cardiogenic potential in the adult injured heart. 66 67 Despite its importance in cardiac repair, little is known on cell heterogeneity and molecular 68 identifiers within the epicardial layer of the adult heart. Yet, this knowledge is essential to map the 69 epicardial progeny and attribute meaningful functions. While the single-cell landscape of activated 70 cardiac fibroblasts (activated cardiac stromal cells, aCSC) in the post-MI heart has been explored 71 in detail (11,12), these studies did not assess epicardial heterogeneity because of lack of specific 72 identifiers. In a previous study we have reported a novel perfusion-based technique (13) which 73 permitted the simultaneous isolation of EpiSC and aCSC with high yield and only minimal cell 74 activation. The isolation of viable, purified preparations of aCSC and EpiSC from the infarcted 75 heart permitted the first direct comparison of the two cardiac stromal cell fractions. We have 76 combined single-cell RNA sequencing (scRNAseq) with lineage tracing of WT1-expressing cells 77 and localization of cell populations by RNA in situ hybridization, characterized cellular hierarchy 78 of adult EpiSC, defined similarities and differences to cardiac fibroblast, and explored in detail the 79 individual EpiSC populations in the activated epicardium. 80

ScRNAseq of post-MI stromal cells 82
EpiSC (epicardial stromal cells) and aCSC (activated cardiac stromal cells) were isolated from the 83 same mouse hearts (n=3) 5 days after MI (50 min ischemia followed by reperfusion) using a 84 technique which rather selectively removed EpiSC by applying gentle shear force to the cardiac 85 surface (13) ( Figure 1A). CSC (cardiac stromal cells) were isolated from uninjured hearts of sham-86 operated animals (n=3). Cell preparations were depleted of cardiomyocytes, endothelial cells and 87 immune cells (see methods) prior to scRNAseq using the 10x Genomics Chromium platform. 88 Transcriptional profiles of 13,796 EpiSC, 24,470 aCSC and 24,781 CSC were captured after 89 quality control filtering. Unbiased clustering using the Seurat R package with visualization in 90 UMAP dimension reduction plots was performed to identify cells with distinct lineage identities 91 and transcriptional profiles. Average and significantly enriched RNA expression of EpiSC, aCSC 92 and CSC are listed in Supplementary files 1 and 2, respectively. 93

Hierarchy of EpiSC populations 263
To better define the cellular relationship between the identified EpiSC populations, we combined 264 scRNAseq of EpiSC with lineage tracing using tamoxifen-inducible Wt1-targeted reporter 265 (Wt1 CreERT2 Rosa tdTomato ) mice ( Figure 4A). Transcriptional profiles of 13,373 cells from 2 mouse 266 hearts 5 days after MI were assigned to previously defined EpiSC populations ( Figure 4B). 267 Surprisingly, expression of tdTomato fully overlapped with that of Wt1 ( Figure 4C), suggesting 268 that WT1 + cells within the given time did not convert in cells of any Wt1-negative EpiSC 269 population. 270 RNA velocity analysis (19), which can predict and visualize future cell states based on the ratio of 271 unspliced and spliced mRNA read counts, supports the notion that EpiSC consists of different 272 independent groups of cell populations. As shown in Figure 4D, EpiSC-7 and 1 (group I) appeared 273 to be separated from EpiSC-2, 4, 8 (group II) and EpiSC-6, 5, 3, 9 (group III). 274 275 To study cell-cell communication mediated by ligand-receptor interactions between the EpiSC 276 populations, we used CellPhoneDB (20). As shown in Figure 4E, showed that each of the three EpiSC groups identified above was equipped with a cell cluster 286 expressing a number of cell cycle genes ( Figure 4G). Within group I it is subcluster EpiSC-1.4, 287 within group II a subpopulation of EpiSC-2 and within group III EpiSC-9. This location is very 288 similar to the origin of velocity arrows ( Figure 4D). 289

Discussion 290
In this study we provide a single-cell landscape of the post-MI epicardium with 11 transcriptionally 291 different cell populations. This amazing degree of cellular heterogeneity is similar to that of 292 myocardial cardiac fibroblasts (11,12). The widely used epicardial lineage marker gene Wt1 (4) 293 was selectively expressed in three populations (EpiSC-1, 7, 8), which were localized on the outer 294 surface of the activated epicardium ( Figure 2B). Other commonly used lineage markers and 295 recently identified epicardial population markers of the developing zebrafish heart (3) were rather 296 heterogeneously distributed ( 28) ( Figure 1F). Surprisingly, group I cells accounted for only 26% of the EpiSC fraction from the 326 injured heart. Furthermore, the expression of several paracrine proangiogenic factors that were 327 previously considered specifically derived from WT1 + epicardial cells (8) was not limited to group 328 I EpiSC, but was even higher in other EpiSC populations ( Figure 2E). Thus, about 2/3 of all 329 epicardial cells were Wt1-negative, but they are likely to be also involved in the secretion of 330 paracrine factors. Furthermore, aCSC expressed numerous paracrine factors ( Figure 3H). 331 Therefore, secretion of paracrine factors appears to be general feature of both epicardial and 332 myocardial stromal cells which all can contribute to cardioprotection. 333 Group II, consisting of four cell populations, represented 40% of the EpiSC fraction and showed 334 enriched expression of chemokines known to be involved in attraction of monocytes and 335 neutrophils ( Figure 2D). The expression of these chemokines was a general feature of EpiSC in 336 comparison to aCSC ( Figure 3H). This finding points to a role of epicardial cells in the modulation 337 of the innate immune response post MI, as was already suggested with regard to adaptive immune 338 regulation during the post-MI recovery phase (9). 339 340 Cells with the transcriptional profiles of group I and II populations were generally prevalent in 341 EpiSC in comparison to aCSC ( Figure 3B). Group I and II populations also showed the highest 342 number of potential ligand-receptor interactions ( Figure 4E). Interestingly, EpiSC-8, characterized 343 by expression of Wt1 and high transcript levels of HOX transcription factors as well as of genes 344 annotated with the GO term "Embryonic organ morphogenesis", also specifically expressed 345 thymosin β4 ( Figure 2C) which has tissue-regenerating properties (10). The pronounced 346 expression of cardiogenic factors and contractile proteins in group I ( Figure 2C), and the parallel 347 expression of WT1 and thymosin β4 suggests that this cellular network has cardiogenic potential 348 and was involved in the previously reported formation of cardiomyocytes from WT1 + cells after 349 thymosin β4 stimulation (10). Interestingly, group I and II also shared high expression of HIF-1-350 responsive glycolytic enzymes ( Figure 2F), which again was a general feature of EpiSC compared 351 to aCSC ( Figure 3G). This is in support of the epicardium being a hypoxic niche (46). Since WT1 352 expression is HIF-1-dependent (37), this is consistent with the view that epicardial HIF-1 signaling 353 is likely to be an important trigger in the ischemic heart to promote cardioprotection. 354 355 Group III comprised five cell populations accounting for 34% of all EpiSC. In contrast to group I, 356 group III EpiSC were found localized throughout the activated epicardium ( Figure 2B) and group 357 III population identifiers also labeled stromal cells within the myocardium ( Figure 2B). This 358 finding is consistent with the transcriptional profile of group III EpiSC that was quite similar to 359 that of major aCSC populations ( Figure 3A and B) and the CSC fraction (Figure 3-figure  360 supplement 5 A-B). This demonstrated that group III cells exhibit a fibroblast-like phenotype. In 361 addition, group III EpiSC showed the lowest numbers of both predicted cell-cell interactions 362 ( Figure 4E) and expressed a lower number of genes ( Figure 4F), suggesting a less active cell state. 363 Another remarkable feature of group III was the expression of a second set of cardiogenic factors 364 ( Figure 2C). However, these cardiogenic factors were prevalently expressed in aCSC ( Figure 3E) 365 which is consistent with the postulated cardiogenic potential of cardiac fibroblasts (47

Animal procedures 403
Myocardial infarction (MI) followed by reperfusion was performed as previously described (14). 404 In brief, mice were anaesthetized (isoflurane 1.5%) and the left anterior descending coronary artery 405 (LAD) was ligated for 50 min followed by reperfusion. LAD occlusion was controlled by ST-406 segment elevation in electrocardiography recordings. Sham control animals underwent the surgical 407 procedure without LAD ligation. 408 Wt1 CreERT2 -mediated lineage tracing was performed as described previously (10). In brief, 409 Wt1 CreERT2 Rosa tdTomato mice received tamoxifen injections (2 mg emulsified in sesame oil; i.p.) 410 5 and 3 days prior MI to induce CreERT2 activity. 411

Isolation of EpiSC, aCSC and CSC 412
EpiSC and aCSC at day 5 after MI and control CSC from uninjured hearts at day 5 after sham 413 surgery were isolated as previously described (13). In brief, mice were sacrificed by cervical 414 dislocation and hearts were excised for preparation of the aortic trunk in ice-cold PBS. Isolated 415 hearts were immediately cannulated and perfused with PBS (3 min), followed by perfusion with 416 collagenase solution (8 min; 1,200 U/ml collagenase CLS II (Biochrom, Berlin, Germany) in PBS 417 at 37°C. 418 EpiSC were simultaneously isolated from post-MI hearts by bathing the heart in its collagenase-419 containing coronary effluat while applying mild shear force to the cardiac surface. The effluat was 420 collected, centrifuged (300 g, 7 min) and cells were resuspended in PBS / 2% FCS / 1 mM EDTA. 421 Application of shear force and effluat collection was omitted for uninjured sham control hearts 422 due to the absence of an activated, expanded epicardial layer. 423 aCSC and CSC were isolated by mechanical dissociation of the digested myocardial tissue, 424 followed by resuspension in Dulbecco's modified eagle medium (DMEM) / 10% FCS. The cell 425 suspensions were meshed through a 100 μm cell strainer and centrifuged at 55 g to separate 426 cardiomyocytes from non-cardiomyocytes. The supernatants were again passed through a 40 µm 427 cell strainer, centrifuged (7 min, 300 g) and cell pellets were resuspended in PBS / 2% FCS / 1 mM 428 EDTA. Cells were immediately stained for surface markers and applied to fluorescence-activated 429 cell sorting. Cell 3' Reagent Kit v3 according to manufacturer's instructions. Sequencing was carried out on a 449 HiSeq 3000 system (Illumina Inc. San Diego, USA) according to manufacturer's instructions with 450 a mean sequencing depth of ~90,000 reads/cell for EpiSC and ~70,000 reads/cell for aCSC. 451 Differences in sequencing depth were necessary in order to achieve a similar sequencing saturation 452 between all samples of ~70%. 453

Processing of scRNAseq data 454
Raw sequencing data was processed using the 10x Genomics CellRanger software (v3.0.2) 455 provided by 10x Genomics. Raw BCL-files were demultiplexed and processed to Fastq-files using 456 the CellRanger mkfastq pipeline. Alignment of reads to the mm10 genome and UMI counting was 457 performed via the CellRanger count pipeline to generate a gene-barcode matrix. 458 The median of detected genes per cell was 3,155 for EpiSC, 3,265 for aCSC and 2,241 for CSC. 459 The median of UMI counts per cell was 10,689 for EpiSC, 11,110 for aCSC and 5,879 for CSC. 460 Mapping rates (reads mapped to the genome) were about 89% for EpiSC, 90.9% for aCSC and 461 86.7% for CSC. 462 For tdTomato lineage tracing experiments a custom reference, consisting of the mm10 genome 463 and the full-length sequence of tdTomato, was generated via CellRanger mkref. 464

Filtering and clustering of scRNAseq data 465
Further analyses were carried out with the Seurat v3.0 R package (15). Initial quality control 466 consisted of removal of cells with fewer than 200 detected genes as well as removal of genes 467 expressed in less than 3 cells. Furthermore, cells with a disproportionately high mapping rate to 468 the mitochondrial genome (mitochondrial read percentages >5.0 for EpiSC and aCSC, >7.5 for 469 CSC) have been removed, as they represent dead or damaged cells. Normalization has been carried 470 out utilizing SCTransform. Biological replicates have been integrated into one dataset by 471 identifying pairwise anchors between datasets and using the anchors to harmonize the datasets. 472 Dimensional reduction of the data set was achieved by Principal Component analysis (PCA) based 473 on identified variable genes and subsequent UMAP embedding. The number of meaningful 474 Principal Components (PC) was selected by ranking them according to the percentage of variance 475 explained by each PC, plotting them in an "Elbow Plot" and manually determining the number of 476 PCs that represent the majority of variance in the data set. Cells were clustered using the graph-477 based clustering approach implemented in Seurat v3.0. Doublet identification was achieved by 478 using the tool DoubletFinder (v2.0.2) (16) by the generation of artificial doublets, using the PC 479 distance to find each cell's proportion of artificial k nearest neighbors (pANN) and ranking them 480 according to the expected number of doublets. Heat maps were generated using Morpheus 481 (https://software.broadinstitute.org/morpheus). 482

Gene Ontology enrichment analysis 483
Enriched Gene Ontology (GO) terms in differentially expressed genes between populations were 484 identified by using the gene ontology enrichment analysis and visualization (GOrilla) tool (17). 485

Canonical correlation analysis (CCA) space alignment 486
A direct comparison of the EpiSC and aCSC as well as EpiSC and CSC data sets was performed 487 by Seurat's CCA alignment procedure (v 2.3.4). Briefly, the top 600 variable genes were identified 488 for each data set and subjected to a CCA. Herein, canonical correlation vectors were identified and 489 aligned across data sets with dynamic time warping. After alignment, a single integrated clustering 490 was performed, allowing for comparative analysis of cell populations across both cell fractions. 491

RNA in situ hybridization 503
In situ detection of selected marker gene expression was performed by RNA in situ hybridization 504 using the RNAScope 2.5 HD Detection Kit Assay -RED (Advanced Cell Diagnostics, Hayward, 505 CA) (21). Fresh frozen hearts (5 days after MI) were cut in 10 μm sections. Fixation and 506 pretreatment of the cryosections were performed according to the manufacturer's instructions. The 507 incubation time for the hybridization of RNAScope 2.5 AMP 5 -RED was increased to 120 min 508 to enhance the signal. For evaluation of the target probe signal the hybridized heart sections were 509 examined under a standard bright field microscope (BX61 Olympus, Hamburg, Germany) using a 510 20x objective. Images were processed for publication using ImageJ/Fiji (22). 511

Statistics 512
Markers defining each cluster as well as differential gene expression between different clusters 513 were calculated using a two-sided Wilcoxon Rank-Sum test which is implemented in Seurat. 514

Acknowledgements 515
We thank K. Raba (Institute of Transplantation