Cardiac directed differentiation using small molecule Wnt modulation at single-cell resolution

Differentiation into diverse cell lineages requires orchestration of gene regulatory networks guiding cell fate choices. Here, we present the dissection of cellular composition and gene networks from transcriptomic data of 43,168 cells across five discrete time points during cardiac-directed differentiation. We utilize unsupervised clustering and implement a lineage trajectory prediction algorithm that integrates transcription factor networks to predict cell fate progression of 15 subpopulations that correlate with germ layer and cardiovascular differentiation in vivo. These data reveal transcriptional networks underlying lineage derivation of mesoderm, definitive endoderm, vascular endothelium, cardiac precursors, and definitive cell types that comprise cardiomyocytes and a previously unrecognized cardiac outflow tract population. Single cell analysis of genetic regulators governing cardiac fate diversification identified the non-DNA binding homeodomain protein, HOPX, as functionally necessary for endothelial specification. Our findings further implicate dysregulation of HOPX during in vitro cardiac-directed differentiation underlying the molecular and physiological immaturity of stem cell-derived cardiomyocytes.


Introduction 80
Studies of cardiac development at single-cell resolution have provided valuable new insights into cell diversity and genetic regulation of cell types revealing mechanisms underlying cardiovascular differentiation and morphogenesis. Single-cell analysis of in vivo mouse heart development have revealed chamber-specific and temporal changes in gene expression underlying embryonic heart development from E9.5 to postnatal day 21 establishing anatomical patterns of gene expression in the heart (Li et al., 2016) 85 and new insights into transcriptional programs underlying cardiac maturation (DeLaughter et al., 2016). These studies provide a valuable resource by which to understand transcriptional mechanisms underlying diverse fate choices involved in cardiac development and morphogenesis in vivo. Like many single-cell transcriptomic studies, they further highlight the importance of dissecting cell heterogeneity to understand mechanisms underlying the identity and fate of cells in health and disease. 90 Human pluripotent stem cells are a key model system to study human cardiovascular developmental biology (Murry and Keller, 2008). However, the fidelity by which cardiac directed differentiation in vitro recapitulates the transcriptional programs governing the diversity of cell fates generated in vivo is not well understood. Analyzing differentiation efficiency has relied extensively on expression signatures from bulk samples consisting of hundreds of thousands of cells which lack the 95 resolution to dissect gene expression and cell subpopulation heterogeneity. Furthermore, identification of rare populations remains challenging. Cardiac progenitor populations, for example, are difficult to identify but important as they constitute cell states underlying decision points in fate diversification (Qyang et al., 2007). Lastly, modelling development and disease requires an accurate, quantifiable analysis of complex decisions underlying the orchestration of heterogeneous cell types responsible for 100 cell phenotypes, from molecular characterization to physiological function. In this study, we report RNA-sequencing data captured from more than forty thousand single cells navigating stage-specific transitions through in vitro cardiac directed differentiation from pluripotency using an established small molecule Wnt modulation protocol Lian et al., 2012). In coordination with two companion papers (Ding et al., In review; Nguyen et al., in review-a), 105 we present in this study three primary outcomes illustrating the power of this data set for expanding our understanding of stem cell directed differentiation as a platform to study cardiovascular development. First, unsupervised clustering analysis of single cell data identify fifteen transcriptionally distinct cell subpopulations transiting cardiac directed differentiation. In vitro gene expression networks underlying these cell subpopulations correlate with mesendoderm fate choices made during gastrulation phases of 110 germ layer specification and progressive developments in cardiovascular differentiation and morphogenesis in vivo (Li et al., 2016;Peng et al., 2016). Second, we implement a new lineage trajectory prediction algorithm, scdiff, specifically designed for time-course single-cell data (Ding et al., In review). This algorithm utilizes transcription factors and their regulatory networks to provide mechanistic insights into the analysis of coordinated differentiation of diverse cell subpopulations through differentiation from 115 pluripotency into the cardiac lineage. As a result of lineage tracing transcriptional regulatory networks using scdiff, we identify cardiomyocytes and outflow tract cells as the dominant terminal cell types derived from in vitro differentiation. Lastly, we leverage single-cell transcriptomic data to elucidate new insights into gene regulatory mechanisms controlling cardiovascular fate from pluripotency. We identify the non-DNA binding homeodomain protein HOPX as an early marker of mesoderm and a context-120 specific regulator of gene networks underlying endothelial and cardiac fate. Using conditional genetic loss of function assays, we determine that HOPX is required for vascular fate specification and is an essential regulator underlying differentiation of hPSC-derived cardiomyocytes at molecular and physiological levels. 125 Results

Single-cell RNA-sequencing analysis of cardiac directed differentiation
To gain insights into the genetic regulation of cardiovascular development, we performed single-cell 130 transcriptional profiling of human iPSCs navigating from pluripotency through stage-specific transitions in cardiac differentiation ( Figure 1A). Small molecule Wnt modulation was used as an efficient method to differentiate pluripotent cells toward the cardiac lineage Lian et al., 2012). WTC-CRISPRi hiPSCs  were chosen as the parental cell line for this study. These cells are genetically engineered with an inducible nuclease-dead Cas9 fused to a KRAB repression 135 domain ( Figure S1A). Transcriptional inhibition by gRNAs targeted to the transcriptional start site is doxycycline-dependent and can be designed to silence genes in an allele-specific manner. The versatility of this line provides a means to use this scRNA-seq data as a reference point for future studies aiming to assess the transcriptional basis of cardiac differentiation at the single-cell level. Cells were verified to have a normal 46 X,Y male karyotype by Giemsa banding analysis before analysis by scRNA-Seq 140 ( Figure S1B). Cells were captured at time points corresponding to stage-specific transitions in cell state including pluripotency (day 0), germ layer specification (day 2), and progressing through progenitor (day 5), committed (day 15), and definitive (day 30) cardiac cell states. We harvested a total of 44,020 cells of which 43,168 cells were retained after quality control analysis. In total, we captured expression of 17,718 genes (detected in at least 44 cells and with expression values within the overall expression range of 3 145 median absolute deviation, as described in our companion paper (Nguyen et al., in review-b)). We used dimensionality reduction approaches to visualize all 43,168 cells in low-dimensional space, in which cell's coordinates were estimated so that they preserve the expression similarity (local and global distance in the original multidimensional space) in t-SNE plots (left), and the differentiation pseudotime (transition probability between cells) in diffusion plots (right). These data show distinct transcriptomic clustering and 150 distribution of cells undergoing differentiation ( Figure 1B). We generated a time-course gene expression profile using a wide range of known cardiac developmental genes by measuring expression among all cells to reveal the temporally-restricted expression dynamics of stage-specific genes reflecting cardiac fate choices ( Figure 1C). To confirm that the differentiation follows known developmental trajectories, we identified cells that express known 155 cardiac differentiation genes and observed how they cluster in low-dimensional space ( Figure 1D) (Coifman et al., 2005;Moignard et al., 2015). Day 0 (pluripotency) cells express POU5F1 (OCT4), a key pluripotency gene that is down-regulated as cells transition to germ layer differentiation at day 2. Day 2 cells are characterised by expression of EOMES, a marker of pan-mesendoderm lineages. TMEM88, previously identified as a regulator of cardiac progenitor fate (Palpant et al., 2013), displays expression 160 primarily in day 5 cells. TNNI1, the foetal isoform of troponin I, marks all cardiac derivatives beginning on day 5 and continuing through to day 30, whereas TTN, a sarcomeric protein found in definitive contractile cardiomyocytes, is expressed predominantly in day 30 cells. Analysis of this time-course data using an unsupervised clustering algorithm (Clustering at Optimal REsolution (CORE) (Nguyen et al., in review-b)) identified fifteen transcriptionally unique subpopulations across all five time points ( Figure  165 1E). Using the subpopulation identification, we show that stage-specific genes are consistently expressed with known developmental state transitions, subpopulations within each time point reveal a diversity of cell states coordinating fate choices during differentiation. Overall, these data show that small moleculemediated cardiac directed differentiation generates developmentally distinct populations of cells displaying expected temporal-specific transcriptional profiles. 170 Phenotypic diversity and lineage heterogeneity during differentiation.
With recent developments in high-resolution transcriptomic mapping of mouse in vivo development of the cardiovascular system from the earliest stages of gastrulation (DeLaughter et al., 2016;Li et al., 2016;175 Peng et al., 2016), we set out to map single-cell heterogeneity of human in vitro derived subpopulations to cell types against stages of lineage specification in vivo. To assist in elucidating the molecular identity of each subpopulation, we analyzed high-resolution spatio-temporal gene expression during mouse in vivo gastrulation to identify genes that mark known developmental populations and cell types ( Figure S2). Using previously published approaches, laser microdissection was used to capture germ layer cells of 180 mid-gastrula stage (E7.0) embryos (Peng et al., 2016), with an expanded analysis to include early-(E6.5) and late-gastrulation (E7.5) mouse embryos (unpublished data). High-throughput RNA-sequencing data were compiled into corn plots, with each plot depicting discrete spatial-temporal patterns of gene expression corresponding to individually sequenced sections. Expression data for selected genes involved in heart development in the epiblast/ectoderm and mesoderm/endoderm for E6.5, E7.0, and E7.5 mouse 185 embryos are shown in Figure S3. To determine phenotypic identities based on gene expression networks governing each human in vitro-derived subpopulation during differentiation, we visualized the spatiotemporal patterns of gene expression in the gastrulating mouse embryo including: EOMES (panmesendoderm), MESP1 and MIXL1 (mesoderm), SOX17 and FOXA2 (endoderm), and NKX2-5 (cardiac lineage transcription factor) (Figure 2A, Figure S3A-D). These in vivo expression dynamics of mouse 190 gastrulation established spatiotemporal reference points for identifying in vitro subpopulations. Based on these observations, we dissected the transcriptional phenotype of subpopulations identified during human cardiac directed differentiation. From pluripotency ( Figure S4), cells navigate through germ layer specification (day 2), comprising three transcriptionally distinct subpopulations that express the pan-mesendoderm gene, EOMES ( Figure 2B-C, Figure S3A). Specific day 2 subpopulations 195 express genes involved in mesoderm (D2:S2), mesendoderm (D2:S3), and definitive endoderm (D2:S1) differentiation ( Figure 2B-C and Figure S3D and Figure S4C). Gene ontology (GO) analysis of differentially expressed genes between subpopulations indicated that only D2:S2 showed significant enrichment for cardiogenic gene networks ( Figure 2D, Table S1). Surprisingly, these data show that only 34% of day 2 cells comprise cardiogenic mesoderm marked by MESP1 with the majority of cells  Figure S4E-F). In contrast, S1 was primarily characterized by GO enrichment for genes associated with extracellular matrix deposition, motility, and cell adhesion ( Figure 2J and M) which was supported by identification of a significant number of fibroblast-like cells marked by THY1 (CD90) in S1 (Figure 2I and L). The co-existence of a non-contractile cell population, which is characterized as non-210 myocytes, is common in directed cardiac differentiation (Dubois et al., 2011). Taken together, these data show iPSC differentiation into committed (day 15) and definitive (day 30) cardiomyocytes (S2) and noncontractile THY1 + cells (S1).
Overall, these data provide single-cell resolution transcriptomic data of small molecule in vitro directed cardiac differentiation. Unsupervised clustering using CORE (Nguyen et al., enabled the identification of subpopulations with biologically distinct phenotypes linked to in vivo cardiovascular development fate choices ( Figure 2N). To assess the level of maturity derived from this protocol relative to in vivo human development, we compared day 30 clusters against ENCODE RNA-seq data from foetal and adult hearts ( Figure 2O). Using genes that reflect either early foetal (TNNI1, MYH6) vs late stages of heart development (MYH7, TNNI3, MYL2), the most differentiated in vitro 220 derived cardiac population (D30:S2) remains more developmentally immature than even first trimester human hearts.

Lineage predictions based on regulatory gene networks governing differentiation 225
While these bulk population analyses provided clarity into the diversity of cell types represented in cardiac differentiation, we sought to understand the lineage trajectories and gene regulatory networks governing diversification of cell fates ( Figure S5). We analyzed the full differentiation data set using two previously published computational packages, Monocle2 and dPath ( Figure S6) (Gong et al., 2017;Qiu et al., 2017;Trapnell et al., 2014). dPath predicts trajectories on the basis of metagene entropy to rank 230 cells on their differentiation potential and implements a self-organizing map (SOM) and random walk with restart (RWR) algorithm to construct lineage hierarchies in an unbiased manner based on measures of transcriptional entropy. Monocle builds lineage trajectories by aligning cells in pseudotime and establishes branch points based on a manifold tree structure learned from the multidimensional expression data using the discriminative dimensionality reduction via learning tree (DDRTree) algorithm. Analysis 235 of our time-course cardiac differentiation data using Monocle2 or dPath for all 43,168 cells was carried out using default parameters. For gene inputs, we used the 6000 top most variable genes for dPath, and known markers or differentially expressed genes for Monocle2. These analyses did not yield lineage predictions that accurately fit known in vivo developmental trajectories possibly due to limitations in linking such transcriptionally disparate populations represented in our time series data set through gene 240 expression entropy or pseudotime analysis ( Figure S6).
To address these problems, we implemented a probabilistic method for constructing regulatory networks from single-cell time series expression data (scdiff: Cell Differentiation Analysis Using Timeseries Single-cell RNA-seq Data) (Ding et al., In review) ( Figure S7). The algorithm utilizes TF-gene databases to model gene regulation relationships based on the directional changes in expression of TFs 245 and target genes at parental and descendant states. These prerequisites impact both cell assignment and model learning since each state is represented by a probabilistic model that takes into account not just the expression but also the regulatory information. As such, scdiff does not exclusively rely on expression similarity to connect states allowing it to overcome problems related to sampling since it can still identify descendent states even if they are less similar in terms of their actual expression profiles. 250 We used scdiff to predict lineages underlying the diversity of fates during small moleculemediated cardiac directed differentiation (Table S2 and Figure 3A). While the algorithm permits movement between days, we did not observe any cells shift from the original day of isolation (Table S2), thus highlighting the challenge of connecting lineage trajectories on the basis of transcriptional similarity.
We leveraged the regulatory network predictions to identify key transcription factors and target genes underlying progressive fate changes across all 10 nodes ( Figure 3D and Table S2). These data 275 reinforce established mechanisms of cardiac lineage specification. In particular, we found evidence for down-regulation of Wnt/β-catenin signaling (LEF1) between N4-N6 which is required to transition from mesoderm into the cardiac progenitor cell (Paige et al., 2010;Ueno et al., 2007). From the progenitor node N6 into contractile cardiomyocytes N7:N9, the data show proper down-regulation of progenitor transcription factors such as YY1 and up-regulation of TFs known to control cardiomyocyte 280 differentiation such as NKX2-5. Downstream target genes involved in governing the transition from progenitor cells to a differentiated cardiac state were expressed concomitantly ( Figure 3D).
To gain new insights into lineage trajectories derived during differentiation, we sought to understand the gene network underlying specification of THY1 + non-contractile cardiac derivatives N8:N10, a population currently not well defined although widely used for tissue engineering applications 285 (Thavandiran et al., 2013). The predicted network underlying this transition showed significant downregulation of cardiac TFs NKX2-5 and MAZ while other TFs involved in lipid metabolism/sterol regulation (SREBF2) and protein sumoylation (TOPORS) were up-regulated ( Figure 3D). Of particular note, we observed up-regulation of Pre-B cell leukemia transcription homeobox (PBX1: P = 1.1e -16 , mean DE target fold change = 2.72), a transcriptional regulator that activates a network of genes 290 associated with cardiac outflow tract (OFT) morphogenesis (Arrington et al., 2012).

HOPX: a context-specific regulator of cardiovascular lineage diversification in vivo and in vitro 320
To gain new insights into genetic regulation of cardiac subpopulations not possible without single-cell level resolution, we analyzed a panel of 52 transcription factors and epigenetic regulators known to govern diversification of mesoderm and endoderm lineages represented in this data set ( Table S3). Expression of these regulatory genes was measured across eleven subpopulations identified between days 325 2-30 of differentiation (Table S3). HOPX, a non-DNA binding homeodomain protein identified in this analysis, has previously been identified as one of the earliest, specific markers of cardiomyocyte development (Jain et al., 2015), and governs cardiac fate by regulating cardiac gene networks through interactions with transcription factors, epigenetic regulators, and signaling molecules (Chen et al., 2002;Jain et al., 2015). We have also recently shown that HOPX functionally regulates blood formation from 330 hemogenic endothelium (Palpant et al., 2017b). Analysis of single-cell data revealed three unexpected observations about HOPX in the context of cardiac directed differentiation. First, HOPX is expressed as early as day 2 of differentiation during germ layer specification, much earlier than previously detected ( Figure 4A). Second, HOPX is expressed during cardiomyocyte specification at the progenitor stage of mouse development in vivo whereas we 335 detect HOPX only in endothelium (D5:C3) and not in cardiac precursor cells (D5:C1) at an equivalent time point (day 5) of in vitro differentiation ( Figure 4A). Third, in contrast to previous studies in vivo where HOPX lineage traces almost all cardiomyocytes of the heart (Jain et al., 2015), HOPX is detected in only 16% of D30:S2 cardiomyocytes (Figure 4B-C). To rule out stochastic expression in cardiomyocytes due to low sequencing read depth resulting in dropout, we analyzed expression of a panel 340 of genes known to regulate cardiac lineage specification and differentiation (Figure 4B-C). While HOPX is rarely detected, its expression level is equivalent to that of other cardiac TFs detected in a majority of D30:S2 cardiomyocytes (HAND1: 67%, HAND2: 64%, GATA4: 67%, NKX2-5: 86% vs HOPX: 16%) ( Figure 4B-C).
To assess the fidelity of HOPX expression during germ layer specification, we analyzed spatio-345 temporal gene expression throughout mesoderm and endoderm development in the gastrulating mouse in vivo and during hPSC directed differentiation in vitro. Previous work using MESP1 as a marker of mesoderm specification from pluripotency in vitro (MESP1-mCherry reporter hPSCs (Den Hartogh et al., 2015)) shows enrichment of HOPX expression in mCherry positive cells (Figure S8B). Similarly, assessment of gene expression during mouse gastrulation in vivo shows HOPX expression as early as 350 E6.5 in the proximal portion of the nascent primitive streak (P) (Figure 4D and Figure S8C-D) similar to the expression pattern of MESP1 (Figure S8E-F). From E7.0 to E7.5, HOPX is increasingly expressed throughout the developing endoderm. By E7.5, HOPX displays residual expression in the remaining distal primitive streak, endoderm (EA to EP), and the anterior mesoderm (MA) in coordination with other cardiogenic genes including NKX2-5 and MESP1 (Figure 4D and Figure 2A). 355 We analyzed expression networks correlated with HOPX + cells at day 2 of differentiation. These data show that HOPX is most highly expressed and represented in mes-endoderm subpopulations 2 and 3 ( Figure 4A). At day 2 HOPX + cells have significantly higher expression of mesoderm genes HAND1, T, MESP1, and MIXL1, no change in anterior primitive streak endodermal genes GSC and NODAL and significantly lower levels of definitive endoderm gene expression (FOXA2, ISL1, and SOX17) compared 360 to HOPXcells ( Figure S8G). These data suggest that HOPX + cells are associated with cardiogenic mesoderm, consistent with in vivo spatio-temporal expression data in the mouse from E6.5-E7.5 ( Figure  4D). From E8.5 onward, cardiac development and morphogenesis occurs as chambers, valves, and outflow tract form. We analyzed HOPX expression across diverse cell types contributing to heart 365 development in vivo using single-cell transcriptomic analysis of the E9.5 mouse heart (Li et al., 2016).
These data indicate that HOPX expression is distributed throughout all chambers and cell types of the heart ( Figure 4E). While HOPX expression largely coincides with expression of cardiac genes MYH7 and ACTN2, HOPX is also expressed in endothelial cells (CDH5 + and/or PECAM1 + ), smooth muscle cells (MYH11 + and/or TAGLIN2 + ), and epicardial cells (WT1 + ) ( Figure 4D and Table S4). Consistent 370 with mouse heart development, analysis of human foetal development at each trimester indicate a robust activation of HOPX during heart development in vivo ( Figure S8A). These findings confirm that HOPX is expressed at the early stages of mes-endodermal fate during germ layer specification in vitro and in vivo, which is much earlier than previously reported, and is expressed in multiple cell lineages that contribute to heart development. 375

Lineage trajectory of HOPX-expressing cells
We analyzed the lineage trajectory of HOPX + cells at single-cell resolution during cardiac directed differentiation to determine the core gene networks and transcription factors governing successive fate 380 choices of HOPX expressing cells during cardiac differentiation (Figure 4F-H, Figure S9, and Table  S5). At day 2 HOPX is expressed predominantly in mes-endoderm (D2:S2 9% and D2:S3 6%) and rarely in definitive endoderm (D2:S1 2%) with the HOPX lineage at this early stage of specification comprising 2 lineages (N2 and N3) enriched for expression of cardiogenic mesoderm genes such as MESP1 ( Figure  4F-H and Figure S9). From day 2 into day 5, HOPX + cells remain sparse in progenitor cell populations 385 where day 2 N3 splits into two day 5 lineages N4 and N5 (Figure 4F-H Table S5). Progressing to day 15 of differentiation HOPX 390 remains rare (2-4% of cells) and splits into two separate lineages derived from day 5 cardiac precursor cells (N4). Governed in part by increased NKX2-5 and downregulation of the cardiac progenitor TF YY1, HOPX cardiac precursor cells differentiate into MYL2/IRX4 + cardiomyocytes (N6-N7) while a separate branch governed by TFs such as PBX1 differentiate into outflow tract derivatives (N8-N9) (Figure 4F-H and Figure S9). Overall, these data show that HOPX is expressed early in germ layer specification and 395 shows spatio-temporal expression broadly across mes-endodermal lineages in vivo and in vitro. Analysis of HOPX gene expression across populations shows that HOPX is most highly expressed in endothelial cells at day 5 and in committed cardiomyocytes at day 30 of directed differentiation (Figure 4A).

Chromatin and expression analysis of HOPX in cardiac vs endothelial lineage specification 400
We analyzed epigenetic and transcriptional regulation at the HOPX locus in cardiac vs endothelium (Palpant et al., 2017a) (Figure 5A-B and Figure S10A). Cells were assayed by chromatin immunoprecipitation for repressive chromatin (H3K27me3), actively transcribed chromatin (H3K4me3), and gene expression by RNA-seq (Palpant et al., 2017b). In the context of cardiac directed differentiation, 405 these data show that the HOPX locus is epigenetically repressed on the basis of abundant H3K27me3 compared to H3K4me3 in day 5 cardiac precursor cells (Figure 5A-B). This is consistent with RNA-seq, qRT-PCR data, and expression analysis from HOPX-tdTomato reporter cells showing that HOPX is expressed late during cardiac differentiation, well after sarcomere formation during cardiac directed differentiation ( Figure 5C and Figure S10B-D). In the context of endothelium, consistent with our 410 previous analysis (Palpant et al., 2017b), the HOPX locus show significant gene expression correlated with abundant H3K4me3 and reduced H3K27me3 in KDR + /CD34 + ECs at day 5 of differentiation ( Figure 5A). These data show a direct link between chromatin regulation of the HOPX locus and expression of HOPX in cardiac vs endothelial lineage specification. 415

Functional analysis of HOPX in cardiac and endothelial fate
To dissect the role of HOPX early during differentiation of endothelial fate and late stages of cardiac differentiation, we analyzed single-cell expression data of HOPX + cells in combination with HOPX conditional loss of function during differentiation. At the progenitor stage (day 5), HOPX is identified 420 predominantly in D5:S3 with 84% of HOPX cells co-expressing the endothelial TF TAL1 and HOPX + cells express significantly higher levels of endothelial genes (TAL1, CD34) and significantly lower levels of cardiac progenitor genes (TNNI1, NKX2-5) compared to HOPXcells ( Figure 5D). Focusing exclusively on D5:S1 cells which have the most cardiogenic progenitor cell-like transcriptomes (<1% TAL1 + , 99% TMEM88 + , 96% TNNI1 + ), we found that HOPX + cells still represent a population with 425 robust activation of endothelial genes ( Figure S8H). Analysis of CRISPRi HOPX LOF cells revealed that loss of HOPX blocks expression of endothelial genes including TAL1, CD34, CDH5 but not GATA1 or ETV2. In contrast, HOPX LOF in progenitor cells significantly up-regulates genes involved in cardiac fate ( Figure 5E). This is consistent with studies showing that regulation of endothelial vs cardiac fate is governed by mutually antagonistic networks (Palpant et al., 2015;Van Handel et al., 2012), and suggests 430 that loss of HOPX blocks endothelial specification and enables cells to acquire a cardiac fate instead. At later stages of differentiation, HOPX + cells are identified predominantly with cardiomyocytes (D30:S2) ( Figure 4A). Analysis of gene networks associated with HOPX expression showed significantly increased expression of nearly all tested genes associated with cardiac development with the exception of MYH6, the foetal myosin heavy chain isoform, which was significantly upregulated in HOPXcells 435 (Figure 5F). Analysis of LOF cells showed that loss of HOPX specifically blocks expression of late cardiac differentiation gene networks including MYH7, MYL2, and TNNI3 with no effect on early cardiac specification genes (MYL7, TNNI1) ( Figure 5G). Analysis of single-cell data showed that among cells expressing HOPX at day 30, HOPX expression was significantly higher in subpopulation 2 compared to cells expressing HOPX in subpopulation 1 supporting a role in cardiomyocyte specification 440 as opposed to outflow tract differentiation which is consistent with fate mapping studies (Jain et al., 2015) ( Figure 5H). Collectively these data show a dependency of cardiac differentiation gene networks on expression of HOPX in committed cardiomyocytes. Based on these findings, we tested whether HOPX is required for cardiac or endothelial fate specification. Analysis of KDR + /CD34 + endothelial cells at day 5 of differentiation showed a significant 445 loss of KDR + /CD34 + endothelium in HOPX LOF cells vs controls indicating a direct role in governing specification of the endothelial fate ( Figure 5I). In contrast, there was no significant effect on generating cTnT + cardiomyocytes ( Figure 5J) and no difference in the onset of beating between WT and HOPX LOF cardiomyocytes ( Figure 5K).

Changes in myofilament composition underlie significant changes in functional parameters 450
including calcium sensitivity, force production, and elasticity that govern cardiac contractility during development (Yang et al., 2014). Engineered heart tissues are a well-established model for testing physiological analysis of cardiac differentiation and maturation (Mills et al., 2017). Comparative analysis of hPSC-derived cardiomyocytes in monolayer, engineered heart tissues (EHTs) and adult heart tissue shows that HOPX expression significantly increases in EHTs comparable to monolayer cardiomyocytes 455 cultured for one year ( Figure S8I) (Mills et al., 2017). Since HOPX directly regulates myofilament genes involved in late stages of differentiation, we tested whether HOPX LOF had an impact on cardiac contractility of heart tissues. EHTs were made comprising HOPX LOF or control cardiomyocytes with 10% HS27a stromal cells and placed on PDMS exercise polls for 10 days (Voges et al., 2017). Analysis of beating rate, a significant determinant of contractility, was not different between groups. However, loss 460 of HOPX significantly reduced the contractile force of EHTs by 65% compared to controls (P = 0.0009) ( Figure 5L). Collectively, these data provide evidence for a role of HOPX in cardiovascular development in which HOPX functions early during progenitor specification in governing differentiation of endothelium. Subsequently HOPX is shown to be essential for cardiomyocyte differentiation and the expression of gene networks required for definitive cardiomyocyte development and physiological 465 function ( Figure 5M).

Discussion
This study provides single-cell resolution RNA-sequencing of human cardiac directed in vitro computational analysis with CRISPRi hiPSCs for conditional loss of function, we identify HOPX as an early marker of cardiac fate specification during gastrulation and functionally required for endothelial and cardiovascular differentiation from pluripotency. Collectively, we use a widely implemented and efficient protocol using Wnt modulation for cardiac directed differentiation Lian et al., 2012) with CRISPRi hiPSCs  providing a platform to dissect cardiac 485 differentiation at single-cell resolution.
While the progression of heart development and morphogenesis from time points spanning E8.5 to P21 have been analyzed at single-cell resolution (DeLaughter et al., 2016;Li et al., 2016), a comprehensive transcriptomic profiling of the lineages derived by human pluripotent stem cell directed differentiation from pluripotency has not been available. Cardiac directed differentiation protocols using 490 small molecules to modulate Wnt signaling have emerged in recent years as a simple, cost-effective, and reliable method to generate high-purity cardiac derivatives from hPSCs. Stem cell-derived cardiomyocytes generated using this approach have been utilized in translational applications to model patient-specific diseases (Ang et al., 2016;Bayzigitov et al., 2016;Ebert et al., 2014;Smith et al., 2017;Wu et al., 2015), test cardiotoxicity (Maillet et al., 2016), screen novel therapeutic drugs (Casini et al., 495 2017;Sharma et al., 2014), generate engineered heart tissue constructs to model the 3D environment of the heart Tzatzalos et al., 2016), and develop cell-based regenerative therapies to repair heart tissue post-infarct (Hartman et al., 2016). The current study provides whole genome-wide analysis of stage-specific changes in gene expression during cardiac differentiation as a resource with which to dissect cell subpopulations at the molecular level. The statistical power of 43,168 transcriptomes 500 establishes a new experimental platform for fundamental and translational applications in cardiovascular biology. Analysis of subpopulations during early stages of differentiation indicate a surprising contribution of mesendoderm and definitive endoderm coordinately specified with cardiac fates through the progenitor stage of differentiation. In particular, a minority of cells (34%) comprise MESP1 + cardiogenic mesoderm 505 at day 2 that ultimately give rise to all cardiac derivatives at day 30. The interaction between endoderm and mesoderm in governing lineage specification in vivo is well known, and these data suggest that a critical functional role of induction cues provided by directed differentiation protocols is to establish the necessary population stoichiometry of transiently sustained endoderm required to support mesoderm in the derivation of high purity cardiac fates in vitro. 510 We evaluated lineage trajectories from single-cell data by implementing a lineage prediction algorithm, scdiff, specifically designed for learning regulatory networks controlling differentiation from single-cell time series data. This method relies on iterative analysis of known transcription factor-gene interactions to establish a gene regulatory framework that statistically links disparate populations captured across intermittent time intervals through differentiation. Analysis of cardiac directed differentiation using 515 scdiff revealed stage-specific genetic regulators underlying diversification of cardiac fate from pluripotency. In particular, these data revealed new insights into the bifurcation of cardiac precursor cells at day 5 of differentiation into NKX2-5 + /MYL2 + ventricular cardiomyocytes and NKX2-5 -/PITX2 + cardiac outflow tract (OFT) cells. While previous studies have routinely described a non-contractile THY1 + (CD90 + ) fibroblast-like cell used commonly for tissue engineering applications (Dubois et al., 520 2011;Thavandiran et al., 2013), this population remains poorly studied with no strong evidence for an in vivo correlate. Using transcriptional fate mapping and gene network analysis, we provide single-cell level transcriptome-wide evidence directly linked to in vivo cell cardiac types that non-contractile THY1 + cells are a cardiac OFT derivative. Future work will require a more comprehensive analysis of this population in diverse differentiation protocols to determine the reproducibility of this finding across various 525 induction mechanisms, with a key emphasis on single-cell analysis endpoints as a mechanism for phenotyping hPSC-derived cardiac subpopulations. Of importance, congenital heart disease (CHD) is among the most common forms of congenital defects (van der Linde et al., 2011), and OFT anomalies account for roughly 30% of CHD incidences (Thom et al., 2006). While the genetic basis of OFT malformations has been well studied (Arrington et al., 2012), the capacity to study OFT development and 530 disease using hPSC models has not been possible. Our finding of an OFT cell subpopulation derived from in vitro cardiac directed differentiation presents new opportunities to develop translational platforms utilizing this cell type for disease modelling or therapies. It is well-established that in vitro cardiac differentiation does not generate cardiomyocytes with the transcriptional profile, cellular diversity, morphometry, or function maturity of adult in vivo-derived 535 cardiomyocytes (Yang et al., 2014). This is at least in part a consequence of dysregulation of the stagespecific gene networks that are not properly modelled in vitro. In this study, we analyzed a panel of 52 transcription factors and epigenetic regulators across eleven cardiac differentiation-derived cell subpopulations at single-cell resolution. These data provide evidence that HOPX is one of the earliest transcriptional markers of cardiogenic mesoderm during germ layer specification in vitro and in vivo. 540 Functional studies show that HOPX functionally regulates gene networks required for vascular and cardiac differentiation and function. While HOPX is a key developmental regulator of cardiac myoblasts early in heart development in vivo (Jain et al., 2015) with data from this and other studies (Chen et al., 2002;Trivedi et al., 2010) showing a key role in heart maturation, we observe it rarely in in vitro derived cardiomyocytes. Engineered heart tissues (EHTs) force significant up-regulation of genes involved in 545 maturation including HOPX (Mills et al., 2017). We used this platform to show that loss of HOPX in EHTs reduces contractile function of hPSC-derived cardiomyocytes implicating it as a key genetic regulator of differentiation. Given the significance of promoting adult-like phenotypes from in vitro differentiated cell types for translational applications in disease modelling and therapy, these data provide evidence that HOPX expression is a key transcriptional regulator near absent during cardiac directed 550 differentiation in vitro underlying at least in part the transcriptional and functional immaturity of hPSCderived cardiomyocytes. Taken together, our study provides transcriptional profiling of human in vitro cardiac differentiation from more than forty thousand single-cells revealing cell diversity and genetic networks governing lineage progression of hPSC cardiac directed differentiation by small molecule Wnt 555 modulation. These data provide new insights into the complexity of cell populations represented in stagespecific transitions from pluripotency and, coupled with the use of CRISPRi loss of function, establish a unique reference point for dissecting gene networks involved in human cardiac development and disease.  of EOMES, MESP1, SOX17 and NKX2-5 expression in the mesoderm and endoderm of E6.5, E7.0, and E7.5 mouse embryos during gastrulation (unpublished RNA-seq data for E6.5 (n = 6) and E7.5 (n = 6) embryos and published data for E7.0 embryo (Peng et al., 2016)). Positions of the cell populations ("kernels" in the 2D plot of RNA-Seq data) in the germ layers: the proximal-distal location in descending numerical order (1 = most distal site) and in the transverse plane of the mesoderm and endoderm -640 Anterior half (EA) and Posterior half (EP) of the endoderm, Anterior half (MA) and Posterior half (MP) of the mesoderm, and Posterior epiblast (P) containing the primitive streak. Color scales represent levels of expression as log10 of fragments per kilobase million (FPKM + 1) (see Figure S3 for schematic of iTranscriptome). cardiomyocytes (all cells vs S1 vs S2) relative to expression levels in human foetal and adult heart samples (ENCODE). Gene expression is measured as counts per million mapped reads and each gene is internally normalized to maximum expression.   .0, and E7.5 mouse embryos during gastrulation (unpublished RNA-seq data for E6.5 (n = 6) and E7.5 (n = 6) embryos and published 720 data for E7.0 embryo, (Peng et al., 2016)). Positions of the cell populations ("kernels" in the 2D plot of RNA-Seq data) and in the germ layers: the proximal-distal location in descending numerical order (1 = most distal site) in the transverse plane of the mesoderm and endoderm -Anterior half (EA) and Posterior half (EP) of the endoderm, Anterior half (MA) and Posterior half (MP) of the mesoderm, and Posterior epiblast (P) containing the primitive streak. Color scales represent levels of expression as log10 of 725 fragments per kilobase million (FPKM+1). (E) Single-cell expression analysis of E9.5 mouse heart (Li et al., 2016) showing HOPX expression relative to markers of cardiomyocytes (MYH7, ACTN2) and endothelial cells (CDH5, PECAM1) (scale bars are Log2(RPM)). Table ( Circles indicate distinct nodes governed by a common GRN. Since cells can be re-assigned based on network stability, the re-distribution of subpopulations established by our initial unsupervised clustering 740 analysis (as outlined in Figure 2) are represented as pie charts within each circle indicating the percent of cells from each subpopulation contributing to that node. Nodes are demarcated as N1-N9. (G) Expression level within each node for known developmental lineage markers. Gene expression is represented across all nodes: NANOG (pluripotency), MESP1 (cardiac-mesoderm), TAL1 and CDH5 (endothelium) TNNI1, MYH7, NKX2-5, IRX4 (primarily first heart field cardiac genes), THY1, PITX2, HOXA2, and TBX18 745 (second heart field and outflow tract development). force with no change in beating rate in HOPX LOF tissues compared to control, n = 4-5 per group. Representative force traces are also shown. (M) Schematic lineage tree showing fate choices governed by HOPX during cardiac directed differentiation. Our data provide functional evidence that loss of HOPX is primarily identified within the endothelial fate at the progenitor stage of differentiation and loss of HOPX prevents expression of endothelial cell identity genes. At late stages of cardiac differentiation HOPX is 770 most closely related with progression to a definitive cardiomyocyte and loss of HOPX functionally prevents activation of key cardiomyocyte cell identity genes involved in differentiation. * P < 0.05. Data are presented as mean SEM.