The human FLT1 regulatory element directs vascular expression and modulates angiogenesis pathways in vitro and in vivo

There is growing evidence that mutations in non-coding cis-regulatory elements (CREs) disrupt proper development. However, little is known about human CREs that are crucial for cardiovascular development. To address this, we bioinformatically identified cardiovascular CREs based on the occupancy of the CRE by the homeodomain protein NKX2-5 and cardiac chromatin histone modifications. This search defined a highly conserved CRE within the FLT1 locus termed enFLT1. We show that the human enFLT1 is an enhancer capable of driving reporter transgene expression in vivo throughout the developing cardiovascular system of medaka. Deletion of the human enFLT1 enhancer (ΔenFLT1) triggered molecular perturbations in extracellular matrix organisation and blood vessel morphogenesis in vitro in endothelial cells derived from human embryonic stem cells and vascular defects in vivo in medaka. These findings highlight the crucial role of the human FLT1 enhancer and its function as a regulator and buffer of transcriptional regulation in cardiovascular development.

Introduction

Disruption of heart and major blood vessel formation during development results in congenital heart defects at birth and is a major factor underlying child mortality and morbidity (1). The [...]

There is now mounting evidence that disrupted cardiovascular regulatory elements can impair heart development, leading to disease (8-11). A prerequisite for functional studies to understand the effect of non-coding SNPs is the accurate identification of the regulatory regions in the human genome that are important for cardiac development and disease. Accessible datasets of the human genome, regulatory element-associated chromatin marks (12,13), transcription factor (TF) analyses (14) and chromatin capture experiments (15-17) provide valuable resources to define the human gene regulatory network in the heart. However, key challenges in identifying non-coding elements relevant for disease remain. The search space is still large; for example, the current registry of human cis-regulatory elements in the ENCODE data set comprises 926,535 entries (18). Furthermore, it is crucial to have functional validation methods to determine both the sufficiency and necessity of a given human regulatory element for normal development (19).

In this study, we developed a bioinformatic pipeline to identify cardiac enhancers that are involved in development and disease, with a particular focus on the highly conserved cardiac TF NKX2-5. NKX2-5 is essential for heart formation and homeostasis and is crucial for the development of heart muscle cells (14,20,21). Since mutations in NKX2-5 can lead to CHD (22), we reasoned that variants in NKX2-5 target enhancers may also impair cardiac development.

In order to identify human cis-regulatory elements that are involved in cardiovascular development, we developed a bioinformatic pipeline to filter for sequences that were directly bound by NKX2-5, a TF essential for heart development (Fig. 1a). We therefore made use of a previously generated dataset of NKX2-5 genomic targets identified in human pluripotent stem cell-derived cardiomyocytes by chromatin immunoprecipitation sequencing (ChIP-seq) (14). Across all ChIP-seq experiments, 20,879 regions were identified as directly bound by NKX2-5. Since heart development is a highly conserved process, regulatory elements (REs) deeply embedded in such essential processes are under positive selective pressure compared to non-functional non-coding sequences to maintain correct activity (12). We therefore filtered these regions for high sequence conservation and obtained 62 sequences that were ultra-conserved across 100 vertebrate species, from fish to human. Furthermore, in order to filter for sequences that were shown to be active REs, we used publicly available datasets of histone modification marks as a measure to obtain active cardiac enhancers (13). We identified 38 regions that showed the histone marks for active enhancers, H3K4 monomethylation (H3K4me1) and H3K27 acetylation (H3K27ac). We further filtered these regions for those that could be associated with genes known to play a role in heart development. We identified 7 CREs associated with highly relevant genes expressed in the heart and deeply embedded in the genetic networks controlling heart development (Table S1). To further understand the mechanism of regulatory elements involved in cardiovascular development and to illustrate the evidence supporting our hypothesis, we set out to investigate an enhancer element located in intron 10 of the FLT1 gene (Fig. 1b).

We set out to assess the in vivo function of the human enhancer sequence enFLT1, which has been validated to be bound by many TFs embedded in cardiovascular development (Fig. S1a).
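The successive filters described for the pipeline (NKX2-5 occupancy, conservation, active enhancer marks, association with a heart development gene) can be pictured as a simple cascade. The Python sketch below is illustrative only and is not the pipeline used in the study: the `Region` fields, the requirement that both H3K4me1 and H3K27ac be present, the use of a single associated gene per region, and all toy records are assumptions introduced here.

```python
from dataclasses import dataclass

# Marks treated as indicating an active enhancer (assumption: both are required).
ACTIVE_ENHANCER_MARKS = frozenset({"H3K4me1", "H3K27ac"})


@dataclass(frozen=True)
class Region:
    """Hypothetical record for one candidate region; all fields are illustrative."""
    name: str
    nkx2_5_bound: bool       # overlaps an NKX2-5 ChIP-seq peak
    conserved_species: int   # number of vertebrate species in which it is conserved
    marks: frozenset         # histone marks observed in cardiac tissue
    associated_gene: str     # gene the region is assigned to (assignment method assumed)


def filter_candidates(regions, heart_genes, min_species=100):
    """Apply the four filters in order and keep the survivors of every stage."""
    bound = [r for r in regions if r.nkx2_5_bound]
    conserved = [r for r in bound if r.conserved_species >= min_species]
    active = [r for r in conserved if ACTIVE_ENHANCER_MARKS <= r.marks]
    associated = [r for r in active if r.associated_gene in heart_genes]
    return {
        "bound": bound,
        "conserved": conserved,
        "active": active,
        "associated": associated,
    }


# Toy input: invented records, not data from the study.
both_marks = frozenset({"H3K4me1", "H3K27ac"})
toy_regions = [
    Region("toy_1", True, 100, both_marks, "FLT1"),
    Region("toy_2", True, 100, frozenset({"H3K4me1"}), "FLT1"),
    Region("toy_3", True, 40, both_marks, "GENE_X"),
    Region("toy_4", False, 100, both_marks, "FLT1"),
    Region("toy_5", True, 100, both_marks, "GENE_Y"),
]

stages = filter_candidates(toy_regions, heart_genes={"FLT1"})
for stage_name, survivors in stages.items():
    print(stage_name, len(survivors))
```

Keeping the survivors of each stage, rather than only the final list, makes it easy to report a count per step in the same way the text reports 20,879, 62, 38 and 7 regions.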

In order to determine whether the RE is able to drive GFP reporter gene expression in the [...] and endothelial tissues such as intersegmental vessels, dorsal aorta (Fig. 1c'), the outflow tract (Fig. 1d), blood vessels in the myocardium, the endocardium and the heart valves (Fig. 1d').

These data suggest that the FLT1 enhancer is not essential for endothelial tube formation in vitro and that other regulatory elements act to buffer FLT1 expression from the loss of enFLT1. Nevertheless, given the altered transcriptional profile observed in ∆enFLT1 endothelial cells, it remains possible that this enhancer is required for normal development.

Vessel formation (angiogenesis) assays were performed as previously published (34).

In brief, 40 µl of Geltrex™ was added to each well of a 96-well glass-bottom plate (Corning). The plates were kept on ice while pipetting, subsequently centrifuged and left to set in the incubator at 37 °C for 30 min. Endothelial cells were harvested and resuspended in complete [...]

Confocal images were acquired with a Yokogawa CellVoyager CV8000 high-throughput discovery system at 37 °C and 5% CO2. Maximum intensity projection (MIP) images were constructed from 15 µm z slices (300 µm total z distance), captured every 40 minutes for 48 hours. Images [...]

[...] x 150 bp read length. The fastq files were processed using the RNAsik pipeline (52). The STAR aligner (53) was used to align reads to the GRCh38 assembly. Aligned reads were assigned to features from the GRCh38 Ensembl annotation (54) using the featureCounts program from Rsubread (55). Degust (56) was used to perform and visualise differential expression analysis. Firstly, the first dimension of unwanted variation was removed from the counts using the RUVr routine from the RUVSeq R package (57). Next, TMM normalisation and the quasi-likelihood test were performed using edgeR's (58) standard workflow.
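As a rough illustration of what TMM (trimmed mean of M-values) normalisation aims for, the Python sketch below computes a simplified scaling factor between one sample and a reference library. It is not edgeR's implementation: it trims only on the log-ratios and omits the abundance trimming and precision weighting of the full method. The function name `tmm_like_factor`, the 30% trim fraction and the toy counts are assumptions made for illustration.

```python
import math


def tmm_like_factor(sample, reference, trim=0.3):
    """Simplified, unweighted trimmed mean of log2 ratios (M-values).

    Returns a factor by which the sample's library size can be multiplied
    to obtain an effective library size relative to the reference.
    """
    lib_sample = sum(sample)
    lib_reference = sum(reference)

    # M-value per gene: log2 ratio of library-size-scaled counts.
    m_values = []
    for s, r in zip(sample, reference):
        if s > 0 and r > 0:  # genes with a zero count cannot give a log-ratio
            m_values.append(math.log2((s / lib_sample) / (r / lib_reference)))
    if not m_values:
        raise ValueError("no genes with non-zero counts in both libraries")

    # Discard the most extreme ratios at both ends before averaging.
    m_values.sort()
    k = int(len(m_values) * trim)
    kept = m_values[k:len(m_values) - k] if len(m_values) > 2 * k else m_values
    return 2 ** (sum(kept) / len(kept))


# Toy counts (invented): nine unchanged genes and one strongly induced gene.
reference_counts = [100] * 10
sample_counts = [100] * 9 + [1000]

factor = tmm_like_factor(sample_counts, reference_counts)
effective_library_size = sum(sample_counts) * factor
print(factor, effective_library_size)
```

In this toy case the single induced gene inflates the raw library size of the sample, and the trimmed factor compensates so that the nine unchanged genes are compared on an equal footing.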

Salmon (59) was used to quantify transcript isoform abundance, and the differential abundance of FLT1 was tested and visualised using the DRIMSeq R package (60).
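The quantity compared in an isoform-level analysis of this kind is the share of a gene's total abundance contributed by each transcript. The Python sketch below only computes those shares and their per-condition means; it does not perform the statistical test, which in this study was done with DRIMSeq. The transcript labels and abundance values are invented placeholders, not Salmon output from the study.

```python
def isoform_proportions(tx_abundance):
    """Convert per-transcript abundances of one gene into usage proportions."""
    total = sum(tx_abundance.values())
    if total == 0:
        return {tx: 0.0 for tx in tx_abundance}
    return {tx: value / total for tx, value in tx_abundance.items()}


def mean_proportions(samples):
    """Average the usage proportions across replicate samples of one condition."""
    per_sample = [isoform_proportions(sample) for sample in samples]
    transcripts = per_sample[0].keys()
    return {
        tx: sum(props[tx] for props in per_sample) / len(per_sample)
        for tx in transcripts
    }


# Hypothetical abundances for two placeholder FLT1 transcripts.
control = [
    {"FLT1_tx_A": 80.0, "FLT1_tx_B": 20.0},
    {"FLT1_tx_A": 60.0, "FLT1_tx_B": 40.0},
]
deletion = [
    {"FLT1_tx_A": 50.0, "FLT1_tx_B": 50.0},
    {"FLT1_tx_A": 30.0, "FLT1_tx_B": 70.0},
]

control_usage = mean_proportions(control)
deletion_usage = mean_proportions(deletion)
for tx in control_usage:
    print(tx, control_usage[tx], deletion_usage[tx])
```

Working with proportions rather than raw abundances separates a change in isoform choice from a change in overall gene expression.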