Transcription factor-based transdifferentiation of human embryonic to trophoblast stem cells

During the first week of development, human embryos form a blastocyst comprised of an inner cell mass and trophectoderm (TE) cells, the latter of which are progenitors of placental trophoblast. Here we investigated the expression of transcripts in the human TE from early to late blastocyst stages. We identified enrichment of transcription factors GATA2, GATA3, TFAP2C and KLF5 and characterised their protein expression dynamics across TE development. By inducible overexpression and mRNA transfection we determined that these factors, together with MYC, are sufficient to establish induced trophoblast stem cells (iTSCs) from primed human embryonic stem cells. These iTSCs self-renew and recapitulate morphological characteristics, gene expression profiles, and directed differentiation potential similar to existing human TSCs. Systematic omission of each, or combinations of factors, revealed the critical importance of GATA2 and GATA3 for iTSC transdifferentiation. Altogether, these findings provide insights into the transcription factor network that may be operational in the human TE and broaden the methods for establishing cellular models of early human placental progenitor cells, which may be useful in the future to model placental-associated diseases.

Summary statement
Transcriptional analysis of human blastocysts reveals transcription factors sufficient to derive induced trophoblast stem cells from primed human embryonic stem cells.


Introduction
The correct functioning of the human placenta is crucial to ensure a healthy pregnancy outcome. However, despite this critical role, it is one of the least understood organs. The trophoblast cells of the placenta are derived from the trophectoderm (TE) cells of the human blastocyst. After fertilisation the zygote undergoes a series of cleavage cell divisions to form a tight ball of cells called a morula. The outer cells of the morula become polarised, which leads to inactivation of the Hippo signalling pathway and allows the translocation of YAP1 into the nucleus where, together with TEAD4, it drives transcriptional activation of the TE programme (Gerri et al., 2020). The human blastocyst implants into the uterine lining approximately 7-9 days after fertilisation (James et al., 2012). The TE generates the first trophoblast lineages: the mononuclear cytotrophoblast cells (CTB) and the multinucleated primitive syncytiotrophoblast (STB). The primitive STB is the initial invading interface and enzymatically degrades the uterine lining allowing for implantation of the conceptus (Hertig et al., 1956). CTB cells are the precursor for trophoblast cell types in the placenta. CTB cells sustain the syncytiotrophoblast across gestation by dividing asymmetrically, with one daughter cell fusing with the overlying syncytiotrophoblast and the other daughter remaining in the proliferative pool (Baczyk et al., 2006). The syncytiotrophoblast performs key functions including providing a physical and immunological barrier to pathogens, transporting nutrients and gaseous exchange, as well as producing and secreting hormones required to adapt the maternal physiology to the pregnancy (Bansal et al., 2012; Costa, 2016).
CTB cells are also the precursor of extravillous trophoblast cells (EVTs) which invade the decidua to interact with maternal immune cells and remodel spiral arteries to establish blood supply to the placenta (Pijnenborg et al., 1980).

Human trophoblast stem cells (hTSCs) derived from blastocysts and first trimester placental villous tissue provide a paradigm for elucidating molecular mechanisms regulating trophoblast development (Okae et al., 2018). hTSCs show expression of well-established markers of trophoblast including TP63 (Lee et al., 2007), GATA3 (Chiu and Chen, 2016) and TEAD4 (Soncin et al., 2018) and can be directed to differentiate into both EVT and STB, as identified by the expression of markers HLA-G (Apps et al., 2008) and SDC1 (Jokimaa et al., 1998) respectively. In addition, the establishment of a trophoblast organoid culture system has allowed hTSCs to be cultured in 3D, somewhat resembling the architecture of villous CTB cells and STB of the placenta (Haider et al., 2018; Turco et al., 2018). Most recently, induced trophoblast stem cells (iTSCs) which resemble primary tissue-derived hTSCs have also been captured in the process of reprogramming fibroblasts to human embryonic stem cells (hESCs) in naïve cell culture conditions (Castel et al., 2020; Cinkornpumin et al., 2020; Dong et al., 2020; Guo et al., 2021; Liu et al., 2020) and when established naïve ES cells are cultured in conditions which inhibit ERK and Nodal signalling (Guo et al., 2021). These in vitro models can be utilised in studies of trophoblast biology and have the potential to further inform our understanding of molecular mechanisms, transcription factors and signalling pathways regulating human trophoblast biology.

NANOG. In all, these analyses confirmed that we successfully reprogrammed iTSCs that resemble existing TSC lines based on the global transcriptome.

iTSCs function similarly to blastocyst- and placental-derived TSCs
We next examined whether iTSCs were able to undergo differentiation. We utilised a method to generate terminally differentiated syncytiotrophoblast as has been previously described (Okae et al., 2018). Accordingly, we observed formation of multinucleated syncytia which exhibited a reduction in filamentous actin (F-actin) expression, indicating reorganisation of the actin cytoskeleton (Fig. 5A). hCG is one of the first peptide hormones that is produced by the syncytiotrophoblast (Kliman et al., 1986). Spent culture media from the iTSC-derived syncytiotrophoblast cells was collected after 6 days of differentiation and tested with an over-the-counter pregnancy test kit, which showed detectable hCG, similar to existing hTSCs (Fig. 5B).

Narrowing down transcription factors essential to generate iTSCs
To determine which transcription factors are required for the induction of iTSCs, we examined the effect of withdrawing individual transcription factors from the pool of factors on the formation of hTSCs. MYC was kept constant, on the reasoning that, based on its expression pattern in the embryo and its reprogramming effect in other contexts, it generally enhances reprogramming rather than being essential for iTSC generation (Nakagawa et al., 2010). To allow us to directly compare the success and efficiency of the iTSC programme induction we maintained an equivalent total amount of mRNA in each cocktail by replacing the omitted factors with an equivalent amount of mRNA encoding GFP.
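
The omission design described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' protocol: the amount per factor (`units_per_factor`, in arbitrary units), the function names, and the choice to enumerate omissions of up to two factors are all assumptions made for the example. It only shows how substituting GFP mRNA for each omitted factor keeps the total mRNA per cocktail constant.

```python
from itertools import combinations

# Factors that may be withdrawn; MYC is kept in every cocktail (as in the text).
FACTORS = ["GATA2", "GATA3", "TFAP2C", "KLF5"]
CONSTANT = "MYC"
FILLER = "GFP"  # replaces omitted factors so the total mRNA stays equal


def build_cocktail(omitted, units_per_factor=1.0):
    """Return {mRNA name: amount}; amounts are hypothetical arbitrary units."""
    omitted = set(omitted)
    unknown = omitted - set(FACTORS)
    if unknown:
        raise ValueError(f"Cannot omit unknown factor(s): {sorted(unknown)}")
    cocktail = {CONSTANT: units_per_factor}
    for factor in FACTORS:
        if factor not in omitted:
            cocktail[factor] = units_per_factor
    if omitted:
        # One unit of GFP mRNA per omitted factor.
        cocktail[FILLER] = units_per_factor * len(omitted)
    return cocktail


def omission_series(max_omitted=2):
    """Yield (omitted factors, cocktail) for 0..max_omitted omissions (assumed range)."""
    for k in range(max_omitted + 1):
        for omitted in combinations(FACTORS, k):
            yield omitted, build_cocktail(omitted)


if __name__ == "__main__":
    for omitted, cocktail in omission_series():
        label = "none" if not omitted else "+".join(omitted)
        print(f"omitted: {label:15s} total: {sum(cocktail.values())} {cocktail}")
```

Every cocktail in the series sums to the same total, so differences between conditions reflect which factors are present rather than how much mRNA was delivered.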

After 10 days of lipofection, cells were analysed by immunofluorescence for the expression of KRT18 and TP63 (Fig. 6A; Fig. S1; Fig. S2) (Meistermann et al., 2021). Induction of the iTSC programme was first determined by the presence of cells co-expressing nuclear TP63 and filamentous KRT18 extending from the nucleus to the cell membrane, as we observed in cells treated with the 5-factor cocktail (Fig. 6B:A). Immunofluorescence analysis for the detection of the additional markers, TEAD4 and KRT7, further confirmed the identity of iTSCs (Fig. S4:A). We observed that only the combination of GATA2, GATA3 and MYC produced appreciable colonies of cells showing co-expression of TP63 and KRT18 (Fig. 6B:K), albeit at lower proportions compared to the 5-factor cocktail. This 3-factor cocktail also produced cells expressing nuclear TEAD4 and KRT7 (Fig. S3K; yellow arrowheads). By contrast, while the other conditions showed some upregulation of KRT7 and KRT18, the expression pattern of the latter was patchy and disorganised, and the cells did not co-express TP63 nor exhibit nuclear TEAD4 (Fig. 6B; Fig. S3). This suggests a functional requirement for GATA2 and GATA3 for the induction of iTSCs.

Discussion
In this study we investigate the sufficiency of four TE-associated transcription factors in reprogramming hESCs to iTSCs. In the mouse, CDX2, GATA3, TFAP2C and EOMES were identified as key regulators of the TE lineage (Kuckenberg et al., 2010; Ralston et al., 2010; Russ et al., 2000; Strumpf et al., 2005). These factors are highly enriched in mouse TE and mTSCs and have been shown to be capable of reprogramming fibroblasts towards mouse iTSC (Benchetrit et al., 2015; Kubaczka et al., 2015). In this study we sought to identify human-specific TE transcription factors capable of reprogramming alternative cell types to iTSCs.

We identified GATA2, GATA3, TFAP2C and KLF5 as factors that are enriched in the human TE. Together with MYC, these transcription factors were capable of reprogramming hESCs to iTSCs that were transcriptionally equivalent to existing hTSCs and can be directed to differentiate to alternative trophoblast populations. Future experiments to verify the in vivo bipotent differentiation potential of these hTSCs may include engraftment studies into mouse tissues and histological assessment to confirm their differentiation to both EVT and STB. In support of GATA3 as a candidate hTSC reprogramming factor is the recent work implicating GATA3 in the initiation of the TE lineage in human embryogenesis (Gerri et al., 2020). GATA2 is exclusively expressed in the TE of mouse, cow and human embryos, and regulates some trophoblast-related genes in mouse and cow (Bai et al., 2011; Gerri et al., 2020; Ma et al., 1997). The mouse homologue of TFAP2C (Tcfap2c) is required post-implantation with knockout mice dying at embryonic day 7.5 due to proliferation and differentiation defects in the TE compartment (Auman et al., 2002; Werling and Schorle, 2002). In mouse TSCs, TCFAP2C is detected at the promoters of key genes including Elf5, Gata3, Hand1, Id2 and Tead4 (Kidder and Palmer, 2010). By contrast, the role of TFAP2C in human trophoblast biology is not fully understood. It is known, however, that TFAP2C is a defining marker of CTBs across gestation (Lee et al., 2016) and it stimulates human placental lactogen and hCG production (Richardson et al., 2001). Functional analysis of KLF5 in human TE and trophoblast has not yet been performed, but in the mouse this transcription factor is required for the establishment of the inner cell mass and TE. Klf5-/- embryos arrest at the blastocyst stage due to defects in TE development resulting in a failure to hatch (Lin et al., 2010).
The expression patterns of these factors during human embryogenesis, together with their associated roles in TE and trophoblast biology, implicate GATA2, GATA3, TFAP2C and KLF5 as candidate regulators of a human TE programme.

repress target genes (Eloranta and Hurst, 2002; Pastor et al., 2018). Further molecular analysis of hTSCs may reveal the TE-specific repertoire of TFAP2C gene targets and provide further insight into the molecular mechanisms underpinning the TE transcriptional network.

Among the five transcription factors, we found that GATA2 and GATA3 are essential for upregulation of the hTSC programme. Surprisingly, and in contrast to the mouse, TFAP2C appears dispensable. While our reprogramming strategies worked, in the future, the efficiency of hTSC reprogramming may be further enhanced by either equivalently increasing the amount of each factor or by altering the stoichiometry of transcription factors, as has been demonstrated for induced pluripotent stem cell and cardiomyocyte reprogramming (Carey et al., 2011; Muraoka and Ieda, 2015). In addition, functional examination of these factors in knock-out hTSCs may further refine the combination of essential factors required for the maintenance of these cell types and these will be important studies in the future.

The blastocyst-derived hTSCs and trophoblast organoids is that it cannot be ascertained whether the starting population of cells would have given rise to a normal or a disease-affected placenta. Attempts to derive hTSCs and trophoblast organoids after 12 weeks' gestation, or at even later stages coincident with placental-related disease manifestation, have not been successful (Haider et al., 2018; Okae et al., 2018; Turco et al., 2018).
In addition, as we do not have an understanding of the genetic basis of placental-related diseases we cannot currently apply CRISPR/Cas9-mediated genome editing to create mutations for disease modelling in existing hTSCs. Instead, we propose that the mRNA reprogramming strategy presented here could be applied and further refined to generate iTSCs from patient fibroblasts, or fibroblasts or mesenchymal stromal cells isolated from disease-affected placentas (Pelekanos et al.,

Alternatively, if fibroblasts are not amenable to this conversion, reprogramming could be employed to generate patient-specific primed iPSCs as the starting population before applying our five-factor strategy. This may have an advantage over transiting through naïve iPSCs if it is found that epigenetic imprints are lost during iTSC generation depending on the starting cell type or if there are persistent issues of karyotypic instability in the starting cell type. Additionally, mRNA transfection has the benefit of being readily transferable to future clinical application as it avoids the random integration of transgenes that can result in genomic modification and tumorigenicity (Warren et al., 2010). In all, this strategy could allow for the generation of a catalogue of clinically normal and disease-associated hTSCs that would provide a tool for basic research into trophoblast biology as well as a powerful tool for elucidating placental defects including recurrent miscarriage, preeclampsia, intrauterine growth restriction and stillbirth as well as a future drug screening platform.

were fixed approximately 2 h after thawing to allow them to recover. To collect day 6 and 7 samples, day 5 or 6 embryos were cultured to the appropriate time before fixation.

Microdissection of TE from human blastocysts
Embryos were placed in drops of G-MOPS solution (Vitrolife; Cat. No. 10129) on a petri dish overlaid with mineral oil. The plate was placed on a microscope stage (Olympus IX70) and the embryos were held with an opposing holding pipette and blastomere biopsy pipette (Research Instruments) using micromanipulators (Narishige, Japan

RNA-seq analysis of TE samples
RNA-seq data from human TE was analysed as previously described (Blakeley et al., 2015; Hyslop et al., 2016). Briefly, the reference human genome sequence was obtained from Ensembl, along with the gene annotation (GTF) file. The reference sequence was indexed using the bowtie2-build command. Reads were aligned to the reference human genome sequence using Tophat2 (Kim et al., 2013), with gene annotations to obtain BAM files for each sample. BAM files were then sorted by read coordinates and converted into SAM files using SAMtools. The process of mapping and processing BAM files was automated using a custom Perl script. The number of reads mapping to each gene was counted using the program HTSeq-count (Version 0.6.1; Anders et al., 2015). The resulting count files for each sample were used as input for differential expression analysis using DESeq2 (Anders and Huber,

Immunostaining
Embryos were fixed with 4% paraformaldehyde in PBS for 1 h at 4°C. Immunofluorescence staining was performed as described previously (Fogarty et al., 2018). The primary antibodies used are listed in Table S5. Embryos were placed on µ-Slide 8 Well coverslip dishes (Ibidi; Cat. No. 80826) for confocal imaging. Imaging was performed using a Leica SP5 confocal microscope and 3 μm thick optical sections were collected. hESC, iTSC and hTSC lines were fixed with 4% PFA in PBS for 1 h at 4°C, then washed three times with PBS (Life Technologies; Cat. No. 14190-094). Blocking was achieved by incubation with PBS with 10% donkey serum

confluency. Cells were first transduced with the pLVX-tet3G lentivirus followed by selection with G418 (250 μg/ml) for one week. Resistant cells were selected for at least 2 passages and then transduced with the pLVX-TRE3G-GOI lentivirus pool, followed by selection with puromycin (0.5 μg/ml) for four days. For the induction of GOI expression, doxycycline was added to the mTeSR1 media at a concentration of 1 μg/ml. Clonal lines were generated and screened for transgene integration. Lines were cultured in the presence of doxycycline for 48 h before cDNA was harvested and RT-qPCR was performed to determine the levels of transgene induction.

Generation of templates for lentivirus production and in vitro transcription (IVT)
The vector design strategy was informed by previously published reports (Mandal and Rossi, 2013). Briefly, nucleotide sequences of the open reading frames for the canonical isoforms of reprogramming factors were identified from the Ensembl genome browser, and the sequences were verified against the corresponding amino acid sequences in Uniprot. Individual in vitro transcription (IVT) template constructs for GATA2, GATA3, TFAP2C and KLF5 consisting of a T7 promoter-5ʹUTR-Open Reading Frame-3ʹUTR-T7 terminator cassette cloned into a pUC57 backbone were custom made (Genewiz UK Ltd). 5ʹUTR and 3ʹUTRs were added to maximise stability of mRNA transcripts and to increase protein translation. The ORFs of MYC and GFP were templated from plasmids bearing human MYC and GFP and ligated into the pUC57 backbone flanked by the 5ʹUTR and 3ʹUTR sequences. Annotated sequence files of all constructs are provided in Table S6.
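
The verification of an ORF against its protein sequence, as described above, can be illustrated with a short Python sketch. It assumes the standard genetic code and uses a toy sequence rather than any construct from this study; the function names `translate` and `orf_matches_protein` are hypothetical and are not taken from the original methods.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an ORF (DNA or RNA) up to, but not including, the first stop codon.

    Raises ValueError if the length is not a multiple of three and KeyError
    if a codon contains a character other than A, C, G, T or U.
    """
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    protein = []
    for i in range(0, len(orf), 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


def orf_matches_protein(orf, protein):
    """True if the translated ORF equals the reference protein sequence."""
    return translate(orf) == protein.strip().upper()


if __name__ == "__main__":
    toy_orf = "ATGGCCTAA"   # toy example: Met-Ala-stop
    toy_protein = "MA"      # stands in for a reference protein sequence
    print(translate(toy_orf), orf_matches_protein(toy_orf, toy_protein))
```

Because translation stops at the first stop codon, an unintended internal stop shortens the translated product and the comparison fails, which is the kind of error such a check is meant to catch.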

Generation of modified mRNAs by in vitro synthesis
The mRNA synthesis protocol has been described previously (Mandal and Rossi, 2013). Briefly, dsDNA templates were linearised from cDNA clones in pLVX vectors for GATA2, GATA3, TFAP2C, KLF5 and MYC. A small amount of digestion mix was run on a gel to confirm complete digestion. Linearised plasmid was purified using a PCR purification kit (Qiagen; Cat. No. 28104). A Nanodrop was used to confirm the purity of the eluted product according to the 260/280 ratios. Poly(A) tail was added using KAPA PCR ready mix (2X), Xu-F1 and Xu-T120 primers (Integrated DNA Technologies) and digested plasmid adjusted to 10 ng/μl. Tail PCR was run for 32 cycles and purified using a PCR purification kit. In vitro transcription was performed using

Data from this study
Sequencing data from sample replicates comprising iTSCs (n = 4), H9 ES cells (n = 3) and primary CT27 cells (n = 3) were first checked using the FastQC package (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Adapter removal was performed using Trimgalore v0.6.6 (https://github.com/FelixKrueger/TrimGalore), and the trimmed read data were re-checked for conformity and quality with FastQC.

We are very grateful to the donors of human embryos whose contributions enable this research. We thank all members of the Niakan lab for help and comments on the paper. We