HMGXB4 Targets Sleeping Beauty Transposition to Vertebrate Germinal Stem Cells

Transposons are parasitic genetic elements that frequently hijack key cellular processes of the host. HMGXB4 is a Wnt signalling-associated HMG-box protein, previously identified as a host factor that transcriptionally regulates Sleeping Beauty (SB) transposition. Here, we establish that HMGXB4 is highly expressed from the zygote stage, and declines after transcriptional genome activation. Nevertheless, HMGXB4 is activated by its own promoter at the 4-cell stage, responding to the parental-to-zygotic transition, marks stemness, and maintains its expression during germ cell specification. The HMGXB4 promoter is located at an active chromatin domain boundary. As a vertebrate-specific modulator of SETD1A and NuRF complexes, HMGXB4 links histone H3K4 methyltransferase- and ATP-dependent nucleosome remodelling activities. The expression of HMGXB4 is regulated by the KRAB-ZNF/TRIM28 epigenetic repression machinery. A post-translational modification by SUMOylation diminishes its transcriptional activator function and regulates its nucleolar trafficking. Collectively, HMGXB4 positions SB transposition into an elaborate stem cell-specific transcriptional regulatory mechanism that is active during early embryogenesis and germline development, thereby potentiating heritable transposon insertions in the germline.


Introduction
HMGXB4 (previously known as HMG2L1) was shown to inhibit Wnt signalling 1 and smooth muscle differentiation 2 . Nevertheless, HMGXB4 is not commonly recognized as relevant for development.
HMGXB4 was also detected as the most abundant protein in the interactome of the ATP-dependent nucleosome remodelling NuRF (nucleosome remodelling factor) complex 3 , still its role in chromatin remodelling is not characterized. The multi-subunit NuRF complex relaxes condensed chromatin to promote DNA accessibility and transcriptional activation of targeted genes [4][5][6] . NuRF is a phylogenetically conserved chromatin remodelling complex, originally identified in Drosophila 7 . The human core complex of NuRF has similar properties to its Drosophila counterpart, and shares the orthologs of three of four components, BPTF (Bromodomain PHD finger transcription factor), SNF2L/SMARCA1 (SWI/SNF Related, Matrix Associated, Actin Dependent Member 1) and the WD repeat containing protein RBAP46/48. The core BPTF contains a PHD finger and a bromodomain that bind to trimethylated histone H3 lysine 4 (H3K4me3) and acetylated histones, respectively 8 .
H3K4me3 is highly enriched at transcription start sites (TSSs) of active genes and controls gene transcription 9,10 . In mammals, SETD1A histone methyltransferase complexes specifically methylate H3K4 11 . SETD1A and NuRF complexes can functionally collaborate to regulate promoter chromatin dynamics (e.g. during erythroid lineage differentiation 12 ). Promoters located at the borders of topologically associating domains (TADs) occupy key genomic positions, offering multiple looping possibilities with neighbouring transcriptional units.
HMGXB4 is a transcriptional activator of the Sleeping Beauty transposase 13 . Transposons or transposable elements (TEs) are discrete segments of DNA that have the distinctive ability to move and replicate within genomes across the tree of life. TEs are capable of invading naïve genomes by horizontal transfer (e.g. 14 ). The invasion could be successful if the host-encoded factors, required for transposition are phylogenetically conserved, and are readily available in the naïve organism. The general assumption is that TEs (and viruses) piggyback essential host encoded factors to assist their life cycle. HMGXB4 is such a candidate.
Sleeping Beauty (SB) was resurrected from inactive transposon copies from various fish genomes 15 . SB transposes via a DNA-based "cut and paste" mechanism, and utilizes several conserved host-encoded factors 13,[16][17][18][19][20] . These host-encoded factors regulate transposition throughout the transposition reaction 20 . The SB transposon consists of a single gene encoding the transposase, flanked by two terminal inverted repeats (IRs), which carry recognition motifs for the transposase. The 5'-UTR region of SB can function as a promoter of the transposase 13 , and HMGXB4 enhances transposase expression by interacting with sequences located in the 5'-UTR region of the transposon 13 . SB, by contrast to the Drosophila P element, that is controlled by a germline specific splicing process 21 , is not restricted to the germline, and is able to transpose in a wide variety of cells, including both somatic and germinal origin [22][23][24] .
Nevertheless, since somatic transposition is not heritable, transposons must be targeted to the germ cells to transmit their genome to the next generation. To achieve heritable mobilization, certain TEs transpose in undifferentiated germ cells (primordial germ cells) during embryonic and larval stages and in germline stem cells at later developmental stages. Although the host-encoded targeting factor is unknown, this strategy is used by the P element in Drosophila 25 . In contrast, retrotransposons barely mobilize directly in germline stem cells 26 . How SB is targeted to the germline is currently unknown.
Following the premise that HMGXB4 is involved in key biological processes, attractive to be captured by a transposon, we used SB transposition as a model to characterize these functions in a developmental context. Our study revealed that HMGXB4 is a vertebrate-specific member of the NuRF complex, and is indeed an essential, but rather overlooked developmental factor, connecting somatic and germinal stemness in early embryogenesis. HMGXB4 targets SB expression to germinal stem cells, where in conjunction with the NuRF complex and SETD1A remodels the chromatin, enhances the expression of the transposase and processes the de novo transcripts.

Results
HMGXB4 is among the earliest genes expressed in development
As Wnt signalling, which is associated with HMGXB4 1,2 , is one of the earliest cellular processes activated in the developing embryo, we determined the expression profile of HMGXB4 during early development. Our analysis of single cell (sc) transcriptome datasets (scRNA-seq) of mouse and human pre-implantation embryos 27,28 revealed that HMGXB4 is among the ~300-400 genes detected at significant level (Log2 FPKM > 2) in every cell (Fig. 1a, Supplementary Fig. 1a-b). Furthermore, HMGXB4 is highly expressed prior to embryonic gene activation (EGA), followed by a reduced (but still significant) expression in the preimplantation embryo (Fig. 1a).
In addition to mammalian embryos, we monitored the expression of zHMGXB4 during zebrafish development from the zygote stage to hatching. In agreement with the mammalian data, our qPCR data show that zHMGXB4 is already highly expressed in the zebrafish zygote, and that its expression level drops during the maternal-to-zygotic transition (Fig. 1b). Following its sharp decline after the blastula stage, zHMGXB4 expression is detectable again at the pharyngula stage (Fig. 1b). Thus, besides the first steps of embryogenesis, and in agreement with its proposed association with Wnt signalling 1,2 , HMGXB4 is expected to act throughout embryonic development.
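The "detected in every cell" criterion quoted above (Log2 FPKM > 2) can be illustrated with a minimal sketch. The function name, the dictionary layout and the toy values below are hypothetical and are not taken from the study; only the threshold comes from the text.

```python
from typing import Dict, List

LOG2_FPKM_THRESHOLD = 2.0  # cutoff quoted in the text (Log2 FPKM > 2)


def ubiquitously_expressed(log2_fpkm: Dict[str, List[float]],
                           threshold: float = LOG2_FPKM_THRESHOLD) -> List[str]:
    """Return the genes whose log2 FPKM exceeds the threshold in every cell."""
    return sorted(
        gene for gene, values in log2_fpkm.items()
        # A gene with no measurements is never reported as ubiquitous.
        if values and all(v > threshold for v in values)
    )


# Toy, made-up values for three cells (illustration only, not study data).
toy_matrix = {
    "HMGXB4": [5.1, 4.3, 3.8],
    "GENE_A": [6.0, 1.2, 4.4],  # below the threshold in the second cell
    "GENE_B": [2.0, 3.0, 3.5],  # 2.0 is not strictly greater than 2
}

print(ubiquitously_expressed(toy_matrix))
```

With a real dataset the same test would be applied per gene across all single-cell profiles; a strict inequality is used here because the text states "> 2".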

HMGXB4 is part of a regulatory network of stemness
To gain insight into the transcriptional regulation of HMGXB4, we performed an integrative analysis of RNA-seq (N ~300), Hi-C, ChIP-seq/ChIP-exo/CUT&RUN of transcription factors (TFs) and histone modification data over the HMGXB4 locus in HeLa cells, H1 embryonic stem cells (H1_ESCs) and human early embryogenesis [28][29][30][31] . As KRAB-ZNF transcriptional regulators are known to recruit the TRIM28/KAP1-mediated transcriptional repression machinery to specific gene targets in early vertebrate development 32 , we mapped ChIP-exo seq peaks of 230 KRAB-ZNF proteins 33 around the genomic locus of HMGXB4. We also determined the expression dynamics of the potential regulators during early embryogenesis. Our approach uncovered that HMGXB4 expression is activated by both MAPK1 (alias ERK2) and ELK1 transcription factors (Fig. 1c), and identified repressive KRAB-ZNF proteins (e.g. ZNF468, ZNF763 and ZNF846) harbouring significant peaks (adjusted p-value < 1e-7) at the transcription start site (TSS) of HMGXB4 (Supplementary Fig. 1d-e). Notably, while the expression of HMGXB4 matches the dynamics of MAPK1/ERK2 throughout human preimplantation embryogenesis, it is antagonistic to that of the ZNF468 repressor (Supplementary Fig. 1c).
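The peak-selection step described above (KRAB-ZNF peaks at the TSS with adjusted p-value < 1e-7) can be sketched as follows. This is only an illustration under stated assumptions: the `Peak` record, the ±1 kb window, the placeholder chromosome name and all coordinates are hypothetical; only the significance cutoff is taken from the text.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    factor: str   # name of the ChIP-exo target (hypothetical labels below)
    chrom: str
    start: int
    end: int
    padj: float   # adjusted p-value of the peak


def significant_tss_peaks(peaks: List[Peak], chrom: str, tss: int,
                          window: int = 1000,        # assumed window size
                          max_padj: float = 1e-7) -> List[str]:
    """Return factors with a significant peak overlapping tss +/- window."""
    lo, hi = tss - window, tss + window
    return sorted({
        p.factor for p in peaks
        if p.chrom == chrom
        and p.start <= hi and p.end >= lo   # interval overlap test
        and p.padj < max_padj
    })


# Toy peaks around a made-up TSS at position 10,000 on a placeholder chromosome.
toy_peaks = [
    Peak("ZNF_A", "chrN", 9_800, 10_150, 1e-12),   # overlaps and significant
    Peak("ZNF_B", "chrN", 9_900, 10_050, 1e-5),    # overlaps, not significant
    Peak("ZNF_C", "chrN", 50_000, 50_300, 1e-20),  # significant, too far away
    Peak("ZNF_D", "chrM", 9_900, 10_100, 1e-9),    # different chromosome
]

print(significant_tss_peaks(toy_peaks, "chrN", 10_000))
```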
Collectively, our analyses suggest that HMGXB4 is part of a regulatory network of stemness, implicated in coordinating pluripotency and self-renewal pathways 31 , and is expected to be epigenetically controlled by repressive histone marks.
The analysis of the 3-dimensional (3D) conformation of human ESC genomes revealed that the promoter of HMGXB4 is marked by CTCF (Fig. 1d), and that the TSS of HMGXB4 is co-occupied by ChIP-seq peaks for H3K27ac, MED1 (Mediator 1), POU5F1/OCT4 and POLII, connecting gene expression and chromatin architecture 34 . Adding further layers of CUT&RUN data for H3K4me3 and H3K27me3 revealed that the TSS of HMGXB4 is enriched for H3K4me3 but not for H3K27me3 in human 4-cell, 8-cell and ICM (inner cell mass) stages, indicating that the promoter of HMGXB4 is active at all stages of pre-implantation development. In addition, the promoter is active in germinal vesicle (GV) stage oocytes, suggesting that the expression of HMGXB4 might link somatic and germinal stem cells. The 3D and ChIP analyses suggest that the promoter of HMGXB4 forms a boundary at active compartments (Fig. 1d). This key genomic boundary position might enable multiple interactions between enhancers and promoters in (both somatic and germinal) stem cells.

HMGXB4 links pluripotent and germinal stem cells
Detecting a high expression signal in oocytes led us to analyse single cell transcriptome datasets of germ cells at several developmental time points (GSE86146) as well as datasets of sex-specific germ cells (GSE63818) 35,36 . We readily observed elevated HMGXB4 expression in both female and male germ cells, compared with somatic cells in the same niche (Fig. 1e and Supplementary Fig. 1f-g). In addition, we analysed scRNA-seq data of germ cells upon differentiation from pluripotent stem cells in vitro (GSE102943) 37 . This analysis revealed that HMGXB4 was expressed at comparable levels in both pluripotent and germinal stem cells (CD38+), and that the expression of HMGXB4 was maintained during the pluripotent to germ cell transition (Supplementary Fig. 1f). Its expression, by contrast, declined in differentiated germ cells (CD38-) (Supplementary Fig. 1f), suggesting that HMGXB4 expression is specific to stem cells. This pattern of HMGXB4 expression is supported by the analysis of a large cohort of scRNA-seq datasets of gonad development (~3000 single cells, GSE86146) 35 (Fig. 1e), identifying HMGXB4 as a novel factor specific for pluripotent and germinal stem cells.
To substantiate the differential expression of HMGXB4 between stem versus differentiated germ cells, we used a mammalian (rat) spermatogonial stem cell (SSC) differentiation model. These SSCs maintain their stemness on mouse embryonic fibroblast (MEF) feeders, and differentiate when MEFs are replaced by STO (SNL 76/7) cells 38 . In conjunction with the single cell transcriptome analyses, this approach supported the specific expression of HMGXB4 in spermatogonial stem cells, whereas its expression levels (both transcript and protein) sharply dropped upon differentiation ( Fig. 1f-g), indicating that the expression of HMGXB4 is tightly regulated between self-renewing and differentiated states.
HMGXB4 activates Sleeping Beauty transposition in the germline
HMGXB4 has been identified as a host-encoded factor of SB transposition, serving as a transcriptional activator of transposase expression 13 . The specific expression of HMGXB4 in the germline prompted us to ask whether HMGXB4 is a host factor that potentiates SB transposition in the germline.
To answer this question, we established a quantitative SB transposon excision assay in SSCs, cultured on MEF or on STO cells (Fig. 2a-b). In the assay, SB transposase expression is driven by the transposon's 5'-UTR containing the sequences at which HMGXB4 transactivates SB transcription. Our assay revealed that the frequency of SB excision was high in SSCs kept on MEFs, but sharply declined upon culturing on STO cells, which triggers differentiation (Fig. 2b). Thus, the rate of transposon excision matches the expression level of HMGXB4, suggesting that HMGXB4 is likely a host-encoded factor associated with activating SB transposition in germinal stem cells. Notably, SB excision, at a decreased level, still occurs in differentiated cells (Fig. 2b), agreeing with the assumption that the requirement of HMGXB4 for transposition is not absolute 13 .

The transcriptional activation function of HMGXB4 is conserved in vertebrates
A recurring question of transposon-host interaction studies concerns their cross-species conservation.
While the HMGXB4 gene exists in all vertebrate species, the coding sequence of the fish version is significantly divergent (35%) from its human counterpart (Supplementary Fig. 2a-b). As SB originated from fish genomes 15 , we asked if the transcriptional enhancer effect 13 of the human (h)HMGXB4 on SB transposition was reproducible in (zebra)fish embryos. In a reporter assay, luciferase expression was controlled by the 5'-UTR region of the transposon or by a mutated version, where the HMGXB4-responding region was deleted (pLIRΔHRR) 13 . Luciferase activity driven by the 5'-UTR was detectable in zebrafish (Danio rerio) embryo extract and depended on the presence of the HMGXB4-responding region (Fig. 2c). In addition, we transiently overexpressed zebrafish (z)HMGXB4 or (h)HMGXB4 by co-injecting the corresponding expression constructs with the reporter into zebrafish embryos. The presence of zHMGXB4 elevated the transcription of the 5'-UTR-luciferase reporter three-fold above the level obtained using (h)HMGXB4 in a similar assay (Fig. 2d).
The activator effect of zHMGXB4 was even higher compared to the human ortholog when tested in a colony forming transposition assay performed in HeLa cells (Fig. 2e). Collectively, while SB transposition responded more robustly to zHMGXB4 than to hHMGXB4, the pattern of the effect was similar, confirming our hypothesis that HMGXB4 is a conserved host factor of SB transposition in vertebrates 20 . Thus, the transcriptional activation function of HMGXB4 can be modelled by SB transposition from fish to human cells.

To find out if HMGXB4 is covalently modified by SUMO1 39 , a tagged HMGXB4-HA (of either human or zebrafish origin) was co-expressed with SUMO1, and the protein extracts were analysed by Western blotting (Fig. 3a and Supplementary Fig. 3b). This approach detected a slower migrating band in the presence of SUMO1. Re-probing validated that the shifted band represents the SUMOylated protein product (Fig. 3a), suggesting that SUMO1 specifically modifies HMGXB4 via covalent bond formation to diglycine. To find out whether only SUMO1 or also other members of the SUMO family, such as SUMO2 and SUMO3, might modify HMGXB4, expression constructs of SUMO1, SUMO2 and SUMO3 were co-transfected into HeLa cells, and protein extracts were analysed by Western blotting (Supplementary Fig. 3c). This approach detected slower migrating bands in the presence of all of the tested versions of SUMO, though the most intense signal appeared (as two shifted bands) in the presence of SUMO1 (Supplementary Fig. 3c). Thus, while all three SUMO versions could modify HMGXB4, SUMO1 is the most potent modifier.
To map the SUMOylated lysine (K) residues of HMGXB4, we selected those that were phylogenetically conserved among vertebrate orthologs of HMGXB4 (Supplementary Fig. 3d), and converted them to arginine (R) by site-specific mutagenesis (Supplementary Fig. 3e). While most of the K to R substitution mutants only partially affected SUMOylation, the combination of the K317R and K320R mutations abolished the two SUMOylated bands (Fig. 3b and Supplementary Fig. 3e). We used this version, called HMGXB4 SUMO-, for further experiments.
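The design of K to R substitution mutants can be illustrated in silico with a minimal sketch. The function name and the toy peptide below are hypothetical; the real HMGXB4 sequence and the positions 317 and 320 are not reproduced here, so the example uses made-up positions on a made-up sequence.

```python
from typing import List


def substitute_lysines(protein: str, positions: List[int]) -> str:
    """Replace lysine (K) with arginine (R) at the given 1-based positions."""
    residues = list(protein)
    for pos in positions:
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != "K":
            # Guard against off-by-one errors in residue numbering.
            raise ValueError(f"residue {pos} is {residues[pos - 1]}, not K")
        residues[pos - 1] = "R"
    return "".join(residues)


# Toy peptide (not HMGXB4): lysines at positions 3 and 6.
toy_peptide = "MAKLEKTG"
print(substitute_lysines(toy_peptide, [3, 6]))
```

Checking that each targeted residue is actually a lysine is a cheap safeguard when positions are copied from an alignment with a different numbering.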
SUMOylation of HMGXB4 is reversible via SENP-mediated deconjugation, is stress sensitive and does not depend on PIAS1
SUMOylation is a highly dynamic, reversible process that enables transient responses and is controlled by conjugating and de-conjugating enzymes. The SUMO moiety can be removed by the SENP [(SUMO1/sentrin/SMT3)-specific peptidase] family of SUMO-specific proteases, SENP(1-3), and recycled in a new SUMOylation cycle (reviewed in 40 ). To decipher which SENPs de-conjugate SUMO from HMGXB4, we tested SENP1, SENP2 and SENP3 with SUMO1, SUMO2 and HMGXB4-HA in in vitro SUMOylation assays. As expected 41,42 , both SENP1 and SENP2 reduced SUMO1 conjugation, whereas SENP3 deconjugated SUMO2 (Fig. 4a-b). Notably, SUMO1 modification of HMGXB4 is sensitive to the presence of the chemical stress factors ethanol and H2O2, suggesting a potential additional layer of regulation by stress (Fig. 4c). Under these conditions, SB transposition shows a slight (~120 %), but reproducible elevation (not shown).
In principle, SUMOylation can also be facilitated by E3 ligases, such as PIAS proteins 43 . As PIAS1 was identified as an interaction partner of HMGXB4 in our Y2H assay (confirmed also by co-IP; Supplementary Fig. 3a), we tested various members of the PIAS family, as well as their mutant versions incapable of SUMO E3 ligase activity (e.g. PIAS1 (C350A), PIASxα (C362A) and PIASxβ (C362S)) 44 in a SUMOylation assay (Fig. 4d). Our results argue against a role of PIAS1 as an E3 ligase for HMGXB4. E3 SUMO ligase-independent activities of the HMGXB4-recruited PIAS1 might involve transcriptional coregulation 45 (not followed up in the current study).
Non-SUMOylated HMGXB4 links histone H3K4 methyltransferase- and ATP-dependent nucleosome remodelling activities
SUMOylation might affect several aspects of the target protein, including structure, interaction partners, cellular localization, enzymatic activity or stability (reviewed in 40 ). HMGXB4 has an estimated half-life of ~30 hours (Protparam/Expasy). Notably, both HMGXB4 wt and HMGXB4 SUMO- were detectable at similar levels at different timepoints following a cycloheximide treatment (Supplementary Fig. 4a-b), indicating that SUMOylation had no effect on the stability of the HMGXB4 protein. As an alternative to a hypothesis-driven strategy, we performed an unbiased high throughput protein interactome analysis to decipher the effect of SUMOylation on HMGXB4 function. We used a triple SILAC pull-down approach, suitable for relative quantification of proteins by mass spectrometry 46 . We also included the SB transposase in the assay in order to find out which functions of HMGXB4 are affected by the presence of the transposase. Thus, in the experimental setup, we transfected HEK-293T cells with HA-tagged HMGXB4 wt and HMGXB4 SUMO- in the presence/absence of SB transposase and/or SUMO1 (Supplementary Fig. 4c). While the interactomes of HMGXB4 wt and HMGXB4 SUMO- were highly related, SUMOylation specifically affected the affinity of HMGXB4 to its interaction partners in two groups of proteins.
Firstly, in the HMGXB4 SUMO- interactome, we detected BAP18 (BPTF associated protein of 18 kDa, alias C17orf49) (Fig. 3c and Supplementary Fig. 5e). BAP18, in association with the ATP-dependent NuRF active (H3K4me3) chromatin reader complex 3 , has been previously reported in androgen receptor induced transactivation 47 . In addition, among the most differentially recruited proteins of HMGXB4 SUMO-, we also identified SETD1A (Fig. 3c-d and Supplementary Fig. 5f), a histone-lysine N-methyltransferase generating mono-, di- and trimethylation at H3K4 (H3K4me1-3) at transcriptional start sites of target genes 48 , indicating that HMGXB4 links members of a multiprotein complex that participates in both depositing and reading active chromatin marks at H3K4 (Fig. 3e).
Notably, in comparison to HMGXB4 SUMO-, the wildtype HMGXB4 has a lower affinity for both SETD1A and C17orf49/BAP18 (Fig. 3c-d), suggesting that the activation of target gene expression by HMGXB4 is controlled by SUMOylation. Thus, non-SUMOylated HMGXB4 links histone H3K4 methyltransferase- and ATP-dependent nucleosome remodelling activities.
To confirm the effect of SUMOylation on the transcriptional activator function of HMGXB4, using the SB model, we performed both transcription and transposition assays in the presence of either

SUMOylation induces nucleolar compartmentalization of HMGXB4
The second notable group of the differential proteome of HMGXB4 WT /HMGXB4 SUMO- was associated with nucleolar functions (> 25%) (Fig. 5a), suggesting that SUMOylation might affect nucleolar compartmentalization and activities. The nucleolar interactors had higher affinity to HMGXB4 WT , and were associated with translation initiation and elongation, transcriptional control (Fig. 5b), nonsense-mediated decay and the ribonucleoprotein complex (ribosomal structure) (Supplementary Fig. 5a-b). The presence of the SB transposase intensified the affinity of interaction in all of these GO categories (Fig. 3d and Supplementary Fig. 5d). Thus, via HMGXB4, the transposase might sponge on the transcription activation, nonsense-mediated decay, transcript processing and protein translation machineries of the host cell.
To validate nucleolar localization and its regulation by SUMOylation, we used confocal microscopy to monitor subcellular trafficking of HMGXB4 upon SUMOylation. We co-transfected expression vectors of HA-tagged HMGXB4, HMGXB4 SUMO-, EGFP-tagged SUMO1 (EGFP-SUMO1) and SB (EGFP-SB) into HeLa cells in various combinations, and subjected the cells to microscopy. This strategy revealed an antagonistic subcellular localization pattern of HMGXB4 and HMGXB4 SUMO- (Fig. 5c and Supplementary Fig. 6), supporting the prediction of our differential interactome data.
While HMGXB4 SUMO- stayed in the nucleoplasm, HMGXB4 co-localized with the nucleolar marker fibrillarin, confirming that SUMOylation regulates the subnuclear trafficking of HMGXB4 (Fig. 5c and Supplementary Fig. 6), and thus its physical sequestration from the NuRF/SETD1A complex. It is notable that, while the endogenous SUMO1 level is capable of supporting the nucleolar trafficking of HMGXB4, HMGXB4 SUMO- in the presence of SB transposase partially disrupts the integrity of the nucleolus, which mobilizes the fibrillarin marker all over the cytoplasm (Fig. 5c).
SB transposase co-localized either with HMGXB4 SUMO- in the nucleoplasm or with HMGXB4 in the nucleolus (Fig. 4c and Supplementary Fig. 6), suggesting that SB transposition piggybacks both nuclear and nucleolar functions of HMGXB4, involved in transcription initiation and transcript processing, respectively. Curiously, unlike HMGXB4, the SB transposase is enriched in the perinuclear nuage of cells (Fig. 4c), where the machinery of piRNA biogenesis is concentrated 49 .

Discussion
Viruses and transposons frequently piggyback 'essential' cellular mechanism(s) of the host. The role of HMGXB4 in Sleeping Beauty (SB) transposition is conserved from fish to human, supporting the assumption that HMGXB4-SB transposon interaction can be generally modelled in vertebrates 20 .
Here, we used the HMGXB4-SB host-parasite interaction model to decipher certain cellular function(s) of the transposon-targeted, but otherwise poorly characterized developmental gene, HMGXB4. Our study identifies HMGXB4 as a novel factor linking pluripotency to the germline, and the host encoded factor that shepherds SB transposition to germinal stem cells.
HMGXB4 is among the first expressed genes in the embryo, and in agreement with its regulatory role in Wnt signalling 1,2 , its expression level is dynamically changing throughout embryogenesis. Following maternal expression, HMGXB4 is activated by its own promoter at 4-cell stage, responding to the parental-to-zygotic transition. HMGXB4 marks stemness, and maintains its expression during germ cell specification. The promoter of HMGXB4 is located at an active chromatin domain boundary in stem cells, potentially offering multiple looping possibilities with neighbouring genomic regions. Thus, beside the germline, the recruitment of HMGXB4 supports efficient SB transposition during early embryogenesis in various somatic progenitor cells, suggesting that HMGXB4 is primarily recruited as a spatio-temporal transcriptional activator of the transposase in stem and progenitor cells.
HMGXB4 provides a physical bridge between BAP18 and SETD1A, thereby linking histone H3K4 methyltransferase- and ATP-dependent nucleosome remodelling activities. Notably, HMGXB4 is not conserved outside vertebrates, thus it provides a vertebrate-specific function(s) to the core NuRF complex, first identified in Drosophila 7 .
Via HMGXB4, SB piggybacks a multiprotein complex, capable of both depositing and reading active chromatin marks at H3K4. Furthermore, via ERK2/MAPK1-ELK1, MED1, CTCF and POU5F1, HMGXB4 is part of the transcription regulatory network, implicated in pluripotency and self-renewal 31 .
HMGXB4 is regulated by a reversible post-translational modification, SUMOylation. While SUMOylation does not affect the stability of the HMGXB4 protein, it regulates its binding affinity to its protein interacting partners. The non-SUMOylated HMGXB4 recruits the SETD1A/NuRF complex, and acts as a transcriptional activator, whereas SUMOylation serves as a signal for its nucleolar partition. The recruitment of HMGXB4 from the nucleoplasm to the nucleolus provides a flexible regulation of the transcription activating epigenetic machinery by affecting the stoichiometry of the HMGXB4-containing protein complexes.
The SB transposase follows its host factor during its subnuclear trafficking to the nucleolus, and thus piggybacks both SUMOylated and non-SUMOylated functions of HMGXB4. In addition to its well-characterized role in ribosome biogenesis 50 , the nucleolus is involved in several other crucial functions, including maturation and assembly of ribonucleoprotein complexes, cell cycle regulation and cellular aging. These nucleolar functions are frequently targeted by several viruses to support their own replication (reviewed in 51 ). For example, regulatory viral proteins, such as the accessory protein 3b from SARS-CoV, which affects cell division and apoptosis, predominantly localize in the nucleolus 52,53 .
Interestingly, the SB transposase, but not HMGXB4, is enriched in the perinuclear nuage-like structure, associated with piRNAs, known to repress transposable elements via RNAi (reviewed in 49 ).
Curiously, SB is not endogenous in human, thus there are no SB-specific piRNAs present in human cells, suggesting that SB might be capable of recognizing an evolutionarily conserved feature of Piwi-interacting small RNA (piRNA) biogenesis.
The HMGXB4-mediated germline targeting is a likely conserved feature of the Tc1-like family of transposons (where SB belongs) in vertebrates. In addition to the Tc1-like elements, the Drosophila P element utilizes a similar strategy to target germinal stem cells 25 . While it has been shown that the targeting process in Drosophila is controlled by the piRNA pathway 25 , the host encoded targeting factor of P element transposition is yet to be identified, and could not be identical to the vertebrate specific HMGXB4.
Unlike retrotransposons that rarely mobilize in undifferentiated germinal stem cells 26 , SB directly targets this cell type. Retrotransposons, by contrast, use an indirect approach. In this scenario, certain Drosophila retrotransposons were shown to "hijack" the microtubule transporting system to transfer their transcripts from the interconnecting supporting nurse cells to the transcriptionally inactive oocyte 26 . Mammalian retrotransposons likely use a similar scheme 54 . The more aggressive, direct targeting strategy used by DNA transposons (e.g. Sleeping Beauty, P element) is expected to generate a higher level of germline toxicity, and might, at least partially, explain the evolutionary success of retrotransposons over DNA transposons in higher vertebrates.
As a third strategy, TEs were suggested to manipulate the blastomere to adopt a germinal, rather than somatic fate. During this process, TE-derived sequences have been incorporated into gene regulatory networks of the pluripotent cells 19,[55][56][57][58] .
While HMGXB4 is itself involved in the epigenetic regulation of gene expression, it is controlled by the stress-sensitive KRAB-ZNF/TRIM28-mediated epigenetic repression mechanism 59 . In addition, the SUMO-specific conjugation of HMGXB4 is also a stress-inducible, dynamic process, and thus could activate transcription upon environmental changes. The stress sensitivity of HMGXB4 would enable the SB transposon to sense and react to cellular stress, a known feature of transposable elements 60 .
Similarly, a stress-responsive SUMO-regulated chromatin modification has also been implicated in reactivating integrated viruses in the genome (e.g. the heterochromatin histone demethylase JMJD2A in Kaposi's sarcoma associated herpes virus (KSHV)) 61 .
Importantly, our current work on deciphering the relationship between a transposable element and the host-encoded factor it piggybacks also spotlights so far overlooked aspects of HMGXB4.
Besides nucleosome remodelling, HMGXB4 is involved in modulating downstream regulatory processes of target gene activation and production. The activity of HMGXB4 is stem/progenitor cell specific, and the expression level of HMGXB4 drops sharply upon differentiation and stays at an undetectable level in differentiated cells. Nevertheless, HMGXB4 is epigenetically regulated, stress sensitive and when expressed, it can support target gene activation in any cell type. Thus, aberrant activation of HMGXB4 in differentiated cells (e.g. cancer) might result in undesirable gene expression.
In this context, it is notable that HMGXB4 has been identified as a target of epithelial splicing regulatory proteins upon epithelial-mesenchymal transition (EMT) 62 , suggesting that HMGXB4 could be an important target in future cancer research.
The underlined lysine (K) residues were subjected to site directed mutagenesis to arginine (R) and tested in the in vitro SUMOylation assay (co-transfecting them with SUMO1 into HeLa cells and subjected to immunoblotting) ( Supplementary Fig. 3d).

Table columns: Number, Position, Sequence
Table 2. List of ubiquitously expressed genes in early human development (.xlsx)

Methods
Constructs
pCAG-Venus-SB10: the SB10 transposase 15 gene was cloned into pCAG-Venus. The zHMGXB4 coding sequence was PCR amplified from the cDNA of zebrafish embryo with zHMG EcoRI Fwd. and NotI Rev. primers. The PCR product was digested with EcoRI and NotI restriction enzymes and sub-

Mitotic inactivation of MEFs
Mouse embryonic fibroblasts (MEFs) are often used as feeder cells in embryonic stem cell research.
MEFs were isolated from mouse embryos at 12.5 to 13.5 days post coitum (p.c.). The embryos were dissociated and then trypsinized to produce single-cell suspensions. After expansion, confluent MEFs were treated with 10 µg/ml mitomycin-C (Sigma) for 2 hours in DMEM at 37°C. Cells were then washed twice with PBS followed by trypsinization, and counted before dilution and plating.

Rat spermatogonial stem cells culturing
Rat spermatogonial stem cell lines were cultured on mitomycin-C treated MEFs in spermatogonial culture medium (SG medium) as described 67 .

Stable isotope labelling with amino acids in cell culture (SILAC)
This protocol relies on the incorporation of amino acids containing substituted stable isotopic nuclei (e.g. 12C, 13C and 13C/15N) into proteins in living cells. The three cell populations are grown in culture media that are identical except that one medium contains a "Light" form, and the other two media contain a "Medium Heavy" (or Medium) and a "Heavy" form of a particular amino acid (12C-Arginine, 13C-Arginine and 15N-Arginine, respectively). The mass spectra data were analysed using MetaCore from GeneGo Inc (www.genego.com). A fold change cutoff of 0.5 with a p-value < 0.05 was set to identify proteins whose expression was significantly differentially regulated. Enrichment analysis was conducted using GeneGo curated ontologies along with Gene Ontology to provide a quantitative analysis of the most relevant biological functions represented by the data.
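The selection rule stated above (fold change cutoff of 0.5, p-value < 0.05) can be sketched as a simple filter. The text does not say whether the cutoff refers to a linear or a log2 ratio; the sketch below assumes an absolute log2 SILAC ratio of at least 0.5, and that assumption, the `Quant` record and the toy values are all hypothetical and not part of the original analysis, which was run in MetaCore.

```python
from typing import Dict, List, NamedTuple


class Quant(NamedTuple):
    log2_ratio: float  # assumed: log2 of the SILAC ratio between two conditions
    p_value: float


def differential_interactors(quant: Dict[str, Quant],
                             min_abs_log2: float = 0.5,  # assumed log2 scale
                             max_p: float = 0.05) -> List[str]:
    """Return proteins passing both the fold-change and the p-value cutoff."""
    return sorted(
        name for name, q in quant.items()
        if abs(q.log2_ratio) >= min_abs_log2 and q.p_value < max_p
    )


# Toy, made-up quantifications (illustration only, not study data).
toy_quant = {
    "PROTEIN_A": Quant(1.2, 0.01),    # enriched and significant
    "PROTEIN_B": Quant(-0.8, 0.03),   # depleted and significant
    "PROTEIN_C": Quant(0.3, 0.001),   # significant, but change too small
    "PROTEIN_D": Quant(2.0, 0.20),    # large change, not significant
}

print(differential_interactors(toy_quant))
```

Using the absolute value keeps both enriched and depleted interactors; a one-sided filter would be needed if only gains in affinity were of interest.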

Mass spectrometry
A triple SILAC pull-down experiment using anti-HA resin to investigate interaction partners of