A small protein, derived from an alternative 53BP1 promoter transcript and expressed via translational reinitiation on an internal overlapping ORF, modulates proteasome activity

The complexity of the metazoan proteome is significantly increased by the expression of small proteins (<100 aas) derived from smORFs within lncRNAs, uORFs, 3’ UTRs and, more rarely, reading frames overlapping the CDS. These smORF-encoded proteins (SEPs) can have diverse roles, ranging from the regulation of cellular physiology to essential developmental functions. We report the characterisation of a new member of this protein family, SEP53BP1, derived from a small internal ORF that overlaps the CDS encoding 53BP1. Its expression depends on the utilisation of an alternative, cell-type specific promoter, coupled to translational reinitiation events mediated by a uORF in the alternative 5’ TL of the mRNA. The uORF-mediated initiation at the internal AUG53BP1 is conserved in metazoan species ranging from human to zebrafish. As such, it couples SEP53BP1 expression to the integrated stress response (ISR). We demonstrate that one function of this protein is to interact with, and stimulate, the activity of the 26S proteasome. As such, it opens the door to new approaches in the treatment of clinical conditions that arise due to the accumulation of toxic intracellular protein aggregates.


INTRODUCTION
Protein synthesis represents a key step in the regulation of gene expression. The differential recruitment of mRNA populations onto polysomes permits a rapid response to changes in the cellular environment. As such, it is a key process in the maintenance of homeostasis, and perturbations in its control are associated with numerous disorders. Translation can be subdivided into four main steps: initiation, elongation, termination and sub-unit recycling. Most regulation is exerted at initiation, and this has been confirmed in translational profiling studies covering the entire mammalian transcriptome 1 . Recognition of a start codon by the scanning ribosome is influenced by its sequence context, with the Kozak consensus being optimal 2 . If the context is sub-optimal, scanning ribosomes will sometimes ignore the AUG codon and continue to the next. This phenomenon, known as leaky scanning, can produce N-terminally truncated proteins or proteins from overlapping reading frames 3,4 .
The 5' transcript leader (5'TL) contains a number of features that can regulate the translational readout during both pre-initiation complex (PIC) recruitment and subsequent scanning 5 . These include uAUGs and uORFs (upstream Open Reading Frames). Genomic analysis has estimated that ~50% of human 5'TLs contain one or more uORFs 6,7 . Both uAUGs and uORFs can function as translational repressors, limiting PIC access to downstream start codons 8 . The amplitude of this repression is dictated by the uAUG context 9 . However, small uORFs (< 50 codons) can also couple the readout to stress and to the levels of the ternary complex (TC: eIF2-GTP-Met-tRNA) in the cell, via a process referred to as delayed reinitiation, in which the 40S ribosome remains on the mRNA and continues to scan after translating the uORF. This process permits access to start codons downstream of the AUG of the principal ORF (AUG GENE ) 7,10 . However, the efficiency of reinitiation at downstream start sites varies depending on parameters such as uORF length and the distance between the uORF stop codon and the downstream AUG. This process is conserved from human to zebrafish 11 . Reacquisition of the Met-tRNA by the 40S ribosome after uORF termination is dependent on eIF2-GTP levels. When these are low, slow reacquisition can cause a proximal downstream AUG to be bypassed, because the 40S is unable to re-recruit the TC necessary to form an initiation-competent PIC. The TC levels respond to stress via the regulation of a series of "stress activated protein kinases" that include GCN2, HRI, PKR and PERK. These form the axis of the "integrated stress response" (ISR). Their substrate is the α subunit within the eIF2.GDP generated during each round of translational initiation 12,13 . Phosphorylation impedes GDP/GTP exchange and the subsequent TC regeneration. Thus, reinitiation in combination with leaky scanning offers the possibility to significantly increase the complexity of the mammalian proteome by permitting access to internal AUGs (iAUG).
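The two parameters singled out above, uORF length and the spacing between the uORF stop codon and the downstream AUG, can be read directly from a transcript sequence. The following is a minimal sketch, not a method used in this study: the function name `find_uorfs`, the toy sequence and the convention of counting the gap from the last nucleotide of the stop codon are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(mrna: str, main_aug: int) -> list[dict]:
    """List uORFs that start and terminate upstream of the main AUG.

    mrna     : transcript sequence (RNA or DNA alphabet), 5' to 3'
    main_aug : 0-based index of the A of the main ORF start codon
    """
    seq = mrna.upper().replace("U", "T")
    uorfs = []
    for start in range(main_aug):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk the reading frame opened by this uAUG until the first stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                stop_end = pos + 3
                if stop_end <= main_aug:  # keep only non-overlapping uORFs
                    uorfs.append({
                        "start": start,
                        "codons": (pos - start) // 3,  # sense codons, AUG included
                        "gap_to_main_aug": main_aug - stop_end,
                    })
                break
    return uorfs


# Hypothetical toy leader (NOT a 53BP1 sequence): a 5-codon uORF whose
# stop codon ends 15 nt before the main AUG.
toy = "GGCACC" + "ATGGCTGCAGGCCTG" + "TGA" + "CCGCGGCTCCAGCAG" + "ATGGACCCT"
print(find_uorfs(toy, main_aug=39))
```

A short uORF with a small gap is the configuration in which delayed reinitiation would be expected to bypass the proximal AUG when TC levels are low; the sketch only reports the geometry and makes no prediction about efficiency.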
Alterations in the 5'TL arise due to the use of alternative promoters (AP), transcriptional start site (TSS) heterogeneity and alternative splicing [14][15][16] , with studies suggesting that AP exceeds alternative splicing in generating transcriptome diversity 14 . A genome wide analysis revealed that ∼18% of human genes use multiple promoters 17 . Promoter switches change the nature of the first exon, and hence the 5'TL, and this event has been linked to a number of human pathologies 18,19 . This switch is rarely complete but it can be amplified by the selective recruitment of one of the TL variants onto polysomes, as occurs with the MDM2 gene in tumour cells 20,21 . Therefore, by generating 5'TL heterogeneity, which can be both tissue and cell type specific, alternative promoters regulate the protein readout, the proteome and ultimately the cellular phenotype 16 . Indeed, in a transcriptome/translatome analysis using a glioblastoma model, the authors concluded that selective polysomal recruitment of specific mRNA populations could initiate and drive tumour formation 22 .
We have discussed uORFs as translational regulatory elements. However, transcriptome analysis has identified thousands of as yet unannotated small open reading frames (smORFs) with the potential to encode biologically active peptides or SEPs (smORF-encoded proteins/peptides) smaller than 100 aas [23][24][25][26] . Detecting the products of smORFs, which are numerous and small, is technically not straightforward 27 . However, it has been facilitated by ribosome profiling 6 . This technique couples ribosome footprinting to high-throughput RNAseq and provides quantitative information about ribosome density across a transcript. It has been used to identify alternative START/STOP sites, initiation from non-AUG codons, translational pausing/frame-shifting as well as expression from uORFs and alternative ORFs 28 . A bioinformatics analysis of the ribosome profiling database revealed that 40% of long noncoding RNAs (lncRNAs) carry smORFs and are expressed in human cells 29 . Another source of SEPs is the uORFs, with ~35% of mRNA coding genes having uORFs that are expressed 29 . These SEPs can either act in cis to modulate downstream initiation events, or have distinct biological function(s) 7,30 . Stalling of ribosomes over the AUG GENE start site can also cause queuing of scanning ribosomes within the 5'TL. This can permit the expression of smORFs initiating on near-cognate codons, with the potential to increase the SEP repertoire 31,32 . Another interesting group is the cis-acting peptides that are responsive to environmental signals and have been termed "peptoswitches" 33 .
Both leaky scanning and reinitiation permit access to internal AUGs. When in-frame with the principal ORF, this gives rise to N-terminally truncated proteins 34,35 . When positioned internal and out-of-frame (ioORF: internal overlapping ORF), they represent a second source of smORFs. About 4% of human mRNAs appear to express SEPs from AUG codons downstream of the AUG GENE 29 . In fact, the expression of biologically active proteins from ioORFs has been known for some time. It was described in mammalian viral systems as far back as the early 1980s 4,36 . Nevertheless, its implications for the human proteome are only now beginning to be appreciated 37 . For example, within the ataxin-1 (ATXN1) transcript, a small ioORF starting 30 nts downstream of the AUG ATXN1 , and in the -1 reading frame, is expressed by leaky scanning 38 . The SEP, Alt-ATXN1, co-localises and interacts with Ataxin-1 within nuclear inclusions. The prion protein gene PRNP also expresses a novel polypeptide from an ioORF, referred to as AltPrP 39 . It localises at mitochondria, is up-regulated by ER stress and proteasomal inhibition, and was detected in human brain homogenates, primary neurons, and peripheral blood mononuclear cells. Despite sizes smaller than 100 aas, the products of smORFs can have essential biological functions 23,24 . In mice, the Mln smORF expresses a 46 aa SEP implicated in muscle contraction 40 . In humans, a 24 aa long SEP called humanin, synthesized from a lncRNA, is involved in apoptosis, interacting with BAX (Bcl-2-associated X protein) 41 , and the MRI-2 smORF (69 aas) has been implicated in DNA repair 24,42 . Intriguingly, it has been proposed that, in general, the expression of SEPs may be coupled to the stress response, an observation that would tie in nicely with the process of translational reinitiation 43 . With regard to clinical medicine, a number of human cancer specific antigens are also derived from iORFs 44,45 . Their expression reflects the change in the translational landscape that occurs with cellular transformation, and they represent novel targets for immune based therapies 46 .
In this manuscript, we extend our earlier study in which we reported a differential RNAseq analysis on the tumoural MCF7 and non-tumoural MCF10 cell lines 47 . A number of genes were identified that exploited alternative promoters to generate 5'TL heterogeneity that could, in turn, modulate the protein readout. One of these, the 53BP1 gene, uses two promoters (Fig. 1a). The P1 promoter (TSS12390) was active in both cell backgrounds. It generates two transcripts, referred to as V1 and V2 (NM_001141979.1, NM_001141980.1), which possess the same 5'TL but differ due to an alternative splicing event within the CDS (hereafter referred to as V1/2). The second P2 promoter (TSS20205) was more active in MCF7 cells 47 . It generates a V3 transcript (NM_005657.2) with a ~278 nts 5'TL carrying a 5-codon uORF whose stop codon is 15 nucleotides upstream of the AUG 53BP1 (Fig. 1a). We postulated, and now confirm, that this uORF directs reinitiation events at an ioORF that expresses a 50 aas SEP, which we refer to as SEP 53BP1 . We provide evidence that uORF-mediated expression of such a protein is conserved right through to zebrafish. The endogenous SEP 53BP1 protein has been detected in a number of human cell lines of lymphoid origin and shows punctate staining in both cytoplasmic and nuclear compartments. Its interactome links it to the proteasome machinery. Thus, we have identified a novel small protein whose expression is linked to a promoter switch, coupled to a stress-responsive translational reinitiation event on an AUG ioORF . At least one function of the expressed SEP protein is to modulate proteasome activity. We discuss these observations in the light of current models for SEP function, and the potential therapeutic applications of increased SEP 53BP1 expression in the treatment of human neurodegenerative diseases 48 .

Organisation and expression profiles of the 53BP1 gene transcripts:
In our earlier work, we reported on a differential RNAseq analysis comparing the tumoural MCF7 and non-tumoural MCF10 cell lines 47 . The 53BP1 gene was a particularly intriguing hit. It uses two promoters. The P1 promoter is active in both cell backgrounds. It generates two transcripts, named V1/2, originating from alternative splicing but carrying the same 5'TL. The mRNA has two potential AUG start codons in the 53BP1 ORF, located at the end of the first and beginning of the second exons, and separated by four codons, hinting at two N-terminal isoforms (Fig. 1a). These we refer to as AUG 53BP1(a) , which has a relatively good Kozak context, and AUG 53BP1(b) , whose context is poor (Fig. 1a, lower panel). The 5'TL is ~113 nts long, 71% G/C and contains no uAUGs. The second P2 promoter was more active in MCF7 cells (Fig. 1b) 47 . Based upon CAGE analysis, it generates a V3 transcript with a ~278 nts 5'TL carrying a 5-codon uORF whose stop codon is 15 nucleotides upstream of the AUG 53BP1(b) (Fig. 1a: AUG 53BP1(a) is in the first exon of the V1/2 transcript). Luciferase-based reporter assays revealed that the V3 5'TL was more repressive than V1/2 with regard to initiation events on the AUG 53BP1 start sites, due to the uORF 47 . Furthermore, polysome gradient profiling of the two cell lines revealed that whereas the V1/2 transcript was mainly polysomal in both, the V3 transcript was polysomal only in the tumoural MCF7 cells 47 . Therefore, at the outset of our current study, we evaluated to what extent P2 promoter activity was a marker of the tumoural phenotype. We performed an RT-PCR analysis of V1/2 and V3 across a range of established tumoural and non-tumoural cell lines available in the lab. No clear correlation with the tumoural phenotype was observed (Fig. 1b).

The protein readout from the V3 mRNA is different:
The small uORF in the V3 5'TL could promote delayed reinitiation events downstream of the AUG 53BP1 . Examination of the human sequence reveals that the next start codon downstream of AUG 53BP1(b) opens an ioORF, +1 relative to the 53BP1 ORF, that would encode a polypeptide of 50 aas that we named SEP 53BP1 (Fig. 1a). To monitor expression in the V1/2 and V3 5'TL backgrounds at both the AUG 53BP1 and AUG SEP , we inserted the sequences upstream of the AUG SEP into our LP/SP overlapping ORF reporter 10,49 . This fuses the 53BP1 ORF to LP (which carries a FLAG and HA tag) and the AUG SEP to SP (which carries a MYC and HA tag: the AUG SEP and its Kozak context were retained) (Fig. 2a). It allows us to follow initiation events at the AUG 53BP1 (we were unable to distinguish between the sites AUG 53BP1(a) and AUG 53BP1(b) on V1/2: however, based upon context we presume that the former is the major start codon) (Fig. 2a). Transient expression assays in HEK293T cells revealed that the V1/2 5'TL directed initiation events mainly at AUG 53BP1 , whereas with V3 the majority of initiation events occurred at AUG SEP (Fig. 2a). This pattern was also observed in transient assays performed in MCF10 and MCF7 cells (Fig. 2b).
To monitor the impact of the V3 uORF on the readout we mutated its stop codon (UGA→UGC: V3 UGA/UGC ) thereby fusing the uAUG to the 53BP1 ORF (Fig. 2b). This effectively removes events arising from delayed reinitiation. When transiently expressed in HEK293T, MCF10 and MCF7 cells, the V3 UGA/UGC directed expression mainly from the AUG uORF (Fig. 2b). This would be consistent with its good Kozak context (Fig. 1a). Leaky downstream scanning to the AUG 53BP1 and AUG SEP was weak in the HEK293T and MCF10 cell backgrounds but more evident in MCF7 cells.
The results demonstrate high levels of initiation at the AUG SEP in transcripts carrying the V3 5'TL and this is mediated by its small uORF. To confirm a role for delayed reinitiation we introduced a 30 nts spacer element between the uORF and the AUG 53BP1(b) . Consistent with a reinitiation model, this increased initiation events at the AUG 53BP1(b) relative to AUG SEP (Fig. 2c: compare lanes 1 and 2). The expression pattern observed with the 30 nts spacer construct was similar to that obtained when the uAUG was changed to GCG (Fig. 2c: compare lanes 2 and 3). In the uAUG/GCG mutant, we still observed significant initiation events at the AUG SEP . That this arose due to leakiness of the AUG 53BP1(b) was confirmed by introducing changes that improved its context (…CAGAUGG… → …GCCAUGG…) (Fig. 2c: compare lanes 3 and 4). In conclusion, the presence of a short uORF positioned close to a very leaky downstream AUG 53BP1(b) means that the major initiation events on the V3 transcript take place on AUG SEP . This would direct the expression of a smORF-encoded peptide (SEP) of 50 aas (SEP 53BP1 ) 23,24 .
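The notion of a "good" or "poor" Kozak context used above can be illustrated with a deliberately simplified classifier. The sketch below assumes the commonly used rule of thumb that a purine at position -3 and a G at position +4 (taking the A of the AUG as +1) are the dominant determinants; the function name and the three labels are hypothetical and are not the criteria applied by the authors.

```python
def kozak_strength(seq: str, aug_index: int) -> str:
    """Classify the context of the AUG starting at aug_index (0-based).

    Simplified rule (an assumption of this sketch): score one point for a
    purine at -3 and one point for G at +4.
    """
    s = seq.upper().replace("U", "T")
    if s[aug_index:aug_index + 3] != "ATG":
        raise ValueError("no AUG at the given index")
    if aug_index < 3 or aug_index + 3 >= len(s):
        raise ValueError("sequence must include positions -3 and +4")
    minus3 = s[aug_index - 3]   # three nucleotides upstream of the A
    plus4 = s[aug_index + 3]    # nucleotide immediately after the AUG
    score = int(minus3 in "AG") + int(plus4 == "G")
    return {2: "strong", 1: "adequate", 0: "weak"}[score]


# The context change quoted in the text for AUG 53BP1(b):
print(kozak_strength("CAGAUGG", 3))  # wild-type fragment
print(kozak_strength("GCCAUGG", 3))  # improved context
```

Under this rule the wild-type fragment scores lower than the mutated one, in line with the reduced leakiness reported after the context was improved. Real start-site efficiency depends on more than these two positions.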
The configuration of the V3 5'TL and the ioORF are conserved:
Sequence alignment suggests that the SEP 53BP1 ioORF is conserved across vertebrates and can be found even in zebrafish (Danio rerio) (Fig. 3a). Likewise, uORFs are frequently found within the annotated 5'TLs of the 53BP1 gene. For example, zebrafish have a single promoter expressing a single 5'TL variant with a uORF of 19 codons (GenBank: BC129236.1 and NM_001080170: longer than the human) whose stop codon is 15 nts upstream of the first AUG 53BP1a (similar to human) (Fig. 3b). uORFs are frequently present in zebrafish transcripts and, as in mammals, they serve to modulate the translational readout 11 . The zebrafish ioORF would express a SEP polypeptide of 40 aas (Fig. 3a). The nucleotide spacing between the AUG 53BP1a and AUG ioORF is 400 nts in zebrafish compared to 97 nts in the human V3 mRNA. Within this 400 nt region there is a second AUG in the 53BP1 ORF (AUG 53BP1b ) that could express an N-terminally truncated (121 aas) 53BP1 protein (Fig. 3b). To examine initiation events on this transcript, we RT-PCR cloned all 53BP1 sequences upstream of the ioORF STOP codon (changing it at the same time to a sense codon) starting from total zebrafish embryonic RNA. This was then fused to our LP/SP reporter to generate 53BP1ZLP/SP WT (Fig. 3b). To monitor the role of the uORF in start site selection, a number of mutations were created. The uORF AUG/GCG removed the start codon and the uORF UAA/AGG fused the uORF to the 53BP1/LP ORF in the reporter (Fig. 3b). We also exploited two BamHI sites, one positioned just before the uORF UAA stop codon and the second just after the AUG 53BP1a . Deletion of the small BamHI fragment removed both the uORF UAA and AUG 53BP1a codons, fusing the uORF to the ORF of 53BP1/LP (Fig. 3b: 53BP1ZLP/SPBam). As in the human reporter construct, the ioORF was fused to the SP reading frame (Fig. 2).
In the WT background, we could detect products from AUG 53BP1a , AUG 53BP1b and AUG ioORF , with the latter corresponding to the zebrafish SEP 53BP1 (Fig. 3c, lane 2). Removal of the uAUG significantly enhanced expression at the AUG 53BP1a but did not impact significantly on the downstream start sites (Fig. 3c, lane 3). These latter initiation events would now arise due to leaky scanning through AUG 53BP1a , whose context is poor (Fig. 3b). Thus, as in humans, the uORF in zebrafish represses 53BP1 expression. Fusing the uORF to the 53BP1 ORF, either by the uORF UAA/AGG mutation (Fig. 3c, lane 4) or the ∆BamHI deletion (which also removes AUG 53BP1a : Fig. 3c, lane 5), produced a single band on the blot whose slower migration indicates that it arises from an initiation event on AUG uORF . The "non-leakiness" of this start codon would be consistent with its good Kozak context (Fig. 3b). We confirmed this by introducing the uORF AUG/GCG mutation into the ∆BamHI background. The slow migrating band was lost and we restored the expression of products from the AUG 53BP1b and AUG ioORF (Fig. 3c, lane 7). Thus, in zebrafish the uORF also permits initiation events downstream of the AUG GENE (in this case AUG 53BP1a ). These downstream initiation events can give rise to N-terminally truncated forms of the 53BP1 protein and the expression of a SEP 53BP1 .
However, unlike the human V3 transcript, the single zebrafish 5'TL assures robust expression from all initiation sites.

Studies on the human SEP 53BP1 :
Polyclonal Abs against the SEP 53BP1 were generated using two peptides that spanned most of the protein (VLTSVCYLDTFLISRRTKKILC and WMLCPILNKQLEKNEETVIVGC: Proteogenix, France). The Ab did not detect SEP 53BP1 expression in HEK293T cells (Fig. 4a, lane 2), an observation that would be consistent with the low levels of the V3 transcript in this cell line (Fig. 1b). We therefore generated an authentic full-length V3-53BP1 cDNA clone and transiently expressed it in HEK293T cells (Fig. 4a). A doublet band co-migrating with the SEP 53BP1 protein expressed in a wheat germ extract (WGE; the in-vitro system was programmed with a capped/polyadenylated mRNA covering the SEP 53BP1 ORF) was detected on blots (Fig. 4a, lanes 1 and 3). Proof that this arose from initiation events at the SEP 53BP1 AUG codon came from both altering the AUG SEP53BP1 Kozak context from good to bad (…aggAUGa… → …cggAUGa…), which impacted negatively on expression (Fig. 4a, lane 4), and changing the AUG SEP53BP1 to GCG, which ablated all expression (Fig. 4a, lane 5).
Furthermore, and consistent with the reinitiation model derived from the reporter assays (Fig. 2), expression increased under stress conditions that activated PERK (treatment with the drug cyclopiazonic acid, CPA), thereby increasing the intracellular phospho-eIF2 levels (Fig. 4b) 50 . The transiently expressed SEP 53BP1 had an intracellular half-life of 2.5 hrs (Fig. 4c).
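To give a feel for what a 2.5 hr half-life implies, the sketch below converts it into a decay constant and a fraction of protein remaining over time. It assumes simple first-order (single-exponential) decay, which is an assumption of this illustration and not something established by the data in Fig. 4c; the function names are hypothetical.

```python
import math


def decay_constant(half_life_h: float) -> float:
    """First-order decay constant k (per hour) from a half-life in hours."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h: float, half_life_h: float = 2.5) -> float:
    """Fraction of the initial protein pool left after t_h hours,
    assuming first-order decay (assumption of this sketch)."""
    return 0.5 ** (t_h / half_life_h)


for t in (0, 2.5, 5, 10):
    print(f"{t:>4} h: {fraction_remaining(t):.3f} of the initial level")
```

Under this assumption, one quarter of the transiently expressed protein would remain after 5 hrs and about 6% after 10 hrs.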
Immunofluorescence (IF) imaging of transfected HEK293T cells revealed a mainly cytoplasmic localisation (Fig. 4d). However, staining could be observed in the nucleus and, on rare occasions, it was almost exclusively nuclear (Fig. 4d, lower panel).

Detection and localisation of the endogenous SEP 53BP1 protein:
Transient expression assays have allowed us to elucidate the mechanism by which P2 promoter activation permits the expression of a novel SEP. However, at this point in the study it was necessary to detect the endogenous protein, and determine the cellular compartment(s) in which it accumulates, as a route towards function. We had already observed that P2 promoter activity and V3 transcript levels are regulated in a cell-specific manner (Fig. 1b). Furthermore, we had reported that polysomal recruitment of V3 could also be cell specific 47 . With this in mind, we scanned the ribosome-profiling database (http://sysbio.sysu.edu.cn/rpfdb/index.html). The image in Supplementary Fig. 1 was extracted from a study performed by the Brosch lab using THP-1 cells (a human acute monocytic leukaemia cell line: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE39561) 51 . The accumulation of reads around both the AUG 53BP1 and AUG SEP53BP1 would be consistent with their utilisation as start sites. We therefore performed polysomal analysis of the total, V1/2 and V3 mRNAs in this cell background (Fig. 5a). Only a minor fraction of the total 53BP1 gene transcripts were polysomal (26%) (Fig. 5a, left hand profile). Concerning V3, very little was associated with light polysomes (9%), although a more significant fraction was observed within the heavy polysomes (44%), in particular the heaviest fraction (fraction 11, 32%). It is worth remembering that the ioORF is only 150 nts in length and can accommodate a maximum of five elongating ribosomes 6,52,53 . This would mean that V3 transcripts in the heavy polysomal fraction, which we define as >5 ribosomes per transcript, must be translating both ORFs. We also analysed another lymphocyte cell line that was available to us, namely Raji cells (Fig. 5a, right hand profile: no ribo-profiling data is available for this cell line). The polysomal profiles indicated that the majority (80%) of the 53BP1 transcripts were polysomal, and this was also observed with both V1/2 (78%) and V3 (87%) (Fig. 5a). Immunoblots detected SEP 53BP1 expression in both these cell lines but only weak expression in MCF7 cells, the cell line in which we originally reported V3 expression (Fig. 5b).
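The reasoning that a 150 nt ioORF holds at most five elongating ribosomes can be written out as simple arithmetic. The sketch below assumes that each ribosome occupies roughly 30 nt of mRNA, a value chosen here because it reproduces the figure of five quoted in the text; the function names are hypothetical.

```python
def max_ribosomes(orf_nt: int, footprint_nt: int = 30) -> int:
    """Upper bound on ribosomes packed end to end on an ORF.
    footprint_nt = 30 is an assumed per-ribosome occupancy."""
    return orf_nt // footprint_nt


def exceeds_orf_capacity(ribosomes_on_transcript: int, orf_nt: int) -> bool:
    """True if the ribosome count cannot be explained by this ORF alone."""
    return ribosomes_on_transcript > max_ribosomes(orf_nt)


IOORF_NT = 150  # length of the SEP 53BP1 ioORF given in the text
print(max_ribosomes(IOORF_NT))            # capacity of the ioORF
print(exceeds_orf_capacity(6, IOORF_NT))  # a transcript carrying 6 ribosomes
```

A transcript carrying more than five ribosomes therefore cannot be accounted for by the ioORF alone, which is the basis for the inference that heavy-polysome V3 transcripts are also translating the 53BP1 ORF.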

The SEP 53BP1 interactome:
To gain insights into function, we employed a yeast-2-hybrid (Y2H) screen to identify partners. SEP 53BP1 was used as the bait and screened against a peptide library generated from a human B cell Lymphoma_RP1. This background was selected because we had observed SEP 53BP1 in two cell lines of lymphoid origin. Around 51 million interactions were tested and 5 genes (PSMA7, UBQLN4, TRIP12, MAPRE1, BCOR) gave interactions with good confidence levels (Fig. 6a). The selected interaction domain (SID) for each prey is depicted in Supplementary Fig. 2. String analysis connected three of the five genes (PSMA7, UBQLN4, TRIP12) to proteasome biology (Fig. 6a). We sought to validate this analysis biochemically, focusing on the protein products of the first two genes.
PSMA7 encodes the 4 subunit of the 20S proteasome barrel and it plays a key role in its assembly 54,55 . Sedimentation analysis performed on cytoplasmic extracts prepared from HEK293T cells transiently expressing SEP 53BP1 revealed that a fraction of the 50 amino acid protein co-sedimented with the 4 protein in fractions 4-6 ( Fig. 6b, upper panel). Extracts contained ATP to ensure 26S proteasome integrity during the assay 56,57 .That these fractions corresponded to the 20S/26S proteasome was confirmed by disruption of the complexes using SDS treatment of the extracts prior to gradient loading. (Fig. 6b, lower panel). It should be noted that whereas only a fraction of SEP 53BP1 co-sedimented with 4 in native conditions, the majority of the protein entered into the gradient, and a significant fraction was found in the pellet ribosomal fraction (as indicated by the presence of the ribosomal protein S6: Fig. 6b, upper panel). However, our interactome study did not give any hits with components of the translational machinery. Overall, the sedimentation profile of SEP 53BP1 is quite remarkable considering its small molecular size (very little of the protein remains on the top in fraction 10).
These results suggest that it has multiple interacting partners in the cell. We directly demonstrated the 4-SEP 53BP1 association by co-IP performed on HEK293T cell extracts transiently expressing the latter (Fig. 6c) and this corroborated their intracellular co-localisation ( Fig. 6d). UBQLN4 also plays a role in the regulation of intracellular protein degradation by mediating the proteasomal targeting of misfolded or accumulated proteins 58 . Its overexpression, as observed in some human tumours, also represses homologous DNA repair 59 .
It did not enter into our glycerol gradients, suggesting that under our assay conditions the protein was mainly free in the extracts (Fig. 6b), and we were unable to co-IP transiently expressed SEP 53BP1 by pulling down the endogenous UBQLN4 protein. Nonetheless, an interaction could be demonstrated by co-IP in cells transiently expressing both UBLQN4 and SEP 53BP1 (Fig. 6e).
Since the interactome clearly pointed to a role in proteasome biology, we asked if SEP 53BP1 expression influenced proteasome function. We compared proteasome activity in HEK293T extracts prepared from cells transfected with empty vector or a vector expressing SEP 53BP1 (Fig. 6f). The presence of the small polypeptide stimulated 26S proteasome activity as confirmed by the inhibitory effect of the drug MG132. The stimulation was greater than two fold over a 90 mins assay period.
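The comparison described above can be expressed as a short calculation: the MG132-sensitive part of the fluorescence signal is taken as proteasome activity, and the ratio between SEP 53BP1 and empty-vector extracts gives the fold stimulation. This is a minimal sketch of one common way to process such readings; the function names and all numbers are hypothetical and are not data from Fig. 6f.

```python
def proteasome_specific(total_rfu: float, mg132_rfu: float) -> float:
    """Signal attributable to the proteasome: total fluorescence minus the
    MG132-resistant background (other proteases in the extract)."""
    return total_rfu - mg132_rfu


def fold_stimulation(sep_total: float, sep_mg132: float,
                     vec_total: float, vec_mg132: float) -> float:
    """Ratio of MG132-sensitive activity, SEP-expressing vs empty vector."""
    vector = proteasome_specific(vec_total, vec_mg132)
    if vector <= 0:
        raise ValueError("no MG132-sensitive activity in the control extract")
    return proteasome_specific(sep_total, sep_mg132) / vector


# Hypothetical end-point readings (relative fluorescence units) at 90 min.
fold = fold_stimulation(sep_total=2500, sep_mg132=300,
                        vec_total=1200, vec_mg132=200)
print(f"fold stimulation: {fold:.2f}")
```

With these illustrative readings the ratio is 2.2, that is, a stimulation greater than two-fold of the kind reported in the text.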

DISCUSSION
It is increasingly evident that the complexity of the metazoan proteome is considerably increased by the expression of SEPs (< 100 aas) that until recently escaped detection using conventional biochemical procedures (they are also referred to as small protein 60  In this manuscript, we have identified a new member of the SEP family expressed from an ioORF within the 53BP1 gene, the main CDS of which expresses a protein that plays a central role in non-homologous DNA repair 76 . It is the mode of expression and the function of this SEP 53BP1 that is novel. In humans, it couples alternative promoter activity (P1 versus P2) to a translational reinitiation event on the internal AUG SEP53BP1 mediated by a short uORF within the P2 derived mRNA 5'TL. Both these events can respond to intracellular stresses [77][78][79] .
Curiously, it has already been proposed that SEP expression may be an integral part of the cellular "stress response" 80 . The link to developmental functions is also intriguing, because promoter switching and translational reprogramming are key events during differentiation in metazoans 81,82 . We have observed P1 promoter activity (V1/2 expression) in all human cell lines tested, indicating that it is probably the major source of the 53BP1 protein (Fig. 1b). On the other hand, P2 promoter activity showed considerable cell line variability and the molecular basis of its regulation remains unclear (Fig. 1b). Its V3 transcript directs expression mainly of SEP 53BP1 , indicating that expression of this protein is also not ubiquitous. The low levels that we observed in MCF7 cells (Fig. 5b), despite high cellular levels of the V3 transcript and high polysomal occupancy 47 , suggest that intracellular stability is also regulated in a cell-specific manner. The smORF responsible for SEP expression can be observed in metazoan species from human through to zebrafish (Fig. 3a), with the caveat that part of this conservation may arise from the constraints imposed by the overlapping 53BP1 ORF. However, most of the key functional domains of 53BP1 reside in its C-terminus and there may be more primary sequence plasticity within its long, largely disordered N-terminus 83 . Curiously, zebrafish have only one annotated 53BP1 promoter, but the organisation of the 5'TL, more specifically the longer uORF, appears to permit ribosome access to both AUG 53BP1 and AUG SEP53BP1 at high efficiency (a sort of fusion of the human V1/2 and V3 readouts: Fig. 3c). This behaviour, with the caveat that our studies have as yet only been performed in mammalian cells, would be consistent with current models of reinitiation 84,85 . However, the cis-acting sequences on the mRNA that regulate reinitiation, and their responsiveness to intracellular stresses, are well conserved between mammals and zebrafish 8,86,87 . Extensive studies on SEP protein expression, and function, have been reported using a D. melanogaster model 72,74,75,88 , and riboprofiling studies have confirmed the presence of smORFs in zebrafish 89 . Consequently, zebrafish presents itself as a useful animal model to explore the role of both the uORF and the AUG SEP53BP1 in metazoan development.
Our interactome studies revealed that a fraction of the cytoplasmic SEP 53BP1 associates with the proteasome via its α4 subunit (Fig. 6). The SID on α4 maps to the C-terminus, a region that is found largely exposed on the surface of the 20S and 26S proteasome (Fig. 7a/b). This interaction appears to stimulate 26S proteasome activity, based upon the MG132 inhibitory effect 90 , despite the fact that proteolytic activities are on the β rings and positioned in the interior of the 20S cylinder 91 . The 26S specifically targets polyubiquitinylated substrates that are degraded in an ATP-dependent manner (Fig. 6f) 92 . This selectivity for polyubiquitinylated substrates resides within the 19S regulatory complex positioned at each extremity of the 20S cylinder (Fig. 7b) 93 . However, during certain stresses, binding of activating proteins can open the α ring on the 20S, permitting the entry of protein substrates. Proteins that enter are degraded in a ubiquitin/ATP-independent manner. This "active" 20S serves to remove misfolded or oxidised proteins that accumulate during the stress 94 . Furthermore, our detection of SEP 53BP1 in two cell lines of lymphoid origin is intriguing because lymphoid tissues express a specific proteasome involved in antigen processing called the "immunoproteasome", whose assembly involves changes in the composition of the β ring 95,96 . It remains to be determined if SEP 53BP1 is also modulating the activity of the "active" 20S and the immunoproteasome, all of which retain the α4 SID. Proteasomes are found in both cytoplasmic and nuclear compartments 57,97 . One of the putative nuclear localisation signals is actually located on the C-terminal tail of α4 (Fig. 7a) 98 . It seems conceivable that the nuclear SEP 53BP1 may enter in association with the proteasome. Furthermore, proteasome levels in the nucleus respond to stresses, such as glucose starvation, hypoxia or low pH 99 , many of which may also be modulating SEP 53BP1 intracellular levels.
The regulation of proteasome activity has important clinical applications. Proteasome function becomes impaired during many age-related neurodegenerative disorders, including Parkinson's disease, amyotrophic lateral sclerosis and Alzheimer's disease. All these conditions are characterised by the accumulation of toxic intracellular protein aggregates that arise because of reduced proteasome activity 100,101 . Consequently, a considerable amount of research has focused on the development of pharmacological small molecules that can stimulate proteasome function, as potential therapeutic agents for these conditions 48 .
However, few of these small molecule compounds activate proteasomes in vivo 48 . Increasing P2 promoter activity and intracellular SEP 53BP1 levels offers a new avenue of research in the treatment of these conditions.
In bicistronic transcripts, the SEP and the product of the CDS frequently exhibit a functionality link. This can involve a direct protein-protein interaction 38,71,102,103 , or alternatively they act indirectly on the same metabolic or physiological pathway 104,105 . Our Y2H analysis showed no hit with 53BP1 and we have been unable to demonstrate any interaction by co-IP using transiently over-expressed proteins. However, the Y2H study did provide a hit on TRIP12, an E3 ubiquitin-protein ligase that regulates 53BP1 steady state levels in the cell 106 . In a similar vein, UBQLN4 has been reported to co-localise with 53BP1 at the sites of DNA damage and to repress homologous recombination-mediated DNA repair 59 . We have not yet biochemically confirmed these Y2H interactions but it remains possible that SEP 53BP1 may also impact on the DNA repair process by modulating 53BP1 activity and levels.

In conclusion:
We have identified a new member of the SEP family within the 53BP1 gene, conserved across metazoans. The mode of expression indicates that its intracellular levels will be regulated via alternative promoter activation and cellular stress. One of its functions is to interact with, and modulate, the activity of the cellular proteasome. As such, it offers the possibility of new therapeutic approaches for the treatment of conditions coupled to the accumulation of toxic intracellular protein aggregates.

METHODS
Confocal images were collected on a Zeiss LSM800 confocal scanning microscope equipped with a Plan-Apochromat 63x/1.40 Oil DIC M27 objective. Pictures were analysed using the ImageJ software. Z-series videos in THP1 and RAJI cells are shown as maximum z-projections; gamma, brightness, and contrast were adjusted (identically for compared image sets) and the videos were generated using the Imaris software.
Proteasomal activity assay. The proteasomal activity assay was performed on HEK293T cells transfected with either empty vector or a vector expressing SEP 53BP1 . Activity was measured using the Proteasome Activity kit (ab107921, Abcam) following the manufacturer's protocol. MG-132 (Sigma) treatment served as a control to differentiate between 26S proteasome activity and other protease activities present in the extract. The reaction was followed for 90 minutes, and fluorescence was measured on a fluorometric microplate reader
ring interface (black dotted line) with the exposed C-terminus of α4 circled in red. All images were extracted from the PDBe database (https://www.ebi.ac.uk/pdbe/node/1).
slices (6.09 µm), in a manner that assures that the cell is imaged in its entirety. Pictures were analysed by the Imaris software. The first few seconds of each video show the original pictures collected, with the endogenous SEP 53BP1 staining red and the nuclei staining blue. This is followed by a 3-dimensional reconstruction of the nucleus using the surface function within Imaris. In this representation, the cytoplasmic SEP 53BP1 stains magenta and nuclear SEP 53BP1 stains light blue.