The SARS-CoV-2 protein ORF3c is a mitochondrial modulator of innate immunity

The SARS-CoV-2 genome encodes a multitude of accessory proteins. Using comparative genomic approaches, an additional accessory protein, ORF3c, has been predicted to be encoded within the ORF3a sgmRNA. Expression of ORF3c during infection has been confirmed independently by ribosome profiling. Despite ORF3c also being present in the 2002-2003 SARS-CoV, its function has remained unexplored. Here we show that ORF3c localises to mitochondria during infection, where it inhibits innate immunity by restricting IFN-β production, but not NF-κB activation or JAK-STAT signalling downstream of type I IFN stimulation. We find that ORF3c acts after stimulation with cytoplasmic RNA helicases RIG-I or MDA5 or adaptor protein MAVS, but not after TRIF, TBK1 or phospho-IRF3 stimulation. ORF3c co-immunoprecipitates with the antiviral proteins MAVS and PGAM5 and induces MAVS cleavage by caspase-3. Together, these data provide insight into an uncharacterised mechanism of innate immune evasion by this important human pathogen.


INTRODUCTION
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of COVID-19 and the recent pandemic that has had unprecedented effects upon humanity, ranging from numerous casualties to severe economic impact. It is imperative to have a thorough understanding of the pathogen-host interactions that occur during infection. A key first step is to delineate and characterise functionally the full complement of accessory proteins encoded in the SARS-CoV-2 genome. However, most studies have been informed by pre-existing research based upon the closely related SARS-CoV (referred to herein as SARS-CoV-1 to avoid confusion), which caused a relatively minor outbreak in 2002-2003. As a result, proteins that had not been annotated in studies of SARS-CoV-1 were overlooked during the initial scientific response to the COVID-19 pandemic.
SARS-CoV-2 and SARS-CoV-1 belong to the taxon Sarbecovirus, a subgenus of the genus Betacoronavirus in the family Coronaviridae. Members of the Coronaviridae possess large positive-sense, single-stranded RNA genomes (approximately 30 kb in size) which contain 7-16 protein-coding open reading frames (ORFs). The two largest ORFs, ORF1a and ORF1b, are translated from the genomic mRNA directly (with translation of ORF1b depending on ribosomes making a programmed frameshift near the end of ORF1a) and encode polyproteins pp1a and pp1ab, which are cleaved into individual functional proteins that support viral replication. The remainder of the ORFs are translated from a set of "nested" subgenomic mRNAs (sgmRNAs). The encoded proteins are either structural components of the virion or so-called "accessory" proteins: mostly dispensable for viral replication in cell culture, these latter proteins nonetheless confer significant advantages to replication in hosts, often due to interactions with innate immune pathways. The assemblage of accessory proteins varies substantially across the Coronaviridae, and can vary even between closely related coronaviral species, thus constituting an important field of research. The potential translational repertoire of SARS-CoV-2 has been the subject of many publications, with multiple groups reporting evidence of novel ORFs via a range of approaches [1][2][3][4][5][6][7][8] . Collectively these studies highlight the lack of an established, accepted SARS-CoV-2 viral proteome because the functional relevance of these reports is seldom validated experimentally 9 .
In 2020, making use of the ~54 then-available sarbecovirus genomes, we and others used comparative genomic approaches to analyse the coding capacity of SARS-CoV-2. Our analysis revealed a previously undetected conserved ORF, overlapping ORF3a in the +1 reading frame, and precisely coinciding with a region of statistically significantly enhanced synonymous site conservation in ORF3a-frame codons, indicative of a functional overlapping gene, ORF3c 10 . ORF3c was predicted independently by Cagliani et al. who termed it ORF3h 11 and Jungreis et al. who termed it ORF3c 2 . A community consensus fixed the name as ORF3c 12 . Given ORF3c was previously unknown, it had not been the focus of any pre-existing studies despite being present in SARS-CoV-1. However its continued presence throughout evolution indicates that it is beneficial to successful viral replication, immune evasion or transmission, at least in natural hosts (seemingly mainly Rhinolophus bats) [13][14][15][16] . In our comparative genomic analysis, we found ORF3c to be the only novel ORF that is conserved across sarbecoviruses and subject to purifying selection 10 . Translation of ORF3c is supported by ribosome profiling data 1 , which is not the case for the majority of other predicted novel SARS-CoV-2 ORFs. ORF3c thus represents a previously uninvestigated area of sarbecovirus research. SARS-CoV-2 infection is known to dysregulate host immune responses, specifically the type I interferon (IFN) response, resulting in the severe clinical symptoms characteristic of this pathogen 17 . Type I IFNs are essential innate cytokines that induce a host response that restricts and eliminates SARS-CoV-2 infection 18 . Host cells produce type I IFNs in response to activation of pattern recognition receptors (PRRs): host proteins that recognise molecules from either pathogens or damaged cells. 
Pathogen-associated molecular patterns (PAMPs) activate PRRs that in turn promote transcription of type I IFNs and other antiviral genes 19 . The two key cytosolic PRRs involved in the antiviral response are retinoic acid-inducible gene I (RIG-I) and melanoma differentiation-associated protein 5 (MDA5). RIG-I senses short dsRNA and 5′-ppp/pp-RNA 20 whereas MDA5 senses high molecular weight dsRNA and mRNA that lack 2′-O-methylation at the 5′ cap 21,22 . Both RIG-I and MDA5 form filaments with dsRNA to recruit mitochondrial antiviral-signalling protein (MAVS). This interaction promotes the formation of MAVS polymers at the mitochondrial outer membrane that activate transcription factors interferon regulatory factor 3 (IRF3) and nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) 23 . The activated IRF3 dimer and NF-κB complex translocate into the nucleus and bind to responsive promoters to stimulate the transcription of type I IFNs 24 and other pro-inflammatory factors. To suppress production of these molecules, viruses have evolved different strategies targeting the activation of RIG-I, MDA5 or MAVS. For example, multiple SARS-CoV-2 proteins directly and indirectly target MAVS to diminish IRF3 activation and so production of IFN-β and other pro-inflammatory proteins encoded by IRF3-responsive genes. Membrane (M) protein interacts with MAVS to suppress MAVS polymerisation, leading to diminished IFN-β production and enhanced SARS-CoV-2 replication 25 , whereas ORF9b interacts with mitochondrial import receptor subunit Tom70 at mitochondria to suppress MAVS and reduce activation of the IFN-β promoter 26 . ORF9b has also been found to bind to IKK-γ (NEMO), a protein that acts downstream of MAVS 27 . Finally, when ORF10 is overexpressed, it activates mitophagy receptors to induce MAVS elimination 28 .
Here we present an initial functional characterisation of ORF3c. We find that it localises to the mitochondrial outer membrane, a platform heavily involved in innate immune signalling. We provide evidence that ORF3c subverts the cascade of cellular antiviral responses by preventing activation of transcription from the IFN-β gene. This appears to be mediated by interactions of ORF3c with both PGAM5 (a mitochondrial serine/threonine protein phosphatase, previously known as phosphoglycerate mutase family member 5) and MAVS, alongside a subsequent cleavage of MAVS by activated caspases.

ORF3c is conserved across sarbecoviruses
Since we identified ORF3c as a previously undetected and conserved ORF in 2020 10 , numerous additional sarbecovirus sequences have been published, including many divergent sequences from bat hosts. We inspected GenBank sequence records for all unique ORF3c sequences ( Figure 1A, Figure S1). The ORF3c protein is 39-41 amino acids in length and has a predicted C-terminal transmembrane region and a shorter N-terminal hydrophobic region ( Figure 1A).

ORF3c is expressed via leaky scanning
ORF3c is a small protein (predicted MW = 4.9 kDa) and the ORF entirely overlaps with ORF3a in the +1 frame (with respect to ORF3a). We hypothesised that ORF3c is expressed via ribosomal leaky scanning on the ORF3a sgmRNA, in a similar manner to the ORF7b and ORF9b proteins of SARS-CoV-1 29,30 . This would require scanning preinitiation (43S) complexes to proceed past both the AUG start codon of ORF3a and a subsequent in-frame AUG, and then initiate at a third AUG: the start codon of ORF3c (mechanism reviewed in Firth, 2012 31 ). Both AUGs in the ORF3a frame have intermediate or weak initiation contexts (thus facilitating leaky scanning) whereas the ORF3c AUG has a strong initiation context 32 .
To test the leaky scanning hypothesis and measure the ORF3a:ORF3c relative expression levels, an expression cassette was created composed of the 5′ end of the ORF3a sgmRNA transcript (including the 77 nucleotide leader) up to the final coding nucleotide of ORF3c (excluding the stop codon). This was followed by a foot-and-mouth disease virus (FMDV) 2A sequence (which mediates co-translational separation of the polypeptide chain) and then the Renilla luciferase (RLuc) sequence. The 2A-RLuc ORF was in-frame with the AUG of either ORF3a or ORF3c. A range of mutants were created based on both of these constructs, which ablated (i) the first AUG of ORF3a; (ii) the second AUG of ORF3a; (iii) both AUGs of ORF3a; or (iv) the AUG of ORF3c (Figure 1B). In all cases, AUGs were mutated to ACG (which, however, may still allow low level initiation 31 ).
Plasmid DNA templates were transcribed in vitro by the T7 RNA polymerase and the transcripts were then purified and transfected into Vero cells in a 96-well plate format. The luciferase values were measured at 20 h post transfection. RLuc was normalised to an internal firefly luciferase (FLuc) control. Values for the ORF3c WT and ORF3a and ORF3c mutant RNAs were then normalised to those for the ORF3a WT RNAs (Figure 1C). Expression of ORF3a or ORF3c was sensitive to, respectively, mutation of the first or both ORF3a-frame AUGs, and mutation of the ORF3c-frame AUG. However, the second AUG was not noticeably utilised for ORF3a expression. As expected, mutation of the ORF3c-frame AUG did not affect ORF3a expression. With the WT sequence, the levels of ORF3c and ORF3a expression appeared to be similar to each other. Consistent with a leaky scanning model, when the first or both ORF3a-frame AUGs were mutated, ORF3c expression approximately doubled. Thus, under the conditions tested, approximately 50% of preinitiation complexes appear to scan past the first two AUGs within the ORF3a sgmRNA and initiate at the next downstream AUG to translate ORF3c. This translational efficiency would result in an approximately equal stoichiometric ratio of ORF3a:ORF3c proteins from the one sgmRNA.
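The two-step normalisation and the leaky-scanning estimate described above can be sketched as follows. This is a minimal illustration only: the luminescence readings, construct labels and function names are hypothetical placeholders and are not data or code from this study. It also assumes that, with both ORF3a-frame AUGs mutated, essentially all scanning complexes reach the ORF3c AUG (residual initiation at ACG is ignored).

```python
# Hypothetical raw readings per construct: (RLuc, FLuc). Not real data.
readings = {
    "ORF3a_WT": (1000.0, 500.0),
    "ORF3c_WT": (900.0, 450.0),
    "ORF3c_AUG1+2_mut": (2000.0, 500.0),  # both ORF3a-frame AUGs -> ACG
}


def normalise_to_fluc(rluc, fluc):
    """Step 1: express the RLuc signal relative to the internal FLuc control."""
    return rluc / fluc


def relative_to_reference(samples, reference):
    """Step 2: express every construct relative to the reference construct."""
    ref = normalise_to_fluc(*samples[reference])
    return {name: normalise_to_fluc(r, f) / ref for name, (r, f) in samples.items()}


relative = relative_to_reference(readings, "ORF3a_WT")

# Fraction of scanning complexes that bypass both ORF3a-frame AUGs,
# estimated as ORF3c expression (WT) / ORF3c expression (upstream AUGs mutated).
leaky_fraction = relative["ORF3c_WT"] / relative["ORF3c_AUG1+2_mut"]
print(relative, leaky_fraction)
```

With these placeholder numbers, ORF3c expression doubles when the upstream AUGs are removed, which corresponds to an estimated bypass fraction of one half.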
Golgi and ERGIC compartments during the viral assembly, budding and trafficking processes 44 .
ORF3c is predicted to possess a C-terminal α-helix (Figure 1A), which is likely to form a transmembrane domain 10 , suggesting that ORF3c is a single-pass integral transmembrane protein. To confirm this predicted membrane association, C-terminally HA-tagged ORF3c (ORF3c-HA) was overexpressed by transfection in Vero cells and the cells were subjected to subcellular fractionation. Cytoplasmic, membranous and nuclear fractions were probed for ORF3c-HA via immunoblotting and ORF3c was found to localise primarily within the membranous fraction (Figure 1D).

ORF3c does not multimerise
Many viruses encode similarly small transmembrane proteins that spontaneously multimerise to form ion channels within host membranes (so-called viroporins, reviewed in Nieva, 2012 45 ). Previously, we and others suggested that ORF3c may possess viroporin activity due to its size and hydrophobic nature 10,11 . To assess the ability of ORF3c to multimerise, we co-transfected ORF3c-HA and ORF3c-FLAG into HEK293T cells and performed an anti-FLAG immunoprecipitation. FLAG-tagged PGAM5 was included as a positive control to ensure interaction partners were not lost (see later results and Figure 5). The eluate was then probed for both HA and FLAG tags by immunoblotting ( Figure 1E). Whereas FLAG-tagged PGAM5 co-immunoprecipitated ORF3c-HA, ORF3c-FLAG did not. Thus, this experiment did not provide evidence for ORF3c homo-oligomerisation in cultured cells, suggesting that ORF3c is unlikely to form ion channels.

ORF3c can localise to the ER membrane in vitro
According to the predicted C-terminal helical transmembrane domain 10 , ORF3c is most likely a tail-anchored protein. Whilst this group of diverse and functionally important integral membrane proteins are present in all intracellular membranes with a cytosolic surface 46,47 , it is generally accepted that most tail-anchored proteins bearing transmembrane domains of substantial hydrophobic character are post-translationally targeted to the endoplasmic reticulum (ER), the central organelle involved in coronaviral RNA replication [48][49][50] and to which multiple SARS-CoV-2 proteins localise. In contrast, tail-anchored proteins containing transmembrane domains of reduced hydrophobicity and increased charge within the extremely short exoplasmic C-terminal region (Cexo) typically target to the mitochondrial outer membrane (MOM) 51,52 . Thus, we explored sequentially the ability of ORF3c to integrate into the membrane of the ER and mitochondria using in vitro systems specialised for each organelle.
Having exploited canine pancreatic microsomes to study the integration of other SARS-CoV-2 membrane proteins into the ER 53 , we studied ORF3c biogenesis using this system ( Figure 2A). We found that in vitro synthesised and imported ORF3c ( Figure 2B) was protected from added protease ( Figure 2C, lanes 1-3) and was resistant to extraction with alkaline sodium carbonate buffer ( Figure 2C, lanes 4-6), suggesting that ORF3c can integrate stably into the ER membrane in vitro.
To investigate its membrane topology, we modified ORF3c to facilitate the detection of ER import, incorporating an OPG2 tag either at the extreme N- or C-terminus to generate OPG2-ORF3c and ORF3c-OPG2, respectively (Figure 2B). Since the OPG2 epitope supports efficient ER lumenal N-glycosylation 54 , we synthesised radiolabelled ORF3c and its two OPG2-tagged ORF3c variants in the presence of ER membranes and used endoglycosidase H (Endo H) treatment of the resulting membrane-associated products to identify N-glycosylated species (Figure 2D, even-numbered lanes). On the basis of these studies, we found that both the N- and C-termini of OPG2-tagged ORF3c can be N-glycosylated and hence can be translocated into the lumen of ER microsomes (Figure 2D, lanes 1-6). Interestingly, when the same ORF3c proteins were analysed using semi-permeabilised HeLa cells (SP HeLa cells; Figure 2D, lanes 7-12), the extent of this N-glycosylation was greatly reduced but the total ORF3c signal increased, suggesting it may be targeted to another organelle.
Whereas canine pancreatic microsomes are highly enriched in ER-derived membranes, SP cells preserve the integrity of multiple subcellular organelles including both the ER and mitochondria 51,55 . Therefore, we considered the possibility that, in contrast to the robust N-glycosylation observed using purified ER membranes, the presence of mitochondria in SP cells may reduce the opportunity for ORF3c to mislocalise to the ER by providing access to the mitochondrial outer membrane (MOM) 51,56 .

ORF3c inserts efficiently into mitochondrial membranes in vitro
Next, we investigated the ability of ORF3c to insert into the membranes of isolated mitochondria by using an in vitro assay comparable to that used to assess ER import. Tail-anchored proteins are not found in the inner mitochondrial membrane, whereas a range of endogenous MOM proteins (such as Tom5, Tom6 and Tom7) possess this topology. We radiolabelled ORF3c in reticulocyte lysate and incubated the protein with mitochondria purified from HEK293T cells. ORF3c increasingly associated in vitro with mitochondria over time (Figure 2E). In contrast to the mitochondrial matrix protein COX4-1, association of ORF3c with mitochondria occurred independently of an inner mitochondrial membrane potential. Following ORF3c import, mitochondria were treated with proteinase K to degrade the non-imported protein. We observed a gradual increase in ORF3c signal intensity with time (Figure 2E, lanes 6-10), suggesting that the protein becomes proteinase K resistant, probably due to membrane integration. To rule out that ORF3c targeting occurred due to a distinct contaminating membrane in the mitochondrial preparation, anti-Tom22 antibodies immobilised on magnetic beads were used to immuno-isolate mitochondria after ORF3c import. The resulting immuno-isolated mitochondria showed clear enrichment for ORF3c and Tom20, while ERLIN2 (an ER marker) was not co-purified (Figure 2F). Accordingly, we conclude that ORF3c is targeted selectively to mitochondria.
To address whether imported ORF3c was integrated stably into mitochondrial membranes, we imported ORF3c into purified mitochondria that subsequently were treated with sodium carbonate buffer (pH 11.4). Under these conditions, integral membrane proteins are retained in the membrane pellet, whilst soluble and peripheral membrane proteins are released into the supernatant. Upon carbonate treatment, ORF3c was present exclusively in the membrane pellet fraction, indicating that it was incorporated into the lipid phase (Figure 2G, lanes 2-3). Alternatively, mitochondrial membranes were solubilised in detergent (Triton X-100). Upon detergent treatment, ORF3c was largely released into the supernatant, although a fraction remained unsolubilised (Figure 2G, lanes 4-5). The residual amount of ORF3c recovered in the pellet fraction after Triton X-100 treatment indicated that ORF3c is prone to aggregation in the presence of the non-ionic detergent. In conclusion, our in vitro data strongly suggest that ORF3c is targeted to and inserted in mitochondrial membranes.

ORF3c localises to the mitochondria in cells
To confirm the mitochondrial localisation of ORF3c in intact cells, Vero cells were transfected transiently with ORF3c-HA and immunofluorescence microscopy was performed to identify its subcellular localisation. Clear membrane staining was observed and cells were co-stained using antibodies against various sub-cellular markers. No colocalisation was observed between ORF3c-HA and tubulin, ER markers (calnexin, PDI), Golgi markers (RCAS1, TGN46), or endolysosomal markers (early: EEA1, late: CD63, lysosome: LAMP1). However, clear co-localisation was observed using antibodies against the MOM import receptors Tom20 and Tom70 ( Figure 3A).
To ensure that the C-terminal HA tag was not mis-directing ORF3c-HA localisation, alternative epitope tags (OPG2 and Strep) were fused to either the N or C terminus of ORF3c. Ectopic expression of all ORF3c versions displayed clear co-localisation with endogenous Tom70 ( Figure 3B). Furthermore, the mitochondrial localisation of ORF3c did not alter over 48 h post-transfection ( Figure S2).

PGAM5 interacts specifically with ORF3c
To identify host interaction partners, we immunoprecipitated ORF3c-HA from transfected Vero cells grown in labelled medium (stable isotope labelling by amino acids in cell culture, SILAC). ORF3c-HA-binding proteins were compared in triplicate to a HA-only control. Immunoprecipitated samples were subjected to trypsin digestion and then analysed by liquid chromatography with tandem mass spectrometry (LC-MS/MS).
Six proteins were identified as being enriched in ORF3c-HA samples relative to HA-only control samples with a p-value, q-value and FDR all < 0.05 (Figure S3), and a minimum of 0.5 log2 (i.e. 1.41-fold) enrichment. These were: PGAM5, RPL8, EIF6, ARPC5, CAVIN1 and CAPZB. The full set of quantified filtered proteins and the list of significantly enriched proteins are included as supplementary files (Supplemental Table 1, Supplemental Table 2). Of these six proteins, PGAM5 was chosen for further investigation due to its recently reported role in antiviral signalling 57,58 . The ORF3c:PGAM5 interaction was confirmed by co-immunoprecipitation of PGAM5-FLAG and ORF3c-HA (Figures 1E, 5B).
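The enrichment criteria stated above (p-value, q-value and FDR all below 0.05, and at least 0.5 log2 enrichment) can be expressed as a simple filter. The sketch below is illustrative only: the protein names and statistics are hypothetical placeholders rather than values from the supplementary tables, and the field names are assumptions made for this example.

```python
ALPHA = 0.05               # threshold applied to p-value, q-value and FDR
MIN_LOG2_ENRICHMENT = 0.5  # minimum log2(ORF3c-HA / HA-only) ratio


def log2_to_fold(log2_ratio):
    """Convert a log2 ratio into a linear fold change."""
    return 2.0 ** log2_ratio


def is_enriched(hit):
    """Apply all four criteria; a hit must pass every one of them."""
    return (
        hit["log2_ratio"] >= MIN_LOG2_ENRICHMENT
        and hit["p"] < ALPHA
        and hit["q"] < ALPHA
        and hit["fdr"] < ALPHA
    )


# Hypothetical quantified proteins (placeholder names and numbers).
hypothetical_hits = [
    {"protein": "PROTEIN_A", "log2_ratio": 1.2, "p": 0.01, "q": 0.02, "fdr": 0.02},
    {"protein": "PROTEIN_B", "log2_ratio": 0.3, "p": 0.01, "q": 0.02, "fdr": 0.02},
    {"protein": "PROTEIN_C", "log2_ratio": 0.9, "p": 0.20, "q": 0.30, "fdr": 0.30},
]

enriched = [h["protein"] for h in hypothetical_hits if is_enriched(h)]
print(round(log2_to_fold(MIN_LOG2_ENRICHMENT), 2), enriched)
```

The conversion shows why a 0.5 log2 cut-off corresponds to roughly 1.41-fold enrichment, since 2 raised to the power 0.5 is the square root of 2.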
PGAM5 is a single-pass transmembrane protein with a cytosolic serine/threonine phosphatase domain 59 , which localises to the MOM via its N-terminal transmembrane domain 60 . Here it oligomerises into dodecamers, the catalytically active state [61][62][63] . Besides roles in cell death-related processes and mitochondrial dynamics, recently PGAM5 was found to play a role in upregulating IFN-β signalling during infection by viruses from multiple families 57,58 , acting via a direct interaction with MAVS, a tail-anchored protein that also localises to the MOM following oligomerisation 64,65 . PGAM5 multimerisation, induced as a result of viral or poly(I:C) stimulation, causes increased phosphorylation of TBK1 and IRF3 and a subsequent increased transcription of IRF3-responsive genes including IFN-β. Interestingly, this function of PGAM5 appears to be independent of its phosphatase activity, despite the dodecameric form of PGAM5 being catalytically active 57 . It has also been reported that SARS-CoV-2 infection results in increased ubiquitylation of PGAM5, accompanied by an overall decrease in PGAM5 protein levels, suggesting it is targeted to the proteasome during infection 66 . Therefore, we investigated the ability of ORF3c to dysregulate innate immune signalling in stimulated cells, hypothesising that the ORF3c:PGAM5 interaction would abrogate the PGAM5-driven antiviral effect.

ORF3c inhibits IFN-β signalling
We co-transfected ORF3c with luciferase reporter plasmids, wherein luciferase expression is driven by a range of promoters responsive to innate immune signalling, into HEK293T cells that were stimulated with either TNF-α, PMA, Sendai virus (SeV) or type I IFN. These would stimulate the promoters responsive to, respectively, NF-κB, activator protein 1 (AP-1), IFN-β and ISG56.1, or the interferon-stimulated response element (ISRE). Similar approaches have been used extensively to identify SARS-CoV-2 immune antagonists 17,27,[67][68][69][70] . The results indicated that transcription from the IFN-β-responsive promoter, but not from those responsive to NF-κB or ISRE, is inhibited by ORF3c (Figure 4A). This appears to be mediated by the AP-1 element within the IFN-β promoter (Figure 4A, 4B) in a dose-dependent manner (Figure S4A). Other coronaviral proteins, known to antagonise this pathway, were also included as controls (Figure S4B, S4C). Importantly, we observed the same trend regardless of the ORF3c tag position, or indeed using untagged ORF3c (Figure 4B). Intriguingly, we could not detect the N-terminal HA tag, indicating that the N terminus may possess an internal cleavage site (Figure 4B, see also Figure 6A). Finally, to decrease the probability of artefacts arising due to the system choice (a concern that has been raised in relation to other potential SARS-CoV-2 immune antagonists 71 ), we performed qRT-PCR analysis of SeV-infected A549 cells expressing ORF3c-FLAG stably (Figure 4D). This indicated that the IFN-β, ISG56 and ISG54 mRNAs were specifically downregulated in ORF3c-expressing SeV-stimulated cells (Figure 4C), suggesting that this blockade is at the transcriptional level. The downregulation of ISG56 mRNA level in qRT-PCR is inconsistent with the previous ISG56.1 reporter gene assay, which might be due to sensitivity differences between the two techniques, and between HEK293T and A549 cells.
Given that A549 cells are derived from human alveolar cells and are widely used to study respiratory virus infection, the qRT-PCR data from these cells, which report transcription of endogenous genes, are more compelling than the reporter gene assays. Collectively, these data provide evidence that ORF3c might contribute to the suppression of the IFN-β signalling pathway during SARS-CoV-2 infection in cultured cells.

ORF3c interacts with MAVS to restrict IFN-β promoter activation
Next, we co-transfected FLAG-tagged proteins from within the IRF3-signalling pathway (RIG-I CARD, MDA5, MAVS, TRIFΔRIP, TBK1 or IRF3-5D 72 ) alongside the IFN-β promoter-driven FLuc reporter and ORF3c. These proteins would activate the IFN-β promoter from various nodes and allow us to elucidate the stage(s) targeted by ORF3c. The results indicated that ORF3c acts upstream of the TRAF3/TBK1 nexus (Figure 5A). These data, alongside published results that PGAM5 interacts with MAVS 57 , led us to hypothesise that ORF3c interacts with MAVS. To investigate this, we performed a co-immunoprecipitation. Tom20 was included to assess whether we were isolating mitochondrial membrane proteins nonspecifically. The results indicate that both MAVS and PGAM5 are specific interaction partners of ORF3c (Figure 5B). Immunofluorescence experiments were also strongly supportive of co-localisation of ORF3c and MAVS in intact membranes (Figure S5). MAVS overexpression is sufficient to induce expression from the IFN-β and AP-1 responsive promoters 65,73 . Thus we hypothesise that sequestration of MAVS due to interaction with ORF3c may underlie the observed decrease in transcription from the IFN-β gene (Figure 5C).

ORF3c induces MAVS cleavage
During these experiments, we noticed an additional, lower molecular weight specific band appearing in the MAVS immunoblots upon ORF3c co-transfection. This was reproducible but did not appear upon MAVS/ORF3a co-transfection, indicating this MAVS cleavage was an ORF3c-specific effect (Figure 6A). The cleavage was inhibited by treatment with the pan-caspase inhibitor z-VAD and mimicked the effects of apoptosis induction via the Bcl-2 inhibitor ABT737 (Figure 6B), suggesting that ORF3c-induced MAVS cleavage may be mediated by caspases and accompanied by apoptosis. To further investigate the ORF3c-mediated MAVS cleavage, we examined MAVS mutants that had reported resistance to viral protein or caspase-mediated cleavage [74][75][76] . We found that the MAVS point mutations D429A/D490A, but not Q427A, E463A or C508R, blocked the ORF3c-induced cleavage, indicating the cleavage is mediated by caspase-3 (Figure 6C).
Intriguingly, ORF3c is not expressed in the SARS-CoV-2 variant of concern delta (B.1.617.2) lineage due to a CAG to UAG mutation that introduces a premature termination codon at codon 5. The delta (B.1.617.2) variant therefore represents an opportunity to examine the effects of ORF3c depletion, albeit with the caveat that many other mutations co-exist in this genome compared to earlier lineages. We utilised this naturally occurring difference to examine MAVS levels in infected cells ( Figure 6D). In these experiments the relative intensities of the three MAVS-related bands differed from the previous experiment ( Figure 6A), possibly as a result of the different cell type (A549 versus HEK293T) or detection of endogenous rather than FLAG-tagged MAVS. Nonetheless, we found that MAVS is severely downregulated in infected cells compared to mock, and both the downregulation and the ratio of cleaved MAVS to full-length MAVS were somewhat reduced in delta-infected cells compared to cells infected with an early SARS-CoV-2 lineage possessing an intact ORF3c. Immunofluorescence staining for mitochondrial markers (Tom20 and Tom70) ( Figure 7A, Figure S6A) indicated that, although mitochondrial staining was reduced in infected cells compared to uninfected cells, no clear differences were apparent between the variants. Additionally, by transmission electron microscopy, no phenotypic differences in the mitochondria of cells infected with different variants were observed ( Figure 7B). This is perhaps unsurprising, due to the multiple redundant mechanisms of IFN dysregulation conferred by different SARS-CoV-2 proteins.

SARS-CoV-2 ORF3c variants in the human host
As noted above, ORF3c coding capacity is lost in the SARS-CoV-2 delta (B.1.617.2) variant, due to the appearance of a premature termination codon. To further study the appearance and dynamics of different ORF3c variants we queried 12,906,225 SARS-CoV-2 sequences from the GISAID database 77 with coverage of the ORF3c region. Seven ORF3c amino acid variants were present at an abundance of ≥ 0.1% of the total, namely the original variant (WT), besides R36I, L21F/R36I, S22L, Q5Y, K17E and the aforementioned delta premature termination codon variant (PTC) (Figure 7D, Figure S6B). The CAG to UAG substitution at codon 5 of ORF3c that gave rise to the PTC variant pseudoreverted to UAU in the Q5Y variant. This restored expression of full-length ORF3c but with a Q to Y amino acid change at position 5 (the overlapping ORF3a amino acids are SD in WT, LD in PTC, and LY in Q5Y). This Q5Y pseudorevertant increased rapidly in mainland Europe sequencing reports around July 2021 before levelling off and eventually dying away as the delta virus variant was replaced with the omicron virus variant (B.1.1.529) in late 2021 (Figure 7E). Reporter gene assays indicated that the ORF3c Q5Y pseudorevertant was less efficient than the WT at antagonising IFN-β production but still had a marked effect (Figure 7C).
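The codon changes at ORF3c position 5 and the abundance cut-off described above can be checked with a short sketch. It uses only the three codons named in the text and the standard genetic code; the function and variable names are illustrative assumptions, and the minimum-count calculation simply restates the ≥ 0.1% threshold in terms of the number of sequences queried.

```python
# Standard genetic code entries for the three codons discussed ("*" = stop).
CODON_TABLE = {"CAG": "Q", "UAG": "*", "UAU": "Y"}


def translate(codon):
    """Return the amino acid (or '*' for a stop) encoded by an RNA codon."""
    return CODON_TABLE[codon]


def substitutions(codon_a, codon_b):
    """Count nucleotide differences between two codons of equal length."""
    return sum(a != b for a, b in zip(codon_a, codon_b))


wt, ptc, q5y = "CAG", "UAG", "UAU"
history = [(codon, translate(codon)) for codon in (wt, ptc, q5y)]

# Each step is a single substitution; Q5Y differs from WT at two positions,
# so it restores the reading frame without restoring the original codon.
steps = (substitutions(wt, ptc), substitutions(ptc, q5y), substitutions(wt, q5y))

# Smallest sequence count that meets the >= 0.1% abundance threshold.
TOTAL_SEQUENCES = 12_906_225
min_count = -(-TOTAL_SEQUENCES // 1000)  # integer ceiling of total / 1000
print(history, steps, min_count)
```

This makes explicit why Q5Y is described as a pseudorevertant: the stop codon is removed by a second mutation at a different nucleotide, rather than by reversal of the original change.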

DISCUSSION
Coronaviruses encode a variety of accessory proteins in their genomes, which are nonessential for RNA replication but confer advantageous properties to the virus allowing efficient viral propagation in the host. Many of these accessory proteins are known antagonists of the innate immune response and are useful targets for antiviral treatment strategies. However their variability across the Coronaviridae often means that studies cannot be extrapolated to other members; for example, SARS-CoV-1 ORF3b, which overlaps the 3′ region of ORF3a, is truncated in SARS-CoV-2 5 ; and ORF10 in SARS-CoV-2 is entirely lacking in SARS-CoV-1. Previously, using comparative genomics, we identified ORF3c, an accessory protein conserved across the Sarbecovirus subgenus 10 . Here we have presented a functional analysis of ORF3c, revealing it to be a tail-anchored transmembrane protein that appears to be inserted into the mitochondrial outer membrane, where it interacts with MAVS and PGAM5, and reduces IFN-β signalling.
To the best of our knowledge, this is the first report of PGAM5 being involved in coronaviralhost protein:protein interactions, although it had been identified as a potential target for viral-induced proteasomal degradation 66 . PGAM5 localises to the MOM 60 , although there are also reports of PGAM5 within the inner mitochondrial membrane with the C-terminal catalytic domain facing the intermembrane space 78,79 . It has been suggested that its location may depend on cellular stress levels: PGAM5 is known to activate the MAP kinase pathway by dephosphorylating ASK1 (associated with cellular stress) 80 and is involved in numerous cell death-related processes. Primarily, it is thought to regulate mitochondrial dynamics (fusion versus fission). Its ability to promote or suppress various cell death pathways is a contentious issue (recently reviewed in Cheng, 2021 81 ): whilst reported initially as a pro-necrotic factor 59,82,83 , this has been disputed 84 . Equally controversial are its role(s) in apoptosis: it has been reported to suppress apoptosis in certain models 85 , yet be essential for apoptosis induction in others [86][87][88] . What does appear consistent -and dependent on the phosphatase function of PGAM5 -is a role in the induction of mitophagy, an organelle-specific form of autophagy that protects the cell against necroptosis by selectively degrading damaged mitochondria 87,[89][90][91][92] .
It is unlikely that ORF3c affects the phosphatase function of PGAM5, given this is thought to be independent of the role of PGAM5 in immune signalling. PGAM5 multimerisation has been shown to be required for both IFN-β upregulation and induction of many cell death-related events. One possibility is that the ORF3c-induced decrease in IFN-β may be at least partially caused by a reduction in PGAM5 multimerisation due to its interaction with ORF3c, and a subsequent redirection of PGAM5 to the proteasome. Equally, it is possible that PGAM5 multimerisation continues in the presence of ORF3c, but the PGAM5:MAVS interaction is ablated. A third hypothesis involves the formation of a potential trimeric complex (ORF3c:MAVS:PGAM5), resulting in the functional abrogation of both host proteins. As the self-multimerisation of MAVS and PGAM5 are independent events 57 , this would be an efficient mechanism of sequestering both potential innate response activators with a single viral protein.
In addition to the observed ORF3c:MAVS and ORF3c:PGAM5 interactions that may inhibit PGAM5:MAVS stimulation of IFN-β production, we also observed cleavage of MAVS when ORF3c was overexpressed and this cleavage appeared to be driven by caspase-3 suggesting a link to apoptosis. Whether ORF3c-driven apoptosis is an artefact of overexpression outside of the context of virus infection (where other viral proteins might inhibit apoptosis) is currently unknown. While we found a tantalising suggestion of differences in MAVS cleavage between delta and non-delta infections, the overall strong downregulation of full-length MAVS in either infection compared to mock and the presence of cleaved MAVS even in mock-infected cells in this system, besides potential effects of other differences between delta and non-delta viruses, make it difficult to draw robust conclusions at this stage. Intriguingly, ORF3a of both SARS-CoV-1 and SARS-CoV-2, which localises to the plasma membrane, has been implicated in apoptosis induction; however in the case of SARS-CoV-1 this was mapped to the cytosolic C-terminal domain and therefore could not have been an incorrectly attributed function of ORF3c 93,94 .
PGAM5 has been reported to be cleaved (within the N-terminal transmembrane domain) in response to mitochondrial dysfunction and mitophagy, specifically during outer membrane rupture, resulting in its release into the cytosol 78,79,95 . Although we did not observe cleavage of PGAM5 in the presence of ORF3c (indicating mitophagy is not occurring), it would be of interest to confirm this with other methods. Equally, it would be of interest to analyse the phosphorylation patterns of PGAM5 when bound to ORF3c, given that the phosphorylation status of this protein has numerous effects upon the downstream signalling pathways that it activates. During preparation of this manuscript, data became publicly available indicating that ORF3c overexpression does not affect mitophagy, despite its mitochondrial localisation, but rather blocks autophagy by causing autophagosome accumulation 96 . This finding supports our preliminary evidence that ORF3c may direct cells towards apoptosis in preference to other cell death pathways, possibly by sequestration of PGAM5 and prevention of mitophagy activation.
Among SARS-CoV-2 proteins, ORF3c is not alone in having an inhibitory effect on IFN-β expression; however the mode of action differs significantly between the proteins involved. Some proteins directly reduce IFN-β mRNA or protein levels: ORF6, ORF8 and N all have similar effects to ORF3c and reduce IFN-β mRNA (and hence protein) levels although, unlike ORF3c, ORF6 and ORF8 simultaneously reduce expression from ISRE-containing promoters 67,68,97 . ORF6 has also been shown independently to reduce IRF3 and STAT1 nuclear translocation 17,36,69,98 ; in comparison, N is thought to inhibit the TRIM25:RIG-I interaction 99,100 . NSP13 6,69 and NSP6 69 bind TBK1 directly, preventing IRF3 phosphorylation. Other sarbecoviral proteins inhibit type I IFN activation as an indirect result of their enzymatic function: SARS-CoV-1 NSP16 reduces MDA5 and IFIT activation by capping the viral RNA 101 ; NSP14 reduces IFN levels by shutting down host translation 102 . Still others (NSP1 and NSP6) suppress the signalling induced by type I IFN, whilst leaving protein levels unaffected 69,103 . The convergent effects of these viral antagonistic proteins, which collectively target multiple layers of the immune signalling cascade, no doubt combine to reduce the host antiviral response and increase virus fitness in the natural host.
Despite the redundancy in IFN antagonists, those that operate directly from a mitochondrial location are uncommon amongst characterised coronaviral proteins. ORF9b and ORF10 are the exceptions. Similar to our observations for ORF3c, these proteins localise to the mitochondria upon overexpression and are able to dampen the immune response in the absence of other coronaviral proteins 28,40,104,105 , indicating they may each act in an unassisted fashion and not from within a virally encoded protein complex. ORF9b does, however, interact with host Tom70 6,26,104,105 which in turn is known to interact with MAVS 106,107 . It has been suggested that the ORF9b:Tom70 interaction may lead to either apoptosis or mitophagy, as the levels of functional Tom70 will affect both of these processes; but these hypotheses have not been validated experimentally 105 . This provides an interesting parallel to ORF3c, which appears to operate at the same cellular location as Tom70 (specifically, the MOM 108 ), yet we did not observe ORF3c and Tom70 to co-immunoprecipitate (data not shown). It remains possible that an indirect, transient interaction may occur between ORF9b, ORF3c, Tom70 and MAVS. Additionally, unlike ORF3c, ORF9b has been observed to inhibit the IKK-γ (NEMO) cascade 27 , suggesting that ORF9b has additional functions downstream of MAVS, specifically inhibiting the NF-κB pathway. Thus it appears that sarbecoviruses have evolved complementary approaches, mediated by ORF9b and ORF3c respectively, to subvert IFN-β signal transduction and reduce mitochondrial innate immune pathway activation from within the mitochondrion itself. The mechanism employed by ORF10 is different yet again; however this protein is not conserved across the subgenus. ORF10 localises to the mitochondrion where it interacts with the mitophagy receptor NIX, to activate mitophagy and thereby eliminate aggregated MAVS 28 .
This may in part explain why downregulation and degradation of MAVS is still visible during infection with a delta variant lacking ORF3c (Figure 6D). Downregulation of MAVS has also been reported from proteome-wide studies of SARS-CoV-2 infected cells (although the level of reduction appears to depend on the model system) 109,110 .
Prior to the identification of ORF3c, a screen of viral and host protein:protein interactions did not identify either MAVS or PGAM5 as probable interaction partners for any SARS-CoV-2 protein 6 . This was reflected in a thorough literature review 111 . Equally, there are remarkably few confirmed direct interactions of MAVS with SARS-CoV-2 proteins. Although there are some reports of the M protein interacting with MAVS 25,70 , this is not reconciled with the lack of a mitochondrial localisation for the M protein which is found consistently at the ER and Golgi 70,105 . As such, the relevance of this potential interaction during an actual viral infection is open to question. In short, ORF3c is the only conserved sarbecoviral protein that has been shown to bind directly to MAVS within the MOM, where the majority of activated MAVS would be located during a viral infection.
There are several obstacles currently impeding further progress. For example, analysis of ORF3c-HA transfected cell lysates following digestion with trypsin or chymotrypsin and LC-MS/MS analysis failed to identify ORF3c-derived peptides even when inclusion lists of predicted ORF3c peptides were employed, likely due to high hydrophobicity of the peptides. This explains why multiple published analyses using trypsin digestion have failed to identify the ORF3c protein during infection (data not shown) 9,112 . Equally, our own work has been limited by the poor immunogenicity of ORF3c: a peptide-raised rabbit antibody was not reactive against transfected cell lysates, nor was a sheep polyclonal antibody raised against the entire ORF3c protein (data not shown). However, we are confident that these hurdles will be overcome with time.
The discovery of ORF3c necessitates a reassessment of previous sarbecoviral ORF3a-targeted studies, which may have also included ORF3c during protein overexpression. DNA-based constructs (although useful and often necessary) create an artificial system, because these vectors generally exclude viral untranslated regions and are designed to optimise expression from the desired AUG codon. Thus the degree of ORF3c expression, alongside the desired ORF3a, remains an unknown factor for most previous studies. It is also probable that ORF3c-mediated effects were overlooked in historical SARS-CoV-1 studies due to the extensive use of Vero cells, which allow efficient replication of many coronaviruses but are deficient in type I IFN production 113 . For example, inadvertent deletion of ORF3c via ORF3a mutation results in only minor attenuation in these cells, as measured by viral infectious titre 114,115 (although, notably, deletion of the entire ORF3a region reduced cytopathic effect and cell death 116 ).
Sarbecoviruses appear to be primarily bat viruses (Figure S1) and ORF3c appears to be conserved throughout this clade (with the exception of two different sarbecovirus sequences from Rhinolophus hipposideros where ORF3c is truncated; Figure S1B). In the human host, ORF3c is clearly not essential (as demonstrated by the success of the delta variant where ORF3c is disrupted). However, whether or not ORF3c provides a selective advantage in the human host is unclear. It is possible that the observed loss, restoration and variations of ORF3c in the human host may be random events whose effects on virus fitness are outweighed by increases in virus fitness conferred by other mutations (e.g. in the spike protein) with ORF3c variations being "carried along". The importance of ORF3c in the human host presumably will become clearer as the virus adapts to long-term persistence in the human population. Similarly, other hosts such as the palm civet Paguma larvata and the Malayan pangolin Manis javanica may be intermediate hosts to which these viruses have not fully adapted. The observed Q5Y pseudorevertant of the delta PTC truncation is curious and may reflect a selective advantage of restoring ORF3c protein expression, but it might also have been a random event. Notably Q5 is perfectly conserved across the sarbecovirus ORF3c sequences (Figure S1A), suggesting that a Q at this position is functionally important, at least in bats. Sarbecoviruses have many different ways to antagonise host innate immunity and it may be that ORF3c is redundant in the human host (or its relative importance may also depend on cell type, host genetic background or disease state).
Although not essential, ORF3c may still lead to an increase in virus fitness in the human host. Future work will be needed to compare WT and ORF3c knockout viruses in both human and bat cell lines, besides animal models.

ACKNOWLEDGMENTS
The authors would like to thank Dr James Hastie and the staff at the MRC Protein Phosphorylation and Ubiquitylation Unit, University of Dundee, for providing anti-3c custom sheep antibodies. We would also like to thank the UK Health Security Agency for providing virus isolates. We gratefully acknowledge all data contributors, i.e. the Authors and their Originating laboratories responsible for obtaining the specimens, and their Submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which some of this research is based.

DECLARATION OF INTERESTS
The authors declare no competing interests.
For DNA transfections, cells were seeded in 6-well plates at 30-40% confluency the day prior to transfection. Four

Generation of tagged ORF3c expression constructs
The coding sequence of ORF3c fused with Strep or HA coding sequences (either N- or C-terminal) was amplified via PCR and each amplicon was inserted into the pCAG-PM vector 119,120 using AflII and PacI restriction sites. A GGSGGS linker was used in these cases. The resulting plasmids were confirmed by Sanger sequencing (University of Cambridge, Department of Biochemistry DNA Sequencing Facility).
OPG2-tagged ORF3c plasmids were generated by site-directed mutagenesis (Stratagene QuikChange, Agilent Technologies) and confirmed by DNA sequencing (GATC, Eurofins Genomics). To improve the signal intensity of in vitro synthesised radiolabelled proteins, linear DNA templates containing an additional five methionine residues (5M) at the C terminus were generated by PCR, and linear DNA templates were transcribed into mRNA using T7 polymerase (Promega).
FLAG-tagged and untagged ORF3c expression vectors were constructed by PCR amplification from pCAG.ORF3a and inserted into pCAG-FLAG or pcDNA6B-FLAG. FLAG-tagged ORF3c mutant Q5Y was prepared by site-directed mutagenesis using a commercial kit (New England Biolabs, E0554S).

Generation of an inducible ORF3c-expressing cell line
Inducible A549 cells expressing ORF3c were prepared by lentivirus transduction. The lentivirus preparation and infection have been described 123 . In brief, A549 cells were transduced with a lentivirus vector expressing a tetracycline repressor (TetR) linked to a nuclear localisation signal (NLS) and enhanced green fluorescent protein (EGFPnlsTetR). The transduced cells were selected with medium supplemented with 0.5 mg/mL G418, and then the EGFPnlsTetR-positive cells were sorted by fluorescence-activated cell sorting (FACS). The sorted cells were transduced with a second lentivirus vector expressing FLAG-tagged ORF3c. Transduced cells were then selected with medium supplemented with 500 ng/mL puromycin.

RT-qPCR
Inducible A549 cells expressing ORF3c-FLAG or empty vectors were seeded in triplicate at 4 x 10^5 cells per well on 24-well plates. The next day, the cells were mock-induced or induced with 100 ng/mL doxycycline overnight. Then, the cells were infected with SeV for 6 h to stimulate IFN-β transcription. The cells were harvested and total RNA was extracted using RNeasy Mini Kit (Qiagen, 74106). Thereafter, 500 ng RNA was used in cDNA synthesis using SuperScript III reverse transcriptase (Invitrogen, 18080093). The expression of genes of interest was analysed by qPCR. Each cDNA sample was run in duplicate, using SYBR Green Master Mix (Thermo Fisher Scientific, 4309155). A ViiA 7 real-time PCR system (Thermo Fisher Scientific) was used to determine each reaction cycle threshold. Amplification of the investigated genes was normalised to GAPDH amplification. Fold induction of each gene was calculated relative to the unstimulated control of each condition.
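The normalisation described above can be illustrated with a minimal sketch. It assumes the widely used 2^-ΔΔCt calculation with approximately 100% amplification efficiency, which the text does not state explicitly; the function name and all Ct values below are hypothetical and serve only to show the arithmetic.

```python
def fold_induction(ct_target_stim, ct_ref_stim, ct_target_unstim, ct_ref_unstim):
    """Fold induction by the 2^-ddCt method (assumes ~100% primer efficiency)."""
    # Normalise the gene of interest to the reference gene (e.g. GAPDH).
    d_ct_stim = ct_target_stim - ct_ref_stim
    d_ct_unstim = ct_target_unstim - ct_ref_unstim
    # Compare the stimulated sample with its own unstimulated control.
    dd_ct = d_ct_stim - d_ct_unstim
    return 2.0 ** (-dd_ct)


def mean(values):
    """Average the technical duplicates of one cDNA sample."""
    return sum(values) / len(values)


# Hypothetical cycle thresholds; each list holds the two technical duplicates.
ifnb_stim = mean([24.0, 25.0])      # IFN-beta, SeV-stimulated
gapdh_stim = mean([18.0, 19.0])     # GAPDH, SeV-stimulated
ifnb_unstim = mean([31.0, 32.0])    # IFN-beta, unstimulated
gapdh_unstim = mean([18.0, 19.0])   # GAPDH, unstimulated

fold = fold_induction(ifnb_stim, gapdh_stim, ifnb_unstim, gapdh_unstim)
print(f"IFN-beta fold induction: {fold:.1f}")
```

Running the calculation separately for induced and mock-induced cells would give one fold-induction value per condition, which is the quantity compared between ORF3c-expressing and empty-vector cells.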

Leaky scanning assays
A plasmid was engineered in which a T7 promoter sequence preceded the first 77 nucleotides (the leader sequence) of the ORF3a subgenomic RNA. This was followed by the ORF3a coding region, up to but excluding the stop codon of ORF3c. This was followed by a foot and mouth disease virus 2A peptide sequence and then a RLuc ORF. The RLuc was in-frame with the AUG (start) codon of either ORF3a or ORF3c; the plasmids were termed pL_3a_SGRluc and pL_3c_SGRluc respectively. These plasmids were linearised with FastDigest AanI (Invitrogen) and RNA was in vitro transcribed using the T7 mMessage kit (Invitrogen) and terminated with DNase treatment. RNA was purified with a silica-gel based membrane method (RNA Clean and Concentrator, Zymo Research) and the integrity was confirmed by gel electrophoresis. FLuc RNA was used as a transfection control. This RNA was generated by T7 in vitro transcription, using an in-house FLuc plasmid linearised with BamHI as described 124 .
RNA transfections were conducted in triplicate in 96-well plates. Vero cells were seeded the day prior at 40% confluency (1 x 10^5 cells per well). One hundred and fifty ng of L3a_SGRluc RNA (or that of a derived mutant) plus 30 ng of FLuc RNA was transfected into each well using 1

SDS-PAGE and immunoblots
Lysates from transfected cells were analysed by sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE) using Tris-Glycine, at the percentage polyacrylamide suitable for resolution of the protein of interest. Precast Novex 10-20% tricine protein gels (Thermo Fisher Scientific) were used to resolve proteins under 10 kDa. Proteins were transferred to 0.

Immunoprecipitation
To detect protein-protein interactions, HEK293T cells (3 x 10^6 cells per plate) were transfected with 2 μg of FLAG-tagged GFP, MAVS, TOMM70, TOMM20 or PGAM5 plasmids and 2 μg of ORF3c-HA expressing plasmid. At 48 h post transfection, transfected cells were washed with ice-cold PBS and lysed with cell lysis buffer (200 mM NaCl, 50 mM Tris, 1% NP40, pH 7.0) supplemented with cOmplete Mini EDTA-free protease inhibitor cocktail (Roche, 11836170001). The lysates were cleared by centrifugation at 13,000 rpm, and the FLAG-tagged proteins were immunoprecipitated with anti-FLAG agarose (A2220, MERCK). Input samples were collected as described above.

Confocal microscopy
Vero cells were transfected transiently with ORF3c plasmids (described above). At 24 h post-transfection, cells were trypsinised and re-seeded upon sterile 13 mm glass coverslips. At 48 h post-transfection, the cell culture medium was removed and coverslips were washed once with PBS prior to fixation with 4% PFA/PBS. For MitoTracker staining, the stock solution of MitoTracker® Red CM-H2XRos (Invitrogen, M7513) was prepared with DMSO to a concentration of 1 mM. Before staining live cells on coverslips, MitoTracker was diluted to 100 nM with pre-warmed DMEM without serum. The cells were washed twice with pre-warmed DMEM to remove serum and then incubated with MitoTracker-containing medium at 37 °C for 1 h before fixation.
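As a worked example of the MitoTracker dilution above (1 mM stock to 100 nM working concentration, a 10,000-fold dilution), the following sketch applies C1V1 = C2V2. The 10 mL final volume is a hypothetical figure chosen only for illustration, and the function name is likewise an assumption.

```python
def stock_volume_ul(stock_molar, final_molar, final_volume_ul):
    """Volume of stock (in uL) needed, from C1 * V1 = C2 * V2."""
    if final_molar > stock_molar:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_molar * final_volume_ul / stock_molar


STOCK = 1e-3      # 1 mM MitoTracker stock in DMSO (from the text)
WORKING = 100e-9  # 100 nM working concentration (from the text)
FINAL_UL = 10_000.0  # hypothetical: 10 mL of serum-free DMEM

dilution_factor = STOCK / WORKING
volume_needed = stock_volume_ul(STOCK, WORKING, FINAL_UL)
print(f"dilution factor: {dilution_factor:.0f}-fold")
print(f"stock needed: {volume_needed:.2f} uL in {FINAL_UL / 1000:.0f} mL medium")
```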
For SARS-CoV-2 infections, A549 +ACE2 +TMPRSS2 cells were grown on glass-bottomed 24-well plates (MatTek, P24G-0-13-F). Cells were infected at an MOI of 1 and placed on a rocker at room temperature for 2 h. The inocula were removed and replaced with full medium. At 24 h post infection plates were submerged in 4% PFA/PBS for 20 min to fix cells.
Free aldehydes were quenched with 15 mM glycine/PBS before cells were permeabilised with 0.1% saponin/PBS. Blocking and subsequent steps were performed using 1% BSA/0.01% saponin/PBS. Coverslips were inverted into droplets of primary antibody for 1 h, before being washed three times with PBS. Coverslips were then placed onto a second droplet containing fluorophore-conjugated secondary antibody for 45 min. Coverslips were washed three times before being mounted on to glass slides using ProLong Gold Antifade mountant containing DAPI (Invitrogen, P36931). Cells were imaged using a LSM700 confocal microscope (63x/1.4 NA oil immersion objective; ZEISS).

Electron microscopy
Cells were seeded upon plastic Thermanox coverslips in 24-well plates. Following infection, plates containing infected cells (see virus infections methods) were submerged in 2% PFA / 2.5% glutaraldehyde / 0.1 M cacodylate buffer, pH 7.4. Cells were washed with 0.1 M cacodylate buffer before being stained using 1% osmium tetroxide:1.5% potassium ferrocyanide for 1 h, and staining was further enhanced with UA-Zero (Agar Scientific) for 30 min. Cells were washed, dehydrated and infiltrated with Epoxy propane (CY212 Epoxy resin:propylene oxide) before being embedded in Epoxy resin. Epoxy was polymerised at 65°C overnight before Thermanox coverslips were removed using a heat-block. Seventy nm sections were cut using a Diatome diamond knife mounted to an ultramicrotome. Ultrathin sections were stained with lead citrate. An FEI Tecnai transmission electron microscope at an operating voltage of 80 kV was used to visualise samples.

Semi-permeabilised (SP) cell preparation
HeLa cells (human epithelial cervix carcinoma cells, mycoplasma-free), as described 53,54,126 , were provided by Martin Lowe (University of Manchester) and were cultured in DMEM supplemented with 10% (v/v) FBS (Gibco, 10500-064) and maintained in a 5% CO2 humidified incubator at 37°C. Cells were seeded at 1 x 10^6 per 10 cm^2 dish and, once ~80% confluent, cells were semi-permeabilised using digitonin (Calbiochem) and endogenous mRNA was removed by treatment with 0.2 U Nuclease S7 Micrococcal nuclease, from Staphylococcus aureus (Sigma-Aldrich, 10107921001) and 1 mM CaCl2 as described 53,54 . After quenching by the addition of EGTA to 4 mM final concentration, SP cells were resuspended in an appropriate volume of KHM buffer (110 mM KOAc, 2 mM Mg(OAc)2, 20 mM HEPES-KOH pH 7.2) to give a suspension of 3 x 10^6 SP cells/mL as determined by trypan blue staining (Sigma-Aldrich, T8154). Freshly prepared SP cells were then included in translation master mixes such that each translation reaction contained 2 x 10^5 cells/mL.

In vitro ER import assays
Translation and membrane insertion assays, supplemented with nuclease-treated canine pancreatic microsomes (from stock with OD280 = 44/mL) or SP HeLa cells, were performed in nuclease-treated rabbit reticulocyte lysate (Promega) as described 53,54 . Briefly, in the presence of EasyTag EXPRESS 35

Recovery and analysis of radiolabelled products synthesised in ER import assays
Following puromycin treatment, microsomal or SP cell membrane associated fractions were recovered by centrifugation through an 80 μL high-salt cushion [0.75 M sucrose, 0.5 M KOAc, 5 mM Mg(OAc)2 and 50 mM HEPES-KOH, pH 7.9] at 100,000 x g for 10 min at 4°C. Then, the pellet was suspended in SDS sample buffer and, where indicated, samples were treated with 1000 U of endoglycosidase Hf (New England Biolabs, P0703S). To confirm association of ORF3c with the ER membrane, microsomal membrane-associated fractions were resuspended in KHM buffer (20 μL) and subjected to a protease protection assay using trypsin (1 μg/mL) with and without 0.1% Triton X-100 or sodium carbonate extraction (0.1 M Na2CO3, pH 11.3) as described 53 , prior to suspension in SDS sample buffer. All samples were solubilised for 12 h at 37°C prior to resolution by SDS-PAGE (16% PAGE, 120 V, 120 min). Gels were fixed for 5 min (20% MeOH, 10% AcOH), dried for 2 h at 65°C, and radiolabelled products were visualised using a Typhoon FLA-700 (GE Healthcare) following exposure to a phosphorimaging plate for 24-72 h.

Mitochondria isolation
HEK293T cells were harvested in PBS, washed, and resuspended in 1X THE buffer (300 mM trehalose, 10 mM KCl, 10 mM HEPES, 1 mM EDTA, and 2 mM PMSF). Homogenization was performed in a PTFE pestle/glass Potter-Elvehjem at 700 rpm. The resulting cell lysate was centrifuged at 400 x g for 10 min at 4°C, followed by a second centrifugation step at 800 x g for 10 min at 4°C to remove unbroken cells. To sediment mitochondria, the supernatant was centrifuged at 10,000 x g for 10 min at 4°C. The mitochondrial sediment was washed with THE buffer and resuspended in the same buffer. The concentration of mitochondria was determined using a standard Bradford assay.

In vitro import of [35S]ORF3c-5M into HEK293T isolated mitochondria
The ORF3c gene was amplified by PCR, adding five methionine codons to the C terminus (ORF3c-5M). The corresponding mRNA was prepared using the SP6 mMESSAGE mMACHINE kit (Invitrogen) according to the manufacturer's specifications. Radiolabelled

Proteinase K accessibility assay
After [35S]ORF3c-5M import, mitochondria were resuspended in TBS (intact mitochondria) or TBS + 1% Triton X-100 (lysed mitochondria), followed by digestion with proteinase K (PK) (10 µg/mL final concentration) for 10 min on ice. Next, PK was inhibited with 2 mM PMSF. Intact mitochondria were then washed. Proteins were precipitated with trichloroacetic acid (TCA). Samples were analysed by Tris/tricine SDS-PAGE. Proteins were transferred onto PVDF membranes. The [35S]ORF3c-5M protein signal was detected by digital autoradiography and mitochondrial proteins were immunodetected.

Chemical extraction of mitochondrial membranes
After [35S]ORF3c-5M import, mitochondria were incubated in sodium carbonate buffer (pH 11.5) or in 1% Triton X-100 at 1 mg/mL final concentration for 20 min on ice. Insoluble material was sedimented at 100,000 x g for 1 h at 4°C. Pellet and soluble fractions were collected, and the protein was precipitated with TCA. Proteins were resolved on urea SDS-PAGE, transferred to PVDF membranes, and analysed for the [35S]ORF3c-5M protein by digital autoradiography and mitochondrial proteins by immunodetection.

Immuno-isolation of mitochondria after ORF3c import
After import of [35S]ORF3c-5M, mitochondria were re-isolated with an anti-TOM22 mitochondria isolation kit (Miltenyi Biotec). In brief, resuspended mitochondria were incubated with anti-TOM22 Microbeads for 1 h with end-to-end mixing at 4°C. Beads were then collected through a MACS column placed in a magnetic MACS separator. The column was then washed to remove non-specific binding. After removing the column from the magnetic MACS separator, the beads were recovered and treated with TCA. Samples were separated on Tris/tricine SDS-PAGE, transferred onto PVDF membranes, and analysed by autoradiography to detect [35S]ORF3c-5M protein and immunodetection for the required proteins.

SILAC immunoprecipitation experiment
Vero cells were grown in high glucose DMEM lacking arginine and lysine (Life Technologies), with 10% dialysed (7 kDa MWCO) FBS and supplemented with light (R0K0), medium (R6K4) or heavy (R10K8) stable isotope labelled arginine and lysine. Cells were maintained in labelled medium for 6 passages before transfection and immunoprecipitation. Labels were switched for the different replicates to control for any impact of the different culture media on cell growth/gene expression as follows: replicate 1 (L: HA-3c, M: Control), replicate 2 (M: HA-3c, H: Control) and replicate 3 (H: HA-3c, L: Control).
Vero cells were transfected with ORF3c-HA as described above. At 24 h post-transfection, monolayers were washed in ice-cold PBS and co-immunoprecipitation of the HA-tagged protein and interacting proteins was performed using a commercially available kit (HA Tag Magnetic IP/Co-IP kit, Pierce). An input sample (10%) was retained for analysis by immunoblot prior to binding the lysates to the magnetic anti-HA beads. After removing the remains of the wash buffer (third and final wash), samples were heated to 95°C for 5 min in buffer consisting of 200 mM HEPES pH 8, 1% SDS and 1% NP40. Beads were collected by centrifugation and the supernatant was retained. Equal volumes of the control and HA-3c samples for each replicate were then combined and the samples were reduced, alkylated and trypsin-digested using the SP3 method 127 .
LC-MS/MS analysis was conducted on a Dionex 3000 coupled in line to a Q-Exactive-HF mass spectrometer. Digests were loaded onto a trap column (Acclaim PepMap 100, 2 cm × 75 µm inner diameter, C18, 3 µm, 100 Å) at 5 µL per min in 0.1% (v/v) TFA and 2% (v/v) acetonitrile. After 3 min, the trap column was set in line with an analytical column (Easy-Spray PepMap® RSLC 15 cm × 50 µm inner diameter, C18, 2 µm, 100 Å) (Dionex). Peptides were loaded in 0.1% (v/v) formic acid and eluted with a linear gradient of 3.8-50% buffer B (HPLC grade acetonitrile 80% (v/v) with 0.1% (v/v) formic acid) over 95 min at 300 nL per min, followed by a washing step (5 min at 99% solvent B) and an equilibration step (25 min at 3.8% solvent). The Q-Exactive-HF was operated in data-dependent mode with survey scans acquired at a resolution of 60,000 at 200 m/z over a scan range of 350-2000 m/z. The top 16 most abundant ions with charge states +2 to +5 from the survey scan were selected for MS2 analysis at 60,000 m/z resolution with an isolation window of 0.7 m/z, with a (N)CE of 30%. The maximum injection times were 100 and 90 ms for MS1 and MS2, respectively, and AGC targets were 3e6 and 1e5, respectively. Dynamic exclusion (20 s) was enabled.
Data analysis was conducted in MaxQuant 1.6.7.0 128 . Options were set at default unless specified. Multiplicity was set at 3, with Arg6 and Lys4 set as 'medium' labels and Arg10 and Lys8 set as 'heavy' labels. Digestion was Trypsin/P, permitting up to two missed cleavages. Oxidation (M) and N-terminal protein acetylation were selected as variable modifications, and carbamidomethylation as a fixed modification. Under instrument settings, intensity was set to 'total sum'. FASTA files containing the Uniprot African Green Monkey proteome (Chlorocebus sabaeus, 19,223 entries, downloaded 16 May 2020) and a custom .fasta file containing HA-3c protein sequences were used for the search databases. The proteomics data generated in this study have been deposited in the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE repository (PXD037765).
Downstream data analysis was conducted in Matlab R2019a from the MaxQuant proteinGroups output file. Reverse database hits and common contaminants (MaxQuant contaminant list) were removed. Within each replicate, data were first normalised for equal protein loading based on the median SILAC intensity of shared proteins. Rows with >3 missing values were removed. Missing data were imputed (knn), and data were converted to HA-3c/HA-only ratios for each replicate, and then re-normalised across the three replicates by dividing by column median. A one-sample t-test was used to determine statistical significance. Multiple hypothesis testing was controlled using the approach of Storey, 2002 129 , with all reported hits meeting an FDR < 0.05. The Matlab scripts used for all data processing and figure generation with this data are available from the Emmott lab Github page at: https://github.com/emmottlab/sars2_3c/.
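The final steps of this workflow (column-median re-normalisation, one-sample t-test, multiple-testing control) can be sketched as follows. This is not the authors' Matlab code, which is available at the repository above; it is a simplified Python illustration under several stated assumptions: ratios are log2-transformed before testing (not specified in the text), missing-value filtering and knn imputation are omitted, the Benjamini-Hochberg adjustment is swapped in as a conservative stand-in for Storey's q-value (equivalent to fixing the null proportion at 1), and the protein names and ratios are hypothetical. With exactly three replicates the t-distribution has two degrees of freedom, for which the two-sided p-value has the closed form 1 - |t|/sqrt(t^2 + 2).

```python
import math
from statistics import mean, median, stdev


def renormalise_by_column_median(ratios):
    """Divide each replicate (column) by its median ratio."""
    names = list(ratios)
    n_reps = len(ratios[names[0]])
    medians = [median([ratios[n][j] for n in names]) for j in range(n_reps)]
    return {n: [ratios[n][j] / medians[j] for j in range(n_reps)] for n in names}


def one_sample_t_p(values, mu=0.0):
    """Two-sided one-sample t-test p-value for exactly three values (df = 2)."""
    if len(values) != 3:
        raise ValueError("closed form below is only valid for n = 3 (df = 2)")
    sd = stdev(values)
    if sd == 0:
        raise ValueError("zero variance: t statistic undefined")
    t = (mean(values) - mu) / (sd / math.sqrt(3))
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (stand-in for Storey q-values)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running = min(running, pvals[i] * n / rank)
        adjusted[i] = running
    return adjusted


# Hypothetical bait/control ratios for three label-swapped replicates.
ratios = {
    "protein_A": [8.0, 9.0, 7.5],
    "protein_B": [1.0, 1.1, 0.9],
    "protein_C": [0.9, 1.0, 1.2],
    "protein_D": [1.1, 0.95, 1.0],
    "protein_E": [1.0, 1.05, 0.98],
}

normalised = renormalise_by_column_median(ratios)
log2_ratios = {n: [math.log2(v) for v in vals] for n, vals in normalised.items()}
names = list(log2_ratios)
pvals = [one_sample_t_p(log2_ratios[n]) for n in names]
adjusted = dict(zip(names, bh_adjust(pvals)))
hits = [n for n in names if adjusted[n] < 0.05]
print(hits)
```

Testing against zero on the log2 scale asks whether a protein is consistently enriched or depleted relative to the control across the label-swapped replicates, which is why the label swap matters: a medium-specific artefact would change sign between replicates rather than persist.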

Virus infections
To prepare lysates for immunoblots, A549 +ACE2 +TMPRSS2 cells were infected with SARS-CoV-2 (Wuhan) [GenBankAcc: MW041156.1] or delta variant [GISAID accession number: EPI_ISL_1731019, kindly provided by Prof. Wendy Barclay, Imperial College London] at an MOI of 2, or mock infected. At 24 h post-infection, cells were lysed directly with 2x SDS sample buffer and heated to 95°C for 10 min. To prepare samples for electron microscopy, A549 +ACE2 +TMPRSS2 cells were grown on 13-mm Thermanox (plastic) coverslips (Nunc) in the wells of a 24-well plate. Cells were infected at an MOI of 1 and placed on a rocker at room temperature for 2 h. The inocula were removed and replaced with full medium and cells were returned to incubators for 24 h, at which point the plate was prepared for EM (see Electron microscopy methods).

Identification of sarbecovirus ORF3c variants
To identify sequences with coverage of the ORF3a region, NCBI online tblastn 130 was used on 11 Oct 2022, using the SARS-CoV-2 ORF3a protein (GenBank: YP_009724391.1) as query and the NCBI nr/nt database as subject, with organism taxonomy limited to Coronaviridae (taxid:11118) excluding SARS-CoV-2 (taxid:2697049), word size = 2, max target sequences = 1000, no low-complexity masking, and other parameters as defaults.
The highest e-value of the 417 returned sequences was 8 x 10^−4. The coordinates of tblastn matches on the subject sequences were extended to maximal stop-codon-to-stop-codon open reading frames, which were extracted, translated, and aligned as amino acid sequences with MUSCLE v3.8.31 131 . The amino acid alignment was used to guide alignment of the nucleotide sequences (EMBOSS:tranalign 132 ) and the nucleotide sequences were 5′-truncated to the conserved ORF3a AUG initiation site. Sequences with defective or truncated ORF3a due to incomplete sequencing, frame-disrupting insertion/deletion errors, or other inconsistencies were discarded. An ORF homologous to SARS-CoV-2 ORF3c was not identified in the following sequences: MZ293757 (Hipposideros bat coronavirus, unclassified Coronaviridae member), and HQ166910 and NC_025217/KF636752 (subgenus Hibecovirus) and these sequences were therefore discarded. All remaining sequences had ≥ 62.6% amino acid identity to SARS-CoV-2 in ORF3a and may be regarded as members of subgenus Sarbecovirus, whereas the four discarded highly divergent non-sarbecovirus sequences had ≤ 27.1% amino acid identity to SARS-CoV-2 in ORF3a.
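The extension of a tblastn match to a maximal stop-codon-to-stop-codon open reading frame can be sketched as below. This is an illustrative reimplementation rather than the code used in the study: it assumes a forward-strand match given as 0-based, end-exclusive coordinates that are already in frame, uses the standard stop codons, and runs on a toy sequence. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_to_stop_bounded_orf(seq, start, end):
    """Extend an in-frame match [start, end) to the flanking stop codons.

    Returns (orf_start, orf_end): the maximal stop-free region in the same
    reading frame, excluding the upstream and downstream stop codons. If no
    stop is found, extension halts at the last complete codon in the sequence.
    """
    if (end - start) % 3 != 0:
        raise ValueError("match length must be a multiple of 3")
    seq = seq.upper()

    # Walk upstream one codon at a time until a stop codon or the sequence edge.
    orf_start = start
    while orf_start - 3 >= 0 and seq[orf_start - 3:orf_start] not in STOP_CODONS:
        orf_start -= 3

    # Walk downstream one codon at a time until a stop codon or the sequence edge.
    orf_end = end
    while orf_end + 3 <= len(seq) and seq[orf_end:orf_end + 3] not in STOP_CODONS:
        orf_end += 3

    return orf_start, orf_end


# Toy sequence: CC TAA GCT ATG AAA CCC GGG TGA TT
toy = "CCTAAGCTATGAAACCCGGGTGATT"
# Suppose the match covered only the codons AAA CCC (positions 11-17).
orf_start, orf_end = extend_to_stop_bounded_orf(toy, 11, 17)
print(orf_start, orf_end, toy[orf_start:orf_end])
```

The extracted region can then be trimmed at its 5′ end to the conserved AUG, mirroring the truncation to the ORF3a initiation site described above.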
For the remaining 394 sequences, the ORF3c amino acid sequences were determined based on the conserved initiation site. In one (MG772933) ORF3c began with a GUG instead of an AUG codon. ORF3c was truncated in eight sequences. Five of these (EU371560, EU371561, EU371562, EU371563 and EU371564; ORF3c truncated to MLLLQVLFMLQQ) form a discrete subclade of SARS-CoV-1 viruses, but the sequences lack metadata and their provenance is unclear; nonetheless their ORF3a proteins have 99.3% amino acid identity to other SARS-CoV-1 isolates that have intact ORF3c. Another one (FJ882963; ORF3c truncated to MLLLQVLFML) also has an ORF3a protein with 99.3% amino acid identity to other SARS-CoV-1 isolates with intact ORF3c. Although these six SARS-CoV-1 sequences were found to be ORF3c-defective, 184 other SARS-CoV-1 sequences had an ORF3c amino acid sequence that was identical to that of the SARS-CoV-1 reference sequence. In contrast, the remaining two sequences with truncated ORF3c (shown in Figure S1B) represent distinct sarbecovirus lineages.
ORF3c sequences with 100% amino acid identity to ORF3c of either the SARS-CoV-1 or SARS-CoV-2 NCBI reference sequences (NC_004718 and NC_045512, respectively) were discarded. Furthermore, four additional sequences (KF514407, AY463059, AY463060 and HG994853) with ≥ 99% amino acid identity to NC_004718 or NC_045512 in ORF3a were also discarded. In general, sequences with <99% identity in ORF3a were associated with non-human hosts and, where they had a different ORF3c amino acid sequence from NC_004718 and NC_045512, were retained. Lab mutant recombinant sequences MT782114 and MT782115 were also removed. The remaining 191 ORF3c amino acid sequences represented 54 unique sequences (Figure S1A). ORF3c sequences were clustered with BLASTCLUST with a 90% identity threshold (-p T -L 0.95 -b T -S 90), resulting in seven clusters, and a representative sequence (SARS-CoV-1, SARS-CoV-2, or the most abundant unique sequence in the cluster) was chosen for Figure 1A.

Analysis of SARS-CoV-2 ORF3c sequences from the GISAID database
The 13,427,526 available SARS-CoV-2 genome sequences were downloaded from epicov.org 77 . To identify and extract ORF3c sequences, we searched for exact matches to any of seven 18-nt seed sequence queries spanning the region from the start of the ORF3a sgmRNA transcription regulatory sequence (TRS) to the nucleotide 5′-adjacent to the ORF3c AUG initiation codon, where each seed sequence had a 9-nt overlap with the previous seed sequence (viz. ACGAACTTATGGATTTGT, ..., AGCAAGGTGAAATCAAGG). We then extracted the 200 nt downstream from the 3′-most matched seed, removed sequences with any "N"s (i.e. ambiguous nucleotides) in this region, trimmed the 5′ ends to the start of ORF3c, and truncated the 3′ end at 126 nt downstream to encompass the 41 sense codons and the stop codon of ORF3c, leaving 12,906,225 sequences covering the ORF3c region.
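The seed-based extraction can be sketched in Python as follows. This is an illustrative reconstruction, not the code used for the analysis: the function name and parameters are hypothetical, the seeds are assumed to be supplied in 5′-to-3′ order and to tile the region at a 9-nt step (so that the last seed ends immediately before the ORF3c AUG), and each seed is assumed to occur at most once per genome. Only the first and last seeds are given in the text, so the full seed list must be supplied by the user.

```python
def extract_orf3c_region(genome, seeds, step=9, window=200, orf_len=126):
    """Return the 126-nt ORF3c region of a genome, or None if unusable.

    `seeds` is an ordered (5' to 3') list of 18-nt seed sequences tiled
    every `step` nt, the last one ending just before the ORF3c AUG.
    All parameter names and defaults are illustrative assumptions.
    """
    genome = genome.upper()
    n = len(seeds)
    # Try the 3'-most seed first, then fall back to seeds further upstream.
    for i in range(n - 1, -1, -1):
        pos = genome.find(seeds[i])
        if pos == -1:
            continue
        seed_end = pos + len(seeds[i])
        downstream = genome[seed_end:seed_end + window]
        # Discard sequences with ambiguous nucleotides in the window.
        if "N" in downstream:
            return None
        # A seed that lies (n - 1 - i) tiling steps upstream of the last
        # seed ends step * (n - 1 - i) nt before the ORF3c AUG.
        skip = step * (n - 1 - i)
        region = downstream[skip:skip + orf_len]
        return region if len(region) == orf_len else None
    return None
```

With seven 18-nt seeds at a 9-nt step the seeds cover 72 nt, and the 126-nt output corresponds to 42 codons (41 sense codons plus the stop codon).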
These were translated and the number of occurrences of each unique 42-mer (amino acids + stop codons) was enumerated. Any 42-mer with fewer than 12,906 occurrences (i.e. 0.1% of the total) was discarded, leaving 11 unique 42-mers. Five of these were variants of the PTC mutant, in which the different 42-mers give rise to a single translatable peptide, MLLL, and therefore their occurrence counts (4,110,409, 21,375, 17,287, 15,798, and 14,876) were summed. The resulting seven variants (the pooled PTC mutant plus the six other 42-mers) are shown in Figure S6B. Collection date and country metadata were extracted from the sequence records and used to produce Figures 7D and 7E.

i) a protease protection assay using trypsin in the presence or absence of Triton X-100 (TX-100), ii) alkaline sodium-carbonate extraction, or iii) endoglycosidase H (Endo H) treatment. Resulting products were analysed by SDS-PAGE and phosphorimaging. (B) Schematics of parent ORF3c and its N-terminally (OPG2-ORF3c) and C-terminally (ORF3c-OPG2) OPG2-tagged variants. Comprising residues 1-18 of bovine rhodopsin (Uniprot: P02699), the OPG2 tag is indicated in blue and its N-glycan sites in pink. The putative ORF3c transmembrane domain is indicated in yellow. (C) Membrane-associated products of ORF3c synthesised in the presence of ER-derived microsomes were treated with trypsin (lanes 2-3) or sodium carbonate (lanes 5-6) as outlined in Ai-Aii. (D) Using ER-derived microsomes (lanes 1-6) or SP cells (lanes 7-12), parent and OPG2-tagged variants of ORF3c were synthesised as outlined in Aiii. N-glycosylated (2Gly) and non-glycosylated (0Gly) species were confirmed using Endo H. Note that, whilst on the same gel, the signals in lanes 5-6 and 11-12 of panel (D) have been overexposed compared to lanes 1-4 and 7-10 (demarcated by dotted lines) in order to enhance the visibility of the radiolabelled products. RRL, rabbit reticulocyte lysate.
(E) In vitro import and proteinase K (PK) accessibility assay of isolated HEK293T mitochondria upon [35S]ORF3c-5M and COX4-1 import. p, precursor protein; m, mature protein.
(G) Chemical protein extraction following [35S]ORF3c-5M protein import into mitochondria isolated from HEK293T cells. Mitochondria were treated with sodium carbonate (Na2CO3, pH 11.5) or with TBS 1% Triton X-100. T, total; P, pellet; and S, soluble.

(A) HEK293T cells were co-transfected with plasmids encoding IFN-β, ISRE, NF-κB, ISG56.1 or AP-1-driven FLuc, together with RLuc and ORF3c-HA plasmids. On the following day, cells were stimulated with SeV (IFN-β and ISG56.1), 1,000 units/mL IFN-α (ISRE), 20 ng/mL TNF-α (NF-κB) or 20 ng/mL PMA (AP-1). SeV infection and PMA stimulation were maintained overnight whereas cytokine treatments were maintained for 6 h. The same stimulation conditions were used in all experiments unless otherwise stated. Cell lysates were collected after stimulation to measure luciferase levels and protein expression. FLuc activity was normalised to RLuc and the fold induction is shown relative to unstimulated controls. Below the figures are representative immunoblots of indicated protein expression. (B) The same reporter gene assays as described in (A) were performed with ORF3c tagged with HA at either the N or C terminus, or untagged ORF3c. (C) After overnight induction with tetracycline, ORF3c-inducible cells were infected with SeV for 6 h, and then collected to extract cellular mRNA. Total mRNA was reverse transcribed into cDNA and the indicated genes were quantified by qPCR. mRNA levels of target genes were normalised to the GAPDH mRNA level, and fold induction was calculated relative to the unstimulated control. Statistical analysis (unpaired two-sample t-test): ns = not significant, *P

(D) A549 +ACE2 +TMPRSS2 cells were infected with SARS-CoV-2 (Wuhan) or delta variant (EPI_ISL_1731019) at MOI 2. At 24 h post infection, infected and mock-infected cells were collected and lysed. The cell lysates were analysed by immunoblotting using the indicated antibodies. MAVS immunoblots are shown at high and low intensity for clarity. The intensity of full-length MAVS (~72 kDa) from 3 technical repeats was normalised to GAPDH, and then further normalised relative to mock infection and is shown as a percentage (upper bar graph). The ratio of cleaved (~57 kDa) / full-length (~72 kDa) MAVS was calculated and is shown as a percentage (lower bar graph). Quantification of band intensity was performed with the LI-COR imaging system. Statistical analysis was based on three technical repeats of the same cell lysates (* P < 0.05).

(C) HEK293T cells were transfected with plasmids encoding RLuc, IFN-β driven FLuc, and two doses of either WT ORF3c-HA or a Q5Y mutant. The cells were stimulated, and cell lysates were collected and analysed as described in Figure 4A. (D) SARS-CoV-2 sequences present in the GISAID database that had specific-day collection dates specified were analysed. Data for all ORF3c variants present in 0.1% or more of the 12,906,225 ORF3c sequences analysed are shown. The two graphs show the same data, but on linear (upper panel) and log (lower panel) scales. The PTC mutant (red curve) corresponds to the SARS-CoV-2 delta variant. (E) Pie charts showing the geographic distribution of sequences obtained with the WT ORF3c 5th codon (CAG), PTC mutant 5th codon (UAG) and Q5Y "pseudorevertant" 5th codon (UAU).

HEK293T cells were co-transfected with plasmids encoding IFN-β or AP-1-driven FLuc, as well as RLuc and an increasing dose of the ORF3c-HA plasmid. The cells were stimulated, and cell lysates were collected and analysed as described in Figure 4A. (B) IFN-β and (C) AP-1 reporter gene assays as described above were performed with FLAG-tagged ORF3c, ORF3a, NSP6, NSP9, NSP12 or empty vector (EV). Statistical analysis (unpaired two-sample t-test): ns = not significant, *P

Figure 7D for variant abundances over time.

SUPPLEMENTARY TABLES
Supplemental Table 1 MaxQuant proteinGroups output file for all quantified proteins from the Vero cell SILAC ORF3c-HA immunoprecipitation experiment after removing reverse database hits and contaminants.
Supplemental Table 2 Curated MaxQuant proteinGroups output file for the six quantified proteins that met the criteria of >0.5 log2 (i.e. 1.41-fold) enrichment over the HA-only control and p-value < 0.05 in the Vero cell ORF3c-HA SILAC immunoprecipitation experiment.