A Master Autoantigen-ome Links Alternative Splicing, Female Predilection, and COVID-19 to Autoimmune Diseases

Chronic and debilitating autoimmune sequelae pose a grave concern for the post-COVID-19 pandemic era. Based on our discovery that the glycosaminoglycan dermatan sulfate (DS) displays peculiar affinity to apoptotic cells and autoantigens (autoAgs) and that DS-autoAg complexes cooperatively stimulate autoreactive B1 cell responses, we compiled a database of 751 candidate autoAgs from six human cell types. At least 657 of these have been found to be affected by SARS-CoV-2 infection based on currently available multi-omic COVID data, and at least 400 are confirmed targets of autoantibodies in a wide array of autoimmune diseases and cancer. The autoantigen-ome is significantly associated with various processes in viral infections, such as translation, protein processing, and vesicle transport. Interestingly, the coding genes of autoAgs predominantly contain multiple exons with many possible alternative splicing variants, short transcripts, and short UTR lengths. These observations and the finding that numerous autoAgs involved in RNA-splicing showed altered expression in viral infections suggest that viruses exploit alternative splicing to reprogram host cell machinery to ensure viral replication and survival. While each cell type gives rise to a unique pool of autoAgs, 39 common autoAgs associated with cell stress and apoptosis were identified from all six cell types, with several being known markers of systemic autoimmune diseases. In particular, the common autoAg UBA1 that catalyzes the first step in ubiquitination is encoded by an X-chromosome escape gene. Given its essential function in apoptotic cell clearance and that X-inactivation escape tends to increase with aging, UBA1 dysfunction can therefore predispose aging women to autoimmune disorders. In summary, we propose a model of how viral infections lead to extensive molecular alterations and host cell death, autoimmune responses facilitated by autoAg-DS complexes, and ultimately autoimmune diseases. 
Overall, this master autoantigen-ome provides a molecular guide for investigating the myriad autoimmune sequelae of COVID-19 and clues to the rare but reported adverse effects of the currently available COVID vaccines.


Introduction
Autoimmune disorders are an important feature of the disease manifestations of COVID-19 and long-COVID syndromes. Based on the insights we gained from numerous COVID-related autoantigens (autoAgs) and their associated cellular processes and pathways [1][2][3][4][5], we propose a model to explain how viral infections in general and SARS-CoV-2 in particular can lead to a wide array of autoimmune diseases (Figure 1). We illustrate how viral infections lead to extensive molecular alterations in the host cell, host cell death and tissue injury, autoimmune reactions, and the eventual development of autoimmune diseases.
During infections, opportunistic viruses have to hijack the host cell machinery in order to transcribe and translate the viral genes, synthesize viral proteins with correct polypeptide folding and post-translational modifications, and assemble viral particles. At the same time, viruses have to manipulate the host's immune defense to avoid elimination. This intricate host-virus symbiosis is accomplished by extensive alterations of host molecules and reprogramming of host molecular networks. The infected host cells undergo extreme stress and ultimately die, which releases altered molecules (i.e., potential autoAgs) that the immune system may recognize as non-self. In response, the host also synthesizes a cascade of molecules such as dermatan sulfate (DS) to facilitate wound healing and dead cell clearance.
We have discovered previously that DS possesses peculiar affinity for apoptotic cells and their released autoAgs [6][7][8][9]. DS, a major component of the extracellular matrix and connective tissue, is increasingly expressed during tissue injury and accumulates in wound areas [1,10]. Because of their affinity, DS and autoAgs form macromolecular complexes which cooperatively activate autoreactive B1 cells. AutoAg-DS complexes may activate B1 cells via a dual binding mode, i.e., with autoAg binding to the variable region of the B1 cell's autoBCR and DS binding to the heavy chain of the autoBCR. Upon entering B1 cells, DS may regulate immunoglobulin (Ig) production by engaging the Ig-processing complex in the endoplasmic reticulum and the transcription factor GTF2I necessary for Ig gene expression [8,9]. AutoAg-DS affinity therefore defines a unifying biochemical and immunological property of autoAgs: any self-molecule possessing DS-affinity has a high propensity to become autoantigenic, and this has led to the identification of numerous autoAgs [7,11,12,13].
To gain a better understanding of autoimmune sequelae due to COVID-19, we present a master autoantigen atlas of over 750 potential autoAgs identified from six human cell types [1,2,4,5,7,11]. These autoAgs show significant correlation with pathways and processes that are crucial in viral infection and mRNA vaccine action, reveal common autoAgs associated with apoptosis and cell stress which may serve as markers for systemic autoimmune diseases, and provide a detailed molecular map for understanding and investigating diverse autoimmune sequelae of COVID-19 and potential rare side effects of viral vector- and mRNA-based vaccines. For the first time, we reveal intriguing features of autoAgs and their coding genes. Furthermore, we discuss how UBA1 (or UBE1, ubiquitin-like modifier-activating enzyme 1), an autoAg found overexpressed in SARS-CoV-2 infection, may predispose aging females to autoimmune disorders.
[...] or less accessible to the spliceosome and suppressing RNA splicing at a particular exon. 19 hnRNP proteins are identified by DS-affinity, with 17 found affected by SARS-CoV-2 infection.
The large number of autoAgs of the RNA splicing machinery and their involvement in SARS-CoV-2 infection provide support to the notion that viral infections exploit alternative splicing. It is logical to speculate that viruses hijack the splicing machinery to force the host to synthesize virus-beneficial protein isoforms and thereby reprogram the host cellular protein network so that the virus can survive and replicate. It is also plausible that protein isoforms from virus-induced alternative splicing are recognizable by our immune system as unusual and non-self and hence may trigger an (auto)immune response.
Various studies have reported alternative splicing among autoAgs. For example, an informatics analysis of 45 autoAgs showed that alternative splicing occurred in 100% of the transcripts, which was significantly higher than the ~42% rate observed in a randomly selected set of 9,554 gene transcripts. Furthermore, 80% of the transcripts underwent non-canonical alternative splicing, which was significantly higher than the <1% rate in randomly selected human gene transcripts [38]. As another example, Ro52/SSA is one of the autoAg targets strongly associated with the autoimmune responses in mothers whose children have manifestations of neonatal lupus. The gene for full-length Ro52 spans 10 kb of DNA and contains 7 exons, and an alternatively spliced transcript encoding a novel autoAg expressed in the fetal and adult heart has been identified [39]. In a patient with primary Sjögren syndrome, an alternative mRNA variant of the nuclear autoAg La/SSB was found to result from a promoter switch and alternative splicing [40].
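To give a sense of how far the splicing rates cited above depart from the background rates, the following back-of-envelope sketch computes binomial tail probabilities. It is not the statistical test used in the cited analysis [38], which is not described here. It assumes that the 45 autoAgs behave as independent draws, that "80%" corresponds to 36 of 45 autoAgs, and that the background rates are exactly 0.42 and 0.01 (the text gives "~42%" and "<1%"). The function name binomial_tail is introduced only for illustration.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_AUTOAGS = 45  # autoAgs in the cited informatics analysis

# Assumption: background alternative splicing rate of exactly 0.42.
# Chance that all 45 of 45 are alternatively spliced under that rate.
p_all_spliced = binomial_tail(N_AUTOAGS, 45, 0.42)

# Assumption: 80% of 45 = 36 autoAgs, background non-canonical rate of 0.01.
# Chance that at least 36 of 45 show non-canonical splicing under that rate.
p_noncanonical = binomial_tail(N_AUTOAGS, 36, 0.01)

print(f"P(all 45 spliced | rate 0.42)     = {p_all_spliced:.2e}")
print(f"P(>=36 non-canonical | rate 0.01) = {p_noncanonical:.2e}")
```

Under these simplifying assumptions both probabilities are vanishingly small, which is consistent with the description of the differences as significant.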

Common autoAgs associated with cell stress and apoptosis
We have consistently found that DS binds apoptotic cells regardless of cell type [6,8]. To determine which molecules are involved in this affinity, we searched for DS-affinity proteins shared by all 6 human cell lines of this study and found 39 autoAg candidates (Fig. 9). These include 9 ER chaperone complex proteins, 5 14-3-3 proteins, 3 hnRNPs, and 3 tropomyosin proteins. All are known autoAgs except for ANP32A and YWHAB (14-3-3 alpha/beta). Given that ANP32A's paralog ANP32B and 5 other 14-3-3 isoforms are known autoAgs, it is likely that these two are also true autoAgs. Remarkably, several classical ANA (antinuclear antibody) autoAgs that define systemic autoimmune diseases are among the autoAgs found in the DS-affinity proteomes of all 6 human cell lines, including histones H1 and H4, SSB (lupus La), XRCC5/Ku80, XRCC6/Ku70, and PCNA. Because these autoAgs are commonly found in apoptotic cells, it is not surprising that autoimmune responses targeting these autoAgs tend to be systemic; in other words, they are all potential markers of systemic autoimmune diseases.
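The search for proteins shared by all cell lines amounts to a set intersection, which can be sketched as follows. The cell type labels and the membership of each toy set are hypothetical and serve only to illustrate the operation; they are not the DS-affinity proteomes of this study, and the helper name common_proteins is likewise introduced for illustration.

```python
# Hypothetical toy data: gene symbols are taken from the text, but the
# assignment of proteins to cell types is made up for illustration.
ds_affinity_proteomes = {
    "cell_type_1": {"UBA1", "PCNA", "XRCC5", "XRCC6", "SSB", "CALU", "ESYT1"},
    "cell_type_2": {"UBA1", "PCNA", "XRCC5", "XRCC6", "SSB", "MOV10"},
    "cell_type_3": {"UBA1", "PCNA", "XRCC5", "XRCC6", "SSB", "MARCKS", "CALU"},
    "cell_type_4": {"UBA1", "PCNA", "XRCC5", "XRCC6", "SSB", "USP9X"},
    "cell_type_5": {"UBA1", "PCNA", "XRCC5", "XRCC6", "SSB", "USP5", "MOV10"},
    "cell_type_6": {"UBA1", "PCNA", "XRCC5", "XRCC6", "SSB", "ESYT1", "MARCKS"},
}


def common_proteins(proteomes: dict[str, set[str]]) -> set[str]:
    """Return the proteins present in every proteome (empty set if none given)."""
    sets = list(proteomes.values())
    if not sets:
        return set()
    return set.intersection(*sets)


common = common_proteins(ds_affinity_proteomes)
print(sorted(common))  # ['PCNA', 'SSB', 'UBA1', 'XRCC5', 'XRCC6']
```

Applied to the actual six proteomes, the same operation would yield the 39 common autoAg candidates shown in Fig. 9.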
UBA1, X-inactivation escape, and female predilection of autoimmunity
Among the above common autoAgs, UBA1 (or UBE1, ubiquitin-like modifier-activating enzyme 1) plays an essential role in dead cell clearance. UBA1 catalyzes the first step in ubiquitination, the "kiss of death" that marks cellular proteins for degradation. It has long been speculated that dysregulation of apoptotic pathways and dysfunctional clearance of dead cells are among the main causes of autoimmunity, which is in line with our findings [6,8]. Apoptosis also directly contributes to the maintenance of lymphocyte homeostasis and the deletion of autoreactive cells. Therefore, dysfunction of UBA1 could result in deficient clearance of apoptotic cells and aberrant autoimmunity.
Recently, UBA1 somatic mutations have been linked to a severe adult-onset autoinflammatory disease termed VEXAS syndrome [41]. A somatic mutation affecting methionine-41 in UBA1 results in a loss of the canonical cytoplasmic isoform of UBA1 and in the expression of a novel catalytically impaired isoform. Additionally, mutant peripheral blood cells show decreased ubiquitination and activated innate immune pathways.
Strikingly, UBA1 protein expression is found up-regulated at different time points of SARS-CoV-2 infection, whereas two deubiquitinating enzymes, USP9X and USP5, are down-regulated [33] (Supplemental Table 1). Furthermore, among the 657 proteins of the COVID autoantigen-ome, 178 have been found to be affected by ubiquitination (Fig. 10). They are most significantly associated with RNA metabolism and cellular response to stress. In addition, ubiquitination affects proteins involved in signaling by Rho GTPase, RNA splicing, translation, protein folding, nonsense-mediated decay, DNA damage stress-induced senescence, and the cytoskeleton. These findings underline the extensive involvement of ubiquitination in viral infection.
UBA1 is encoded by the UBA1 gene located on the X chromosome with no homolog on the Y chromosome, and more importantly, UBA1 can escape X-chromosome inactivation. UBA1 appears to be protected against chromosome-wide transcriptional silencing by a chromatin boundary flanked by histone H3 modifications and CpG hypomethylation [42]. In human female fibroblasts, UBA1 mRNA is detected from both the active and inactive X chromosomes, and UBA1 is expressed in a large panel of somatic cell hybrids retaining inactive X chromosomes [43]. In human endothelial cells from dizygotic twins, UBA1 and a few other X-chromosome encoded proteins are expressed at higher levels in female cells [44]. UBA1 expression is estimated to be ~60% from X-active alleles, 30% biallelic, and 10% from X-inactive alleles [45]. [...] near equal to that of the active allele [46]. X-inactivation and escape may enhance phenotypic differences between females and males and may also enhance variability within females due to mosaicism from cells with the X-maternal or X-paternal inactivated and to a variable degree of escape from X-inactivation [46].
Aging, which is associated with telomere shortening, can relax X-inactivation and force global transcriptome alterations [47], which may lead to gene escape and altered expression of UBA1. Therefore, dysfunction of UBA1 due to X-inactivation escape may predispose women, particularly aging women, to increasing dysfunctional regulation of apoptosis and aberrant autoimmunity.

Considerations for vaccine design based on Spike-protein via viral vectors or mRNAs
To understand the various rare but reported side effects from the currently available viral vector- and mRNA-encoded S-protein COVID vaccines, we searched for autoAgs that may interact with the spike protein of SARS-CoV-2 and found 15 autoAg candidates (Table 2). Of these, CALU, ESYT1, MOV10, and MARCKS may also interact with many other SARS-CoV-2 proteins as discussed earlier. Curiously, at least 2 of these are associated with blood clotting problems, and 5 are implicated in neurological disorders (Table 2). For example, CALU (calumenin) is a calcium-binding protein that is expressed at high levels in the heart, placenta, and skeletal muscle. CALU is associated with pharmacodynamics and response to elevated platelet cytosolic Ca2+, platelet degranulation, and coumarin/warfarin resistance. Warfarin is an anticoagulant (blood thinner) used to treat blood clots such as deep vein thrombosis and pulmonary embolism and to prevent stroke in people with heart problems such as atrial fibrillation or valvular heart disease, or in people with artificial heart valves. Although largely speculative at present, these potential S-protein-interacting autoAgs may provide partial explanations for the rare hematological, neurological, and muscular side effects reported for the currently available COVID vaccines (Table 2). Although it is known that S proteins are synthesized intracellularly following vaccination with mRNAs or viral vectors, many of the precise molecular steps remain unknown. In particular, how do these newly synthesized S proteins fold, and are they glycosylated differently depending on the cell type that takes up the mRNA or the viral vector? How does the newly synthesized S protein interact with other host cell components before being processed (or degraded) and presented to immune cells? For example, could the nascent S proteins interact with CALU or ESYT1 to cause blood clotting problems, and could S protein interaction with HSPA5 contribute to fungal infection outbreaks as seen in India? These and many other questions await further investigation. This is of interest because mRNA and vector-based vaccines make use of a variety of cell types in vivo to produce the immunogen, whereas recombinant protein-based vaccines introduce the ex vivo prepared immunogen directly to the immune system.
bioRxiv preprint doi: https://doi.org/10.1101/2021.07.30.454526; this version posted August 4, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
In addition, this study identified a large number of autoAg candidates that are crucial for vector-based or mRNA vaccine action, including translation, RNA processing and metabolism, vesicles and vesicle-mediated transport, and protein processing and transport (Figs. 2-6). For example, the master autoantigen-ome contains 56 ribosomal proteins, 16 eukaryotic translation initiation factors, 16 aminoacyl-tRNA synthetases/ligases, and 6 translation elongation factors, all of which are essential actors in translating mRNAs into proteins. There are also many autoAgs related to protein folding and post-translational protein modification, although it is not clear whether the S proteins are folded and post-translationally modified before being processed and presented to immune cells in the currently used mRNA or vector vaccines for COVID-19. These potential autoAgs may offer clues to understanding the observed rare adverse events and should help guide the future development of even safer vaccines.

Conclusion
In this report, we compiled a master autoantigen-ome of 751 potential autoAgs, 657 of which are affected in SARS-CoV-2 infection, and 400 of which are confirmed autoAgs in a wide variety of autoimmune diseases and cancer. Our proposed model (Fig. 1) provides a plausible explanation for how a cascade of molecular changes associated with viral infection leads to cell stress, apoptosis, and subsequent autoimmune responses. The large number of autoAg candidates associated with SARS-CoV-2 infection provides a mechanistic rationale for the close monitoring of autoimmune diseases that may follow the COVID-19 pandemic. In addition, the coding gene characteristics of autoAgs described in this study provide further insights into the genetic origination of autoAgs. The significance of ubiquitination in apoptotic cell clearance and protein turnover and the X-linked escape expression of UBA1 might explain, in part, the predisposition of aging women to autoimmune diseases.

Funding Statement
This work was partially supported by Curandis, the US NIH, and a Cycle for Survival Innovation Grant (to MHR). MHR acknowledges NIH/NCI R21 CA251992 and MSKCC Cancer Center Support Grant P30 CA008748. The funding bodies were not involved in the design of the study and the collection, analysis, and interpretation of data.
Competing interest statement
JYW is the founder and Chief Scientific Officer of Curandis. MHR is a member of the Scientific Advisory Boards of Trans-Hit, Proscia, and Universal DX, but these companies have no relation to the study.

Authors' contributions
JYW conducted the study and wrote the manuscript. MWR and VBR assisted with the study and manuscript preparation. MHR consulted on the study and edited the manuscript. All authors have approved the manuscript.
Fig. 2. The master autoAg-ome of 751 DS-affinity proteins identified from 6 cell types forms a highly interacting connected network. Lines represent protein-protein interactions with the highest confidence cutoff. Colored proteins are associated with translation (104 proteins, red), RNA processing (120 proteins, pink), protein folding (53 proteins, blue), vesicle-mediated transport (141 proteins, green), chromosome organization (76 proteins, yellow), regulation of cell death (110 proteins, dark purple), and apoptosis (46 proteins, brown).
Fig. 3. Protein interaction network of the 400 confirmed autoAgs. Lines represent protein-protein interactions with the highest confidence. Colored proteins are associated with translation (57 proteins, red), RNA processing (65 proteins, pink), vesicle-mediated transport (89 proteins, green), response to stress (125 proteins, blue), regulation of cell death (74 proteins, amber), and apoptosis (28 proteins, brown).
Fig. 5. The COVID autoantigen-ome of 657 autoAg candidates. Lines represent protein-protein interactions with the highest confidence level. Colored proteins are associated with translation (87 proteins, red), RNA processing (103 proteins, blue), protein folding (51 proteins, pink), symbiont process (78 proteins, yellow), vesicle-mediated transport (125 proteins, green), and response to stress (161 proteins, brown).
Fig. 6. COVID-affected autoAgs that are found up-regulated only, down-regulated only, or interacting with SARS-CoV-2 proteins. Note the significant enrichment of proteins associated with translation, RNA processing and splicing, and other processes.
Fig. 9. Common autoAgs identified from all six cell types examined in this study. Colored are proteins associated with viral infection (13 proteins, red), regulation of apoptotic process (17 proteins, amber), response to stress (22 proteins, blue), and apoptosis (8 proteins, brown).