The HLA Ligand Atlas. A resource of natural HLA ligands presented on benign tissues

The human leukocyte antigen (HLA) complex regulates the adaptive immune response by presenting intracellular and extracellular protein content to the immune system, allowing T cells to distinguish self from foreign. A comprehensive map of both HLA class I- and class II-presented peptides from different tissues is therefore a highly sought-after resource 1, as it enables the investigation of basic immunological questions beyond the exome level. In this work, we describe the HLA Ligand Atlas, a comprehensive collection of matched HLA class I and class II ligandomes from 29 non-malignant tissues and 13 human subjects (208 samples in total), covering 38 HLA class I and 17 HLA*DRB alleles. Nearly 50% of HLA ligands have not been previously described. The generated data is relevant for basic research in diverse fields such as systems biology, general immunology, and molecular biology. Furthermore, the HLA Ligand Atlas provides essential information for translational applications by supporting the development of effective cancer immunotherapies. In particular, the characterization of HLA ligands from benign tissues is necessary to inform proteogenomic HLA-dependent target discovery approaches. This data set also provides a basis for novel insights into immune-associated processes in the context of tissue and organ transplantation and represents a valuable tool for researchers exploring autoimmunity. The HLA Ligand Atlas is publicly available both as a raw data resource and through a user-friendly web interface that allows users to quickly formulate complex queries against the data set. Both the downloadable data and the query interface are available at www.hla-ligand-atlas.org.


Introduction
Major advances in comprehensive biological analyses include the sequencing of the human genome (genomics) 2,3, the global assessment of human gene expression (transcriptomics) 4, and the study of the human proteome (proteomics) 5-7. These achievements are considered milestones that enable a multi-dimensional understanding of biological processes. In the context of the immune system, a further layer can be defined: the HLA ligandome, or immunopeptidome.
Proteins encoded by the human leukocyte antigen (HLA) gene complex present peptides on the cell surface for recognition by T cells, which can distinguish self from foreign 8,9. This mechanism plays a crucial role in adaptive immunity. Although HLA class I ligands originate primarily from intracellular proteins, their correlation with their precursors (mRNA transcripts and proteins) is poor 10-12, which limits approaches that rely solely on in silico HLA-binding predictions combined with transcriptomics and proteomics data 13,14.
Thus, direct evidence of naturally presented HLA ligands is required to prove that target peptides are visible to T cells. This is a challenge, for example, in cancer immunotherapy approaches that aim to identify optimal tumor-specific HLA-presented target antigens 15. While their discovery is assisted by the proteogenomic exploration of malignancies, a major impediment remains the selection of benign tissues as a reference for defining tumor specificity 12.
Due to the scarce availability of benign human tissue, morphologically normal tissue adjacent to the tumor (NAT) is commonly used as a control in cancer research. However, NATs pose unique challenges: they may be affected by disease and have been suggested to represent a distinct intermediate state between healthy and malignant tissues, with a pan-cancer-induced inflammatory response 16. Consequently, in this study we employed benign tissues originating from research autopsies of donors who had not been diagnosed with any malignancy and had died of other causes, an approach previously described as a surrogate source of normal tissue 16,17. Although these donors cannot be referred to as "healthy", since they may be affected by a range of non-cancerous disease processes, we designate their tissues as benign to emphasize morphological normalcy and the absence of malignancy. Similarly, the Genotype-Tissue Expression Consortium 4,18 provides an ample resource of RNA sequencing data of benign tissues originating from autopsy specimens. In accordance with this approach, we performed a large-scale mass spectrometry (MS)-based characterization of both HLA class I and class II ligands from benign human tissues obtained at autopsy, and implemented a user-friendly, web-based interface to query and access the data at www.hla-ligand-atlas.org.
The increasing importance of investigating HLA ligandomes in health and disease using MS technologies has been well recognized 1, especially to inform precision medicine 12,15,19-22 and to improve HLA-binding prediction algorithms 13,14,23-26. However, technical constraints and limited data availability have hampered progress in this field. In this context, we provide a novel, comprehensive resource comprising HLA class I and II ligandomes of 29 different benign tissue types obtained from 13 human subjects. To our knowledge, this is the first study to map peptides presented by both HLA class I and class II molecules in multiple tissues from the same human subject, allowing the investigation of new immunological questions.
Despite its unprecedented comprehensiveness, we recognize the caveat that only a limited number of human subjects are included in this data set. We envision that the scientific community will extend the knowledge of the HLA ligandome with more subjects and HLA allotypes covering additional tissues and cellular subpopulations. By integrating immunopeptidomics with proteomics and sequencing data, we anticipate a more holistic understanding of immunological processes.

Experimental model and subject details
Human tissue samples were obtained post mortem during autopsies performed for medical reasons at the University Hospital Zürich. The study was approved by the Cantonal Ethics [...] None of the subjects included in this study had been diagnosed with any malignant disease. Tissue samples were collected during autopsy, which was performed within 72 hours of death.
Tissue/organ annotation was performed by a board-certified pathologist. Tissue samples were immediately snap-frozen in liquid nitrogen.

HLA typing
High-resolution HLA typing was performed by next-generation sequencing on a GS Junior Sequencer using the GS GType HLA Primer Sets (both Roche, Basel, Switzerland) at the Department of Transfusion Medicine of the University Hospital of Tübingen. HLA typing was successful for HLA-A, -B, and -C alleles. However, HLA class II typing was only reliable for the HLA-DR locus and incomplete for the HLA-DP and -DQ loci. The subject characteristics are summarized in Table 1, which lists sex, age, the number of collected tissues, and the HLA class I and II alleles.
[...] respectively. Furthermore, the pan-HLA class II-specific antibody Tü39 was employed and was also produced in house from a hybridoma clone as previously described 29.
[...] Table 2. This information is not available for seven tissues, annotated as n.d. in Table 2.
[...] fragmentation with 30% collision energy was employed for HLA class II peptides.

Database search with MHCquant
MS data obtained from HLA ligand extracts was analyzed using the nf-core 34 [...]
Technical replicates with purities lower than 50% for HLA class I ligands and lower than 10% for HLA class II ligands were not included in the data set. From the remaining replicates, only peptides predicted to be at least weak binders against at least one of the donor's alleles were included in the data set.
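The two-stage filter described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the pipeline's actual code: the `Peptide` structure, the function names, and the binding-level labels ("strong", "weak", "none") are hypothetical, and purity is assumed here to mean the fraction of peptides in a replicate that are predicted binders for at least one donor allele. Only the 50% and 10% cut-offs and the "at least weak binder" rule come from the text.

```python
from dataclasses import dataclass

# Minimum replicate purity per HLA class, as stated in the text.
MIN_PURITY = {"I": 0.50, "II": 0.10}

# Hypothetical labels for "at least a weak binder".
BINDER_LEVELS = {"strong", "weak"}


@dataclass
class Peptide:
    sequence: str
    # Hypothetical layout: predicted binding level per donor allele.
    predictions: dict


def is_binder(peptide):
    """True if the peptide is at least a weak binder for any donor allele."""
    return any(level in BINDER_LEVELS for level in peptide.predictions.values())


def purity(replicate):
    """Assumed definition: fraction of peptides that are predicted binders."""
    if not replicate:
        return 0.0
    return sum(is_binder(p) for p in replicate) / len(replicate)


def filter_replicates(replicates, hla_class):
    """Drop low-purity replicates, then keep only predicted binders."""
    kept = set()
    for replicate in replicates:
        if purity(replicate) < MIN_PURITY[hla_class]:
            continue  # replicate excluded as a whole
        kept.update(p.sequence for p in replicate if is_binder(p))
    return kept
```

Under these assumptions, a replicate is retained when its purity is equal to or above the cut-off, since the text excludes only replicates with purities "lower than" the threshold.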

Data storage and web interface
HLA class I and class II peptides that were predicted to be HLA ligands according to the previously defined criteria were annotated with their tissue association and stored in an SQL database. A public web server was implemented that allows users to formulate queries against the database, visualizes the results, and allows data export for further analysis. The web front-end was implemented in HTML, CSS, and JavaScript based on the front-end framework Bootstrap 4. The table plugin DataTables was used to provide rapid browsing and filtering of tabular data. Interactive plots were designed using Bokeh and ApexCharts.
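To illustrate how peptides with tissue associations might be stored and queried, the following Python sketch uses an in-memory SQLite database. The text states only that an SQL database was used; the choice of SQLite, the table and column names, the helper functions, and the peptide sequences (made-up placeholders) are all assumptions for illustration and do not reflect the actual schema of the HLA Ligand Atlas.

```python
import sqlite3

# Hypothetical two-table schema: peptides, and where they were observed.
SCHEMA = """
CREATE TABLE peptide (
    id INTEGER PRIMARY KEY,
    sequence TEXT NOT NULL UNIQUE,
    hla_class TEXT NOT NULL CHECK (hla_class IN ('I', 'II'))
);
CREATE TABLE observation (
    peptide_id INTEGER NOT NULL REFERENCES peptide(id),
    donor TEXT NOT NULL,
    tissue TEXT NOT NULL,
    PRIMARY KEY (peptide_id, donor, tissue)
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO peptide VALUES (?, ?, ?)",
        [(1, "PEPTIDEA", "I"), (2, "PEPTIDEB", "I"), (3, "PEPTIDECLASSTWO", "II")],
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?)",
        [
            (1, "donor1", "liver"),
            (1, "donor2", "liver"),
            (2, "donor1", "lung"),
            (3, "donor1", "liver"),
        ],
    )
    conn.commit()
    return conn


def peptides_in_tissue(conn, tissue, hla_class):
    """Return (sequence, number of donors) for one tissue and HLA class."""
    return conn.execute(
        """
        SELECT p.sequence, COUNT(DISTINCT o.donor) AS n_donors
        FROM peptide AS p
        JOIN observation AS o ON o.peptide_id = p.id
        WHERE o.tissue = ? AND p.hla_class = ?
        GROUP BY p.sequence
        ORDER BY p.sequence
        """,
        (tissue, hla_class),
    ).fetchall()
```

Keeping observations in a separate table lets one peptide be linked to several donors and tissues without duplicating the sequence, which is one plausible way to support tissue-level queries like the one shown.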

Content and scope of the HLA Ligand Atlas data resource
To generate HLA ligandomics data of benign tissues, we employed a well-described immunoprecipitation protocol 20,39 followed by peptide sequencing via LC-MS/MS. Database search was performed using the open-source bioinformatics pipeline MHCquant, and the resulting peptide sequences were filtered according to their binding score and stored in the www.hla-ligand-atlas.org database (Figures 1A and 2A).
Overall, we have acquired HLA ligandome data from tissues obtained from 13 autopsy subjects, encompassing 29 human tissue types as detailed in Figure [...]
Apart from the query interface, the web front-end also displays various aggregate views of the data stored in the database.

Concluding remarks
In this work, we provide a comprehensive HLA class I and class II ligandomics data set that can assist in answering multifaceted questions. HLA class I and class II binding prediction algorithms can be substantially improved by increasing the number of subjects with reliable HLA ligandomics data employed for training 13,25,44. Broader coverage of infrequent and poorly studied HLA allotypes, such as HLA-A*30:01, can further increase the accuracy of binding prediction.
Furthermore, the study of non-mutated tumor-associated HLA ligands, from both canonical 12,19 and non-canonical transcripts 45,46, requires a comprehensive map of benign HLA ligandomes as a reference data set 20. Thus, direct evidence of HLA ligands by MS is required to prove their presentation and visibility to T cells. Moreover, the investigation of patient- and tissue-specific peptides, hotspots of presentation 15,47,48