ABSTRACT
There is a strong need for procedures that enable context- and application-dependent validation of antibodies. Here we describe a high-throughput approach for the detailed assessment of the selectivity of antibodies in plasma by PLasma Immunocapture Mass Spectrometry (PLIMS). The utility of PLIMS is demonstrated by determining the enrichment profiles of 157 antibodies targeting 120 proteins in EDTA plasma. Applying four classification categories (ON-target, CO-target, OFF-target and NO-target), it was found that 60% (44/60) of antibodies directed against denoted plasma proteins qualified for plasma assays. Among these, 85% (60/71) co-enriched another protein besides their respective target. As shown for several antibodies against IGFBP2, PLIMS was furthermore capable of describing known and exploring novel protein complexes in plasma. In summary, PLIMS provides detailed insights into antibody selectivity in the context of plasma and will thus contribute as a valuable procedure towards the generation of more reliable affinity-based plasma proteomics data.
MAIN TEXT
Antibodies are important tools used in a wide range of assays within life science, but there is growing awareness of the importance of assessing the quality of the data generated with them (1). The recently formed International Working Group for Antibody Validation (IWGAV) has therefore proposed five strategies to assess the experimental performance of antibodies (2). The use of affinity reagents is indeed essential for a sensitive analysis of proteins in plasma or serum. Here, we describe a method using PLasma Immuno-capture Mass Spectrometry (PLIMS) to qualify antibodies for their use in plasma proteomic assays (3). Similar efforts have been made to evaluate the performance of antibodies for immunoprecipitation (IP) in cell lysates (4), but apart from a few studies focused on specific targets, there are no large, systematic studies applying immuno-capture MS (IMS) for antibody validation in plasma (5, 6).
For PLIMS, we developed a systematic approach for the validation of 157 antibodies (targeting 120 proteins) in plasma based on the workflow depicted in Fig. 1a. The antibodies were generated in different species, and in their selection priority was given to antibodies targeting proteins known to be part of the plasma proteome (Supplementary Excel Table). To evaluate the PLIMS procedure, we started with 9 antibodies obtained from commercial ELISA kits for plasma proteins. The target concentrations ranged from μg/ml (C2) to low ng/ml (KLK3; see Supplementary Fig. 1a-d and Supplementary Note 1). In order to explore the performance of different antibodies raised against a common antigen, 25 of the 120 proteins (21%) were targeted by more than one antibody (Supplementary Fig. 2). In the main study, the majority of binders (63%, N=99) were raised against target proteins previously detected ‘in plasma’ (N=77, PaxDB, http://pax-db.org/) or predicted to be ‘extracellular’ (N=22, the Gene Ontology Consortium, http://www.geneontology.org/, Fig. 1b). Proteins annotated as ‘cellular’ (37%, N=58) were expected to be less abundant or absent in plasma. As a reference for protein abundance in plasma, we considered the entries of the PaxDB plasma integrated dataset (7) (Fig. 1b, Supplementary Excel Table).
Mass spectrometry (MS) provides in-depth information about the protein content of a sample. For PLIMS, the discrimination between antibody-captured and contaminant proteins is an important aspect, because hundreds to thousands of proteins can be identified in a single immuno-capture experiment (8). This calls for a careful analysis and interpretation of data from PLIMS assays, in which many proteins other than the intended target can be identified in the same range of spectral counts or precursor intensities. As described in the context of cell lysates, it is essential to compare the outcome of several experiments, including negative controls or unrelated antibodies (4, 9, 10). The described PLIMS experiments were performed in separate batches, and to reduce antibody and sample consumption, we applied an enrichment score to assess specificity. As explained in more detail in Supplementary Note 1, we combined MaxLFQ label-free quantification with z-score-based population statistics. Data analysis performed per batch showed that hundreds of proteins were identified in each batch (Supplementary Note 1). Non-specific background contaminant proteins were highly similar within and between batches.
We also applied heat treatment to enhance epitope retrieval in plasma, as used in multiplexed assays (11), so that such conditions were also covered by the validation (see Supplementary Note 1). We found that the non-specific background differed between assays that used heat-treated and untreated plasma (Supplementary Fig. 3-4). PLIMS experiments performed with heat-treated (denoted HEAT) and untreated (denoted NO HEAT) plasma were therefore analysed separately, and z-scores were assigned to each protein identified per PLIMS experiment. The z-score indicates the distance, in units of standard deviation, between the Label Free Quantification (LFQ) intensity in a specific PLIMS assay and the mean of the LFQ intensities in the population of hundreds of experiments. This allowed us to identify proteins that were captured and enriched by each antibody (Fig. 1a, Fig. 2a-b, Supplementary Fig. 1e). We considered a protein enriched when its LFQ intensity exceeded the population mean by more than 3 standard deviations (z-score > 3). The antibodies were then classified according to the following categories: (i) ON-target, when a z-score > 3 was only assigned to the expected target; (ii) CO-target, if additional proteins were detected with a z-score > 3 alongside the expected target. In contrast to these supportive categories, antibodies did not qualify for plasma assays if (iii) only proteins other than the expected target were enriched with a z-score > 3 (denoted OFF-target), or if (iv) the z-scores of all detected proteins were < 3 (denoted NO-target). As shown in Fig. 1c and Supplementary Excel Table 1, the classification of 71 out of 153 antibodies (46%) was supportive. When assessing only antibodies targeting proteins previously detected in plasma by MS, the rate increased to 59% (Table 1). It is noteworthy that failed qualification could be due to limited affinity of the antibody, limited assay sensitivity, or absence of the target in the pools of plasma used, which were derived from healthy donors.
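The four categories can be expressed as a small decision rule. The following is a minimal sketch, not the authors' implementation: the function name `classify_antibody`, the dictionary input and the example z-scores are assumptions made for illustration. Only the threshold of 3 and the category names come from the text.

```python
from typing import Dict

Z_THRESHOLD = 3.0  # enrichment cut-off stated in the text (z-score > 3)


def classify_antibody(z_scores: Dict[str, float], target: str,
                      threshold: float = Z_THRESHOLD) -> str:
    """Assign one of the four PLIMS categories to a single antibody.

    z_scores maps protein names to the z-scores obtained in the PLIMS
    experiment of this antibody (hypothetical input format).
    """
    # Proteins regarded as enriched by this antibody
    enriched = {protein for protein, z in z_scores.items() if z > threshold}

    if not enriched:
        return "NO-target"   # nothing enriched at all
    if target not in enriched:
        return "OFF-target"  # only proteins other than the target enriched
    if enriched == {target}:
        return "ON-target"   # the expected target is the only enriched protein
    return "CO-target"       # target enriched together with other proteins


# Illustrative values only, not measured data
example = {"IGFBP2": 5.1, "IGF1": 4.2, "ALB": 0.3}
print(classify_antibody(example, target="IGFBP2"))
```

ON-target and CO-target are the supportive categories, so an antibody would qualify for plasma assays when the function returns either of these two labels.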
Current options for assessing antibodies for plasma assays include protein arrays (12) or Western blot (WB) (13). For both assays, a surplus of antibodies is diluted in a solution and applied onto supports that present the antigens. Hence, there is no competition for binding sites between potential on- and off-targets. The composition of plasma (90% of the protein content is assigned to 20 proteins) further poses a challenge for WB in terms of resolution and separation efficiency. Nevertheless, we compared the classifications obtained with PLIMS and WB and found that the assessment of 13 out of 104 antibodies (12%) provided supportive evidence by both methods (Fig. 1d-e, Table 2). For antibodies raised against plasma proteins, the success rate for PLIMS (53%) was, however, higher than for WB (32%) (Table 2). When considering cellular proteins, the success rates were more similar: 35% for PLIMS and 31% for WB. For WB, however, uncertainty remains: even bands detected at the predicted molecular weight could still represent the recognition of off-targets. By contrast, PLIMS provides an unequivocal identification of the target and could resolve ambiguous WB results (see C1orf64, CEP162, E2F7, Supplementary Excel Table). Moreover, we found application-dependent recognition for five antibodies generated for IL6R. All five were classified as target-specific using protein arrays; however, only three detected IL6R in plasma (Supplementary Excel Table). Acknowledging the requirement of identifying high-responding peptides for the MS analysis, the demands on instrumentation infrastructure and data analysis, as well as the increased consumption of antibody (1 μg) and sample (100 μl) per PLIMS assay, our data show that PLIMS provides more detailed and reliable information than WB.
Applying immuno-capture before MS analysis has been reported to improve the sensitivity of protein quantification (14-16). Hence, PLIMS may also be used to qualify antibodies for lower-abundance proteins, including those that presently remain undetectable by other MS protocols. With PLIMS, 9 extracellular proteins (e.g. CXCL8, TGFA, BDNF) and 18 cellular proteins (e.g. S100PBP, CASP2, STIM1) were detected that were not found in the PaxDB plasma integrated dataset (Fig. 1e, Supplementary Excel Table). Peptides from such findings can now be applied to develop targeted MS assays (SRM or PRM). For almost 50% of the polyclonal antibodies validated in our studies, the identified peptides aligned with the sequence of the protein fragments used to generate the antibodies (Supplementary Fig. 2). This will allow the use of isotope-labelled versions of these fragments as standards for quantification (17). Such targeted MS assays can indeed serve as a cross-platform validation method for antibody-based discoveries.
Besides evaluating antibodies in terms of on-target selectivity, we observed the possibility of using PLIMS to study co-enrichments. Antibodies for CCL16 (HPA042909) and SERPINA4 (HPA002869) enriched, besides their intended targets, CCL18 and SERPINA6, respectively. Both co-targets belong to the same family as the expected target and share a high sequence homology (Supplementary Note 2). For IGFBP2, three antibodies were tested (HPA077723, HPA045140, HPA004754), of which the latter two were raised against the same antigen. As shown in Fig. 2b, HPA077723 and HPA045140 both enriched IGFBP2 as well as the known interactors IGF1 and IGF2. In addition, the novel interactors DERA and BCHE were detected. For BCHE and IGF1, an interaction was previously hypothesised (18, 19), and PLIMS now provides supportive evidence in plasma (Supplementary Fig. 6). For the third binder (HPA004754), IGFBP2 was only enriched upon prior heat treatment of plasma (Fig. 2b). The differential performance of polyclonal antibodies raised against the same antigen (HPA045140, HPA004754) confirms the necessity to investigate each of the different batches and lots of antibodies. For CD5L, we found evidence for known (IgM) and new (C1orf64 with Calm1) components of protein complexes (Supplementary Table 2), with PLIMS indicating an interaction between CD5L and IgM’s J-chain (20).
In summary, PLIMS enabled the systematic assessment of 157 antibodies, of which 74 antibodies were validated for plasma assays. We included 127 antibodies targeting proteins with a previously described disease association, and using PLIMS, we detected analytes circulating at low abundance as well as those found in complexes. We propose to use PLIMS and the classification scheme as a standard approach for the assessment and selection of antibodies for proteomics assays in plasma.
Appropriate validation schemes need to apply experimental conditions resembling those of the intended application. Given the context- and assay-dependent selectivity of antibodies, PLIMS provides the required detailed insights when choosing antibodies for the development of solid-phase immunoassays as well as when assessing affinity reagents emerging from highly multiplexed screening approaches. The large number of proposed candidate biomarkers that did not reach clinical use highlights again the need to devote more attention to validation (21). Lack of robustness of the analytical method is one of the major pitfalls that make it difficult to proceed from discovery to targeted validation. PLIMS can serve as an important tool to qualify affinity reagents for their use in plasma proteomics and empowers the development and application of specific, robust and reliable immunoassays (1).
MATERIAL AND METHODS
2.1 Sample collection
Human EDTA plasma from healthy individuals (50% females) was obtained from Seralab (Sera Laboratories International Ltd). Aliquots of plasma (0.5 mL) were stored in cryogenic vials at −80 °C and thawed at 4 °C before use.
2.2 Target gene selection
Information about the target proteins, their functions and their involvement in diseases was collected from the literature (https://www.ncbi.nlm.nih.gov/pubmed), the Gene Ontology Consortium (GO, http://www.geneontology.org/), the Human Protein Atlas (HPA, http://www.proteinatlas.org/), the Early Detection Research Network (EDRN, https://edrn.nci.nih.gov/) and the Protein Abundance Database (PaxDb, http://pax-db.org/). Proteins were classified as Cellular or Extracellular. A protein was considered extracellular when one or more of the following terms appeared in its GO classification: extracellular region (GO:0005576); extracellular space (GO:0005615); extracellular exosome (GO:0070062); proteinaceous extracellular matrix (GO:0005578) (see columns “GO CC Complete” and “Summary of GO CC” in the Excel sheet “Protein Annotation GO”).
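The localization rule above amounts to a set-membership test on GO cellular component identifiers. The snippet below is a minimal sketch of that rule; the function name `classify_localization` and the list-of-identifiers input are assumptions for illustration, while the four GO identifiers are those listed in the text.

```python
from typing import Iterable

# GO cellular component terms that mark a protein as extracellular (from the text)
EXTRACELLULAR_GO_TERMS = {
    "GO:0005576": "extracellular region",
    "GO:0005615": "extracellular space",
    "GO:0070062": "extracellular exosome",
    "GO:0005578": "proteinaceous extracellular matrix",
}


def classify_localization(go_cc_ids: Iterable[str]) -> str:
    """Return 'Extracellular' if at least one listed GO term is present,
    otherwise 'Cellular'."""
    if EXTRACELLULAR_GO_TERMS.keys() & set(go_cc_ids):
        return "Extracellular"
    return "Cellular"


# Hypothetical annotation lists, for illustration only
print(classify_localization(["GO:0005615", "GO:0005634"]))
print(classify_localization(["GO:0005634"]))
```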
The full list of antibodies and related information is reported in the Supplementary Excel Tables. The analysis included 157 antibodies: 15 monoclonal antibodies (10, R&D Systems; 1, HyTest Ltd.; 1;3, Atlas Antibodies and 1, Sigma-Aldrich) and 144 polyclonal antibodies developed within the Human Protein Atlas. In addition, 3 normal IgG pools from rabbit (Bethyl Laboratories), mouse and rat (both Santa Cruz Biotechnology) were included as negative controls.
2.3 Antibody coupling to magnetic beads
Covalent coupling of antibodies to magnetic beads (MagPlex, Luminex Corp.) was performed as previously described (22). Briefly, antibodies were cross-linked to carboxylated beads through a reaction with sulfo-NHS (N-hydroxysulfosuccinimide, Thermo) and EDC (carbodiimide; 10 mg, Thermo). After bead activation, antibodies diluted in MES buffer were incubated with the beads for two hours at room temperature; the beads were then washed and stored in blocking buffer at 4 °C.
2.4 Immunocapture-mass spectrometry
Aliquots of EDTA plasma were diluted in assay buffer: 0.5% w/v PVA (Sigma-Aldrich), 0.8% w/v PVP (Sigma-Aldrich), 0.1% w/v casein (Sigma-Aldrich) and 10% normal purified IgG (Bethyl Laboratories, Inc.) matching the isotype and species of the capture antibody. Samples undergoing heat treatment (HT) were heated at 56 °C in a water bath before being combined with the beads, followed by incubation overnight on a rotation shaker at 23 °C. Beads coupled with normal IgG were included for rabbit (Bethyl Laboratories, Inc.), mouse (Santa Cruz Biotechnology) and rat (Santa Cruz Biotechnology). After incubation, beads were separated from the sample, washed with PBS containing 0.03% CHAPS using a magnetic bead handler (KingFisher™ Flex Magnetic Particle Processor, Thermo Scientific) and then re-suspended in digestion buffer containing 50 mM ammonium bicarbonate (Sigma-Aldrich) and 0.25% sodium deoxycholate (Sigma-Aldrich). Proteins were reduced with 1 mM DTT (Sigma-Aldrich) at 56 °C and alkylated with 4 mM iodoacetamide (Sigma-Aldrich). Alkylation was quenched by adding 1 mM DTT. Proteins were digested using a mixture of Trypsin and LysC (Promega, USA) overnight at 37 °C. Enzyme inactivation and sodium deoxycholate precipitation were achieved by adding 0.005% TFA. Peptides were then separated from the beads, dried and re-suspended in solvent A containing 3% acetonitrile (ACN) and 0.1% formic acid (FA).
2.5 LC-MS/MS
MS analysis was performed using a Q-Exactive HF (Thermo) operated in data-dependent mode and equipped with an Ultimate 3000 RSLC nanosystem (Dionex). Samples were injected into a C18 guard desalting column and then into a 50 cm x 75 μm ID Easy spray analytical column packed with 2 μm C18 (Thermo) for RPLC. Elution was performed with a linear gradient of Buffer B (90% ACN, 5% DMSO, 0.1% FA) from 3% to 43% in 50 min at 250 nL/min. Buffer B was then increased stepwise to 45% in 5 min and to 99% in 2 min, and then held for 10 min. Buffer A for the chromatography was supplemented with DMSO (90% water, 5% ACN, 5% DMSO, 0.1% FA). Full MS scans (300-1600 m/z) were acquired at a resolution of 60,000. Precursors were isolated with a width of 2 m/z and put on the exclusion list for 60 s. The top five most abundant ions were selected for higher energy collision dissociation (HCD). Singly charged ions and ions with unassigned charge states were rejected from precursor selection. In MS/MS, a maximum ion injection time of 250 ms and an AGC target of 1E5 were applied.
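For readers who want to reproduce the elution programme, the gradient can be written as a table of time points. This is a sketch under two assumptions that are not stated in the text: the gradient starts at time 0, and each segment, including the two short increases, is a linear ramp between its end points. The names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % Buffer B) breakpoints derived from the description:
# 3% -> 43% in 50 min, -> 45% in 5 min, -> 99% in 2 min, hold for 10 min
GRADIENT = [(0.0, 3.0), (50.0, 43.0), (55.0, 45.0), (57.0, 99.0), (67.0, 99.0)]


def percent_b(t_min: float) -> float:
    """Percentage of Buffer B at time t_min, assuming linear ramps
    between consecutive breakpoints (assumption, see lead-in)."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # after the end of the programme


total_run_time = GRADIENT[-1][0]  # 50 + 5 + 2 + 10 = 67 min
print(total_run_time, percent_b(25.0))
```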
2.6 Data analysis
Shotgun MS data were searched using MaxQuant (23) with the integrated MaxLFQ algorithm. Spectra were searched against a human protein database from UniProt (http://www.uniprot.org, updated 03/17/2016, Canonical and Isoforms, 20198 hits), customized by adding the sequences of immunoglobulin chain C from RABIT, RAT and MOUSE, LysC (PSEAE) and trypsin (PIG). Settings included: two missed cleavages allowed; methionine oxidation and N-terminal acetylation as variable modifications and cysteine carbamidomethylation as fixed modification; Fast LFQ and match between runs applied; minimum number of neighbors 3; average number of neighbors 6. All 414 raw data files included in the analysis of 153 antibodies and controls were analyzed in a single session, and the LFQ intensity values obtained were used for the subsequent analysis. We considered missing values as not missing at random (NMAR), but missing because of concentrations below the limit of detection (LOD). We therefore used the value of 0 as the minimum detected intensity (single-value imputation approach) when calculating the average and standard deviation for each protein identified over the population of all experiments. Missing LFQ intensities were substituted with 1 when the log10 transformation was applied for Principal Component Analysis (PCA) and clustering analysis. When duplicate or triplicate experiments were available, we considered only proteins identified in all replicates and calculated the average of the intensities before calculating z-scores. Proteins were considered enriched when associated with a z-score > 3. To visualize the enriched proteins for each antibody, z-score values were plotted against LFQ intensity.
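The scoring steps described above (zero imputation, replicate averaging, population z-score) can be sketched as follows. This is an illustrative outline, not the authors' R code: the nested-dictionary input format, the function names and the use of the sample standard deviation are assumptions, since the text does not specify them.

```python
import math
from statistics import mean, stdev
from typing import Dict, List

Intensities = Dict[str, float]  # protein -> LFQ intensity in one experiment


def average_replicates(replicates: List[Intensities]) -> Intensities:
    """Keep only proteins identified (intensity > 0) in every replicate
    and average their intensities."""
    shared = set.intersection(
        *({p for p, v in rep.items() if v > 0} for rep in replicates)
    )
    return {p: mean(rep[p] for rep in replicates) for p in shared}


def population_z_scores(lfq: Dict[str, Intensities]) -> Dict[str, Dict[str, float]]:
    """z-score of each identified protein against the population of all
    experiments; missing intensities are imputed with 0.
    Requires at least two experiments."""
    experiments = list(lfq)
    proteins = {p for exp in lfq.values() for p in exp}
    z: Dict[str, Dict[str, float]] = {e: {} for e in experiments}
    for p in proteins:
        values = [lfq[e].get(p, 0.0) for e in experiments]  # missing -> 0
        mu, sd = mean(values), stdev(values)  # sample SD is an assumption
        for e, v in zip(experiments, values):
            if v > 0:  # only proteins identified in this experiment
                z[e][p] = (v - mu) / sd if sd > 0 else 0.0
    return z


def log10_intensity(value: float) -> float:
    """log10 transform for PCA and clustering; missing (0) is replaced by 1."""
    return math.log10(value if value > 0 else 1.0)


# Toy population: 16 experiments sharing a background protein,
# one of which additionally captures a hypothetical target
lfq = {f"exp{i}": {"ALB": 1e9} for i in range(16)}
lfq["exp0"]["IGFBP2"] = 1e6
scores = population_z_scores(lfq)
print(scores["exp0"])
```

In this toy population the protein seen in only one of 16 experiments receives a z-score above 3, while the background protein present at the same intensity everywhere receives 0, which mirrors how shared contaminants are separated from antibody-specific captures.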
Raw data produced to assess experimental conditions were analyzed using MaxQuant but without the LFQ function. For a global evaluation of column length and chromatography, the median intensity and the number of proteins were calculated for 10-12 PLIMS experiments performed in each condition. Intensity values were normalized to the median intensity when injecting the same sample on different columns and under different chromatography conditions.
Data analysis and representation were performed in R, the environment for statistical computing and graphics (24), using the packages “ggplot2”, “matrixStats” and “pheatmap”. Alignments between protein and PrEST sequences were performed using the Clustal Omega program available at EMBL-EBI. GO enrichment analysis was performed using the PANTHER Classification System (http://pantherdb.org/).
AUTHOR CONTRIBUTIONS
CF and JMS initiated, designed, and led the study with the scientific contribution and support of JL and PN. CF designed the experiments and performed the data analysis with input from SB, LSR, MI, DT and RMB and supervision by PN, JL and JMS. CF, SB, LSR and MI executed the experiments. CF and JMS wrote the manuscript with input from all authors.
COMPETING FINANCIAL INTERESTS
The authors have no conflicts of interest.
ABBREVIATIONS
- HPA, Human Protein Atlas
- LFQ, label free quantification
- SRM, selected reaction monitoring
- PRM, parallel reaction monitoring
- BCHE, Butyrylcholine esterase
- DERA, Deoxyribose-phosphate aldolase
- IGF, insulin like growth factor
- IGFBP, insulin like growth factor-binding protein
- CCL, CC chemokine ligand
- SERPIN, serine protease inhibitor
ACKNOWLEDGMENTS
We greatly thank Mathias Uhlén, Fredrik Edfors, Björn Forström, Lucia Lourido and the Biobank profiling, Affinity Proteomics and Clinically Applied Proteomics groups at SciLifeLab in Stockholm for their continuous fruitful discussion and input to the presented work. We also thank everyone at the Human Protein Atlas for their support.
The KTH Center for Applied Precision Medicine (KCAP) funded by the Erling-Persson Family Foundation is acknowledged for financial support. This work was supported by grants from Science for Life Laboratory and the Knut and Alice Wallenberg Foundation. The work leading to this publication has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement n°115317 (DIRECT), resources of which are composed of financial contribution from the European Union’s Seventh Framework Programme (FP7/2007-2013) and EFPIA companies’ in kind contribution. Funding from the Swedish Foundation for Strategic Research, Swedish Cancer Society and Swedish Research Council is gratefully acknowledged.