Glycopeptide variable window SWATH for improved Data Independent Acquisition glycoproteomics

N-glycosylation plays an essential role in regulating protein folding and function in eukaryotic cells. N-glycan structures and occupancy can be impacted by the physiological state of cells and in disease. Sequential window acquisition of all theoretical fragment ion spectra mass spectrometry (SWATH-MS) is a powerful data independent acquisition (DIA) MS method for qualitative and quantitative analysis of glycoproteins and their glycan modifications. By separating the entire m/z range into consecutive isolation windows, DIA-MS allows comprehensive MS data acquisition and high-sensitivity detection of molecules of interest. The use of variable width DIA windows allows optimal analyte measurement, as peptide ions are not evenly distributed across the full m/z range. However, the m/z distribution of glycopeptides is different to that of unmodified peptides because of their modification with large complex glycan structures. Here, we improved the performance of DIA glycoproteomics by using variable width windows optimized for glycopeptides. This method allocates narrow windows at m/z ranges rich in glycopeptides, improving analytical specificity and performance for DIA glycoproteomics. We demonstrate the utility of the new variable window DIA method by comparing the glycoproteomes of wild type and N-glycan biosynthesis pathway deficient yeast. Our results highlight the importance of appropriately optimized DIA methods for measurement of post-translationally modified peptides.


Introduction
Protein glycosylation is a common and complex co- and post-translational modification (PTM), which provides proteomic diversity, regulates protein folding, sorting and stability, and controls many protein functions [1][2][3][4]. Different forms of glycosylation are defined by the glycosidic linkages between glycans and proteins. N-glycosylation, occurring on selected asparagine (Asn) residues, is one of the most common forms of glycosylation in eukaryotic cells [5]. N-glycans have a conserved precursor oligosaccharide, a dolichol-linked 14-monosaccharide molecule, Glc3Man9GlcNAc2 [6]. This common core is assembled on the endoplasmic reticulum (ER) membrane, catalyzed by a series of glycosyltransferases encoded by asparagine-linked glycosylation (ALG) genes [7]. Oligosaccharyltransferase (OTase) catalyzes the attachment of the precursor oligosaccharide to selected Asn residues in glycosylation sequons (N-X-S/T; X≠P) in nascent polypeptides [8]. Mutations in ALG genes can cause altered glycan structures due to the accumulation and transfer of truncated glycans, and altered site-specific glycan occupancy due to inefficient transfer of these truncated glycans by OTase [9]. Further trimming and maturation of glycans take place in the ER and Golgi with various enzymes involved, leading to diverse N-glycan structures on mature glycoproteins [10].
N-glycosylation depends on both the genetics and environment of a cell, as the N-glycosylation pathway can be influenced by the physiological state of the cell, leading to changes in the site-specific structures and heterogeneity of N-glycans across the glycoproteome [11,12]. Efficient and robust analysis of the heterogeneity of the glycoproteome is therefore critical [13]. In practice, glycoproteomic analyses have been used to better understand the molecular bases and consequences of various diseases [14][15][16], including congenital disorders of glycosylation (CDG), a family of diseases caused by defects in glycosylation biosynthesis pathways [17].
Mature glycoproteins can have an incredible diversity in the site-specific presence and structures of glycans [21]. This high diversity and relatively low intrinsic detectability of glycosylated peptides further increases the challenges in glycosylation analysis in mass spectrometry (MS) glyco/proteomic workflows. Further improvements in sensitive, robust, high-throughput, and efficient methods for site-specific analysis of glycoprotein structure and occupancy are therefore required. MS-based techniques are the current method of choice for glycoproteomic analyses, as they offer high speed, resolution, and sensitivity [22]. A wide range of MS strategies have been described for quantifying site-specific glycosylation in complex protein samples either in glycoproteomic workflows with glycans still attached to peptides [23][24][25], or in glycomic workflows that analyse glycans that have been released from glycoproteins [26], either labelled [27][28][29] or label-free [23,24,30].
A variety of MS data acquisition approaches are available for proteomic studies, depending on the experimental design and analytical goal [31][32][33]. Data Independent Acquisition (DIA) methods such as sequential window acquisition of all theoretical fragment ion spectra mass spectrometry (SWATH-MS) are especially useful for comprehensive measurement of all detectable analytes [34]. DIA/SWATH-MS has been useful in analysis of complex proteomes in many diverse systems, including investigations of fundamental cell biology, biomarker discovery, and analysis of proteoform diversity [34][35][36]. Glycoproteomic analyses investigating the role of the glycosylation machinery in glycoprotein synthesis and identification of cancer biomarkers have also successfully used DIA/SWATH-MS [23,37,38].
In a standard implementation of DIA/SWATH-MS, the m/z range is broken into consecutive 25 Da fixed width isolation windows (swaths) with a small overlap [34]. However, the distribution of peptides across the m/z range is uneven, leading to unbalanced precursor densities in each window, reducing analytical performance. The spectral complexity and chances of incorrect peak measurement in MS/MS spectra and extracted fragment ion chromatograms (XICs) are higher for the windows containing more precursor ions, while windows with few precursor ions do not make efficient use of MS instrument cycle time [39,40]. These problems have been overcome by the use of variable window DIA/SWATH-MS, an acquisition strategy that uses DIA isolation windows of different widths to equalize the average distribution of precursor ions in each window. This approach improves selectivity, detection confidence, and quantification reliability [39,40]. Several strategies for generating variable windows for DIA/SWATH-MS have been used to allow improved identification, profiling, and quantification of peptides and their modifications [40][41][42][43]. However, the high mass of glycopeptides means these previous approaches are not optimal for DIA/SWATH-MS glycoproteomic analyses. This is especially problematic for DIA/SWATH-MS measurement of glycopeptides that happen to fall in the same window, as differently glycosylated forms of the same underlying peptide share most fragment ions. Here, we describe a glycopeptide-focused variable window DIA/SWATH-MS method that improves measurement of N-glycosylated peptides in complex proteomes.

Cell Wall Protein Sample Preparation
Cell wall proteins were enriched as previously described [23,44]. In brief, yeast cells were grown in YPD at 30 °C with shaking until the OD600nm reached 1.0 ± 0.1. Cells were harvested by centrifugation and completely lysed by vortexing with glass beads. Proteins were denatured, and cysteines were reduced with dithiothreitol and alkylated with acrylamide. Proteins covalently linked to the yeast polysaccharide cell wall were isolated by centrifugation, and non-covalently linked contaminating proteins were removed by washing with strongly denaturing buffer.
Proteins in the insoluble cell wall fraction were digested by trypsin and peptides and glycopeptides were desalted using C18 ZipTips (Millipore).

Variable Window SWATH Method Generation
A variable window SWATH method designed for glycopeptide measurement (gpvwSWATH) was generated by the program SWATH variable window calculator Version 1.0 (SCIEX) using a previous yeast cell wall MS proteomics dataset [23]. Peptides with MS/MS spectra containing HexNAc and Hex oxonium ions were considered as glycopeptides [45]. Glycopeptides in the m/z range of 400-1250 were selected as input for generating the gpvwSWATH design. In addition, a traditional variable window SWATH (vwSWATH) method was generated using all peptides within the same m/z range. Both variable window SWATH methods had 34 windows over the m/z range of 400-1250 Da, consistent with the fixed window SWATH. Window overlap was 1 Da, and the minimum window width was set as 3 Da.
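The oxonium-ion criterion above can be illustrated with a minimal Python sketch that flags an MS/MS peak list containing both diagnostic ions. The m/z values used (204.0867 for HexNAc, 163.0601 for Hex) are the commonly quoted singly charged oxonium ions; the 0.02 Da tolerance, the function names, and the example peak lists are assumptions for illustration, not settings taken from the original analysis.

```python
# Commonly quoted singly charged oxonium ion m/z values.
HEXNAC_OXONIUM = 204.0867
HEX_OXONIUM = 163.0601


def has_peak(peaks, target_mz, tol=0.02, min_intensity=0.0):
    """Return True if any (mz, intensity) peak lies within tol of target_mz."""
    return any(abs(mz - target_mz) <= tol and inten > min_intensity
               for mz, inten in peaks)


def is_glycopeptide_spectrum(peaks, tol=0.02):
    """Flag a spectrum that contains both HexNAc and Hex oxonium ions."""
    return (has_peak(peaks, HEXNAC_OXONIUM, tol)
            and has_peak(peaks, HEX_OXONIUM, tol))


# Hypothetical peak lists as (m/z, intensity) pairs.
glyco = [(163.0603, 1200.0), (204.0869, 5400.0), (366.1400, 800.0)]
plain = [(175.1190, 900.0), (204.1340, 300.0)]

print(is_glycopeptide_spectrum(glyco))
print(is_glycopeptide_spectrum(plain))
```

The precursor m/z values of spectra flagged in this way would then form the input list for the window calculator.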
Approximately 2 µg peptides were desalted on an Agilent C18 trap (300 Å pore size, 5 µm DIA/SWATH-MS as previously described [23,46], using identical LC parameters as for DDA analysis. For fixed window or variable window SWATH-MS, an MS-TOF scan was performed for 0.05 s, followed by MS/MS for 0.1 s for each isolation window.

Data Analysis
A previously generated ion library was used for glycopeptide quantification [23]. This ion library contained eight possible cell wall glycopeptides, and included fragment ions of nonglycosylated sequon-containing peptides and parent ion masses of every possible glycopeptide with glycan structures ranging from GlcNAc2 to Man15GlcNAc2. Peptide measurement was performed with PeakView as previously described [23]. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE [47] partner repository with the dataset identifier PXD015043. The abundance of each peptide was measured by summing the integrated areas of up to six fragment ions. Low-quality data were eliminated using a Python script based on the co-generated FDR file as previously described [48], with only peptide intensity values with a corresponding FDR less than 0.01 accepted. The intensity of each peptide glycoform was normalized against the sum of all glycoforms of that peptide. Statistical analyses of peptide intensity differences between wild type and alg3∆ were performed using t-tests, with p < 0.05 considered to be significant. All experiments were performed in biological triplicate.
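The filtering and normalization steps described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the script referenced in [48]: the row layout, the function name, and the example intensities are hypothetical, and only the FDR threshold of 0.01 comes from the text. The t-test step is omitted.

```python
from collections import defaultdict

FDR_CUTOFF = 0.01  # threshold stated in the text


def normalize_glycoforms(rows, fdr_cutoff=FDR_CUTOFF):
    """rows: iterable of (peptide, glycoform, intensity, fdr) for one sample.

    Returns {(peptide, glycoform): fraction of that peptide's summed signal},
    using only measurements with FDR below the cutoff.
    """
    kept = [(p, g, i) for p, g, i, fdr in rows if fdr < fdr_cutoff]
    totals = defaultdict(float)
    for p, _, i in kept:
        totals[p] += i
    return {(p, g): i / totals[p] for p, g, i in kept if totals[p] > 0}


# Hypothetical measurements for a single sample.
rows = [
    ("Gas1-N40", "Man9GlcNAc2", 6000.0, 0.001),
    ("Gas1-N40", "Man10GlcNAc2", 2000.0, 0.005),
    ("Gas1-N40", "Man8GlcNAc2", 2000.0, 0.002),
    ("Gas1-N40", "Man5GlcNAc2", 500.0, 0.20),  # fails the FDR filter
]
fractions = normalize_glycoforms(rows)
print(fractions)
```

Normalizing within each peptide makes glycoform proportions comparable between samples even when the total amount of that peptide differs.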

Generating a glycopeptide focussed variable window SWATH method
A general approach to generating a variable window SWATH method is to use the distribution of m/z values and intensities of precursor ions of the analytes observed in LC-MS/MS analysis of a comparable experimental sample. These data can be used as input to a variable window calculator [42,43]. The SWATH variable window calculator allocates window widths to obtain a roughly equal density of precursor ions in each window. We generated two variable window SWATH methods based on input data from LC-MS/MS proteomic analysis of yeast cell wall peptides and glycopeptides from wild type and several N-glycosylation pathway deficient yeasts [23].
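The idea of equalizing precursor density can be sketched with a simple equal-count (quantile) split. This is not the algorithm of the SCIEX calculator, whose internals are not described here; it is a minimal approximation that ignores precursor intensity. The function name and the synthetic precursor list are assumptions, while the 1 Da overlap and 3 Da minimum width follow the values given in the methods.

```python
def variable_windows(precursor_mzs, n_windows, mz_min, mz_max,
                     min_width=3.0, overlap=1.0):
    """Split [mz_min, mz_max] into n_windows holding roughly equal ion counts."""
    if n_windows * min_width > mz_max - mz_min:
        raise ValueError("range too narrow for this many windows")
    mzs = sorted(m for m in precursor_mzs if mz_min <= m <= mz_max)
    if not mzs:
        raise ValueError("no precursors in range")
    edges = [mz_min]
    for k in range(1, n_windows):
        # k-th quantile of the observed precursor m/z values
        q = mzs[min(len(mzs) - 1, k * len(mzs) // n_windows)]
        # keep every window at least min_width wide, including later ones
        lo = edges[-1] + min_width
        hi = mz_max - (n_windows - k) * min_width
        edges.append(min(max(q, lo), hi))
    edges.append(mz_max)
    # extend each lower edge by the overlap, except for the first window
    return [(edges[i] - (overlap if i else 0.0), edges[i + 1])
            for i in range(n_windows)]


# Synthetic precursor list: sparse at low m/z, dense between 900 and 1200.
mzs = ([400.0 + 40.0 * i for i in range(10)]
       + [900.0 + 3.3 * i for i in range(90)])
windows = variable_windows(mzs, n_windows=5, mz_min=400.0, mz_max=1250.0)
for lo, hi in windows:
    print(f"{lo:.1f} - {hi:.1f} (width {hi - lo:.1f})")
```

With this synthetic input the first window spans the whole sparse low m/z region, while the remaining windows are narrow and sit in the dense region.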
The first variable window SWATH method (vwSWATH) was optimized for all detectable peptides in the dataset [23]. Based on the traditional variable window SWATH generation approach, all MS1 peptide ions within the m/z range of 400 to 1250 Da were used (Fig. 1A). A pronounced uneven distribution of ions was observed with a peak density of precursor ions at ~550 m/z and the bulk of ions falling between 400-800 m/z. Splitting the full m/z range into 34 windows of variable width produced windows ranging from 13.5 to 87.3 m/z in width (Table 1). Most windows fell in the ion dense region between an m/z of 400 to 800, while windows outside of this region were much broader. This distribution is typical of LC-MS/MS peptide profiles.
Due to the presence of large and complex glycans in addition to the peptide backbone, glycopeptides are typically larger than peptides and have a different average m/z distribution.
Variable window SWATH methods optimized for peptides are therefore unlikely to be optimal for measurement of glycopeptides. We therefore next designed a variable window SWATH method optimized for measurement of glycopeptides (gpvwSWATH). We used the same previous analysis used for vwSWATH generation [23], and identified yeast cell wall glycopeptides in an unbiased and sensitive manner by selecting MS/MS spectra that contained glycan-specific oxonium fragment ions from HexNAc and Hex monosaccharides.
Glycopeptide ions within the m/z range of 400 to 1250 Da were then used to generate the gpvwSWATH method, with 34 windows across the m/z range of 400 to 1250 Da (Fig. 1B). The m/z distribution of glycopeptides was even less uniform than that of peptides, leading to widely divergent window widths. As almost all glycopeptide precursor ions were larger than 800 m/z, windows at low m/z values were often wider than 200 Da (Table 1), while windows between 900-1200 m/z were narrow due to the high glycopeptide precursor ion density in this region.
The shift of the precursor ion dense region from the centre of the mass range towards higher m/z values is consistent with attachment of glycans substantially increasing the average m/z distribution of glycopeptides.
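The size of this shift can be estimated with simple arithmetic. The sketch below adds monoisotopic residue masses for hexose (162.0528 Da) and N-acetylhexosamine (203.0794 Da) to a peptide mass and converts to m/z. The 1500 Da peptide mass is a hypothetical value chosen only for illustration, and the function name is an assumption.

```python
HEX = 162.0528      # monoisotopic residue mass of a hexose (e.g. mannose), Da
HEXNAC = 203.0794   # monoisotopic residue mass of N-acetylhexosamine, Da
PROTON = 1.00728    # mass of a proton, Da


def glycopeptide_mz(peptide_mass, n_man, charge, n_glcnac=2):
    """m/z of a peptide carrying Man(n_man)GlcNAc(n_glcnac) at a given charge."""
    neutral = peptide_mass + n_glcnac * HEXNAC + n_man * HEX
    return (neutral + charge * PROTON) / charge


PEPTIDE_MASS = 1500.0  # hypothetical peptide backbone mass, Da

for z in (2, 3):
    bare = (PEPTIDE_MASS + z * PROTON) / z
    man9 = glycopeptide_mz(PEPTIDE_MASS, 9, z)
    man10 = glycopeptide_mz(PEPTIDE_MASS, 10, z)
    print(f"z={z}: peptide {bare:.2f}, Man9 {man9:.2f}, "
          f"Man10 {man10:.2f}, spacing {man10 - man9:.2f}")
```

Glycoforms differing by one mannose are separated by about 81 m/z at charge 2+ and about 54 m/z at 3+. Both spacings are smaller than the widest vwSWATH window reported above (87.3 m/z), so adjacent glycoforms can share a window when windows in the glycopeptide region are broad.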

Glycopeptide optimized variable window SWATH improves DIA glycopeptide measurement
We tested the performance of our glycopeptide-optimized variable window SWATH method in measuring the glycosylation status of cell wall glycoproteins from yeast cells with defects in the N-glycosylation biosynthetic machinery. Glycosyltransferases encoded by ALG genes are responsible for assembly of the lipid linked oligosaccharide (LLO) donor substrate for N-glycosylation. ALG3 encodes the Alg3p α-1,3-mannosyltransferase, which catalyzes the attachment of the sixth mannose to the growing LLO. alg3∆ mutant yeast cells accumulate Man5GlcNAc2 [49], resulting in truncated N-glycan structures on cell wall glycoproteins [23].
We used three different SWATH methods, fixed window SWATH (hereafter SWATH), vwSWATH, and gpvwSWATH, to measure peptides and glycopeptides from cell wall proteins from wild type and alg3∆ yeast. (Fig. 2A). These glycopeptides were also not resolved by LC retention time. It was therefore not possible to distinguish these glycopeptides by vwSWATH, although the overall signal intensity measured by this method was high because of the summed signal of the two glycopeptide forms (Fig. 2B). Also consistent with this lack of glycoform resolution, the oxonium ion intensities from vwSWATH were more than double those measured by the other two methods. The 25 Da window in the SWATH method was theoretically small enough to distinguish glycan structures on the same peptide. However, the window still contained signals from other co-eluting peptides, and narrower windows will therefore improve specificity. This would be particularly relevant for mammalian glycans, where mass differences between glycoforms can be much smaller than in high mannose yeast glycans.
In alg3∆ yeast, glycan biosynthesis stops after five mannoses are added to the LLO, leading to accumulation of the Man5GlcNAc2 LLO [9]. This incomplete glycan is still transferred to protein substrates and can be further modified, forming various glycan structures on mature glycoproteins [9]. We tested the ability of gpvwSWATH, vwSWATH, and fixed window SWATH to distinguish site-specific glycan structures in alg3∆ and wild type yeast. Previous work has established that in alg3∆ yeast, Gas1-N40 is mostly modified with Man5GlcNAc2, with some Man6GlcNAc2 and Man7GlcNAc2 also present [23]. In contrast, Gas1-N40 in wild type yeast is glycosylated with high occupancy, and mostly carries the Man9GlcNAc2 glycan structure, as well as some Man10GlcNAc2 and Man8GlcNAc2 [23].
We found that all three SWATH methods successfully established that alg3∆ yeast had predominantly Man5GlcNAc2, and wild type yeast mainly Man9GlcNAc2 (Fig. 3A). However, the apparent intensities of Man9GlcNAc2 and Man10GlcNAc2 glycans at Gas1-N40 as measured by vwSWATH were identical (Fig. 3A). This incorrect assignment was due to both of these glycopeptides falling within the same wide window in vwSWATH (Fig. 2A). Statistical comparison of wild type and alg3∆ yeast identified significant differences in the intensities of Gas1-N40 glycopeptides containing 8, 9, and 10 mannoses when measured by SWATH or gpvwSWATH (Fig. 3B).
However, SWATH and vwSWATH were able to detect the significant difference in abundance of the Man5GlcNAc2 glycopeptide, while gpvwSWATH could detect the difference in the Man7GlcNAc2 glycopeptide (Fig. 3B). Inspection of glycosylation at Gas1-N253 showed similar performance of the three methods. Wild type yeast has mostly Man11GlcNAc2 on Gas1-N253, followed by Man10GlcNAc2 and Man12GlcNAc2, while alg3∆ yeast accumulates Man7GlcNAc2 and Man8GlcNAc2 [23]. The performance of vwSWATH in measuring Gas1-N253 glycopeptides was poor, as no glycopeptide carrying more than 10 mannoses could be clearly distinguished. Furthermore, Man7GlcNAc2, the dominant glycan structure in alg3∆ yeast, was not confidently detected at all by vwSWATH (Fig. 3C). Although the most intense glycan structures were accurately measured by both SWATH and gpvwSWATH, SWATH failed to detect several glycopeptides with low intensity, such as Man9GlcNAc2 and Man10GlcNAc2 in alg3∆ yeast (Fig. 3C). The failure of glycopeptides to be measured in these various samples was primarily due to difficulties in accurate automated peak picking in the more complex spectra associated with wider windows. Finally, more glycopeptide forms were found to have significant differences in abundance between wild type and alg3∆ yeast when measured with gpvwSWATH than by SWATH, while no significant differences were detected using vwSWATH (Fig. 3D). In summary, we found that gpvwSWATH was the best approach for accurate and robust detection and measurement of glycopeptides, especially those with large glycans.
[Figure caption fragment: "... and wild type yeast cell wall samples, measured by the three SWATH methods. White, not significant; blue, higher in alg3∆; red, higher in wild type."]

Discussion
Improved analytical methods for detecting and measuring the abundance of glycopeptides are critical for efficient investigation of the many biological and industrial roles of protein glycosylation. DIA/SWATH glycoproteomics has been shown to be a powerful approach for analysing site-specific glycosylation in many diverse applications [23,38,50,51]. The original fixed window width implementation of SWATH was sufficient for general peptide measurement [34], but could be improved with variable window SWATH, which decreases the window size in precursor-ion dense regions of the m/z range to increase signal-to-noise and measure spectra with reduced complexity [39,40]. However, with their large covalent oligosaccharide modifications, glycopeptides have an m/z distribution distinct from that of nonmodified peptides (Fig. 1). We therefore designed a variable window SWATH method optimized for detection and quantification of glycopeptides, and showed that it demonstrated superior analytical performance (Figs. 2 and 3).
To generate a set of variable SWATH windows optimized for glycopeptide analysis we used the SWATH variable window calculator with a list of precursor masses whose MS/MS fragmentation spectra contained oxonium ions, identifying the precursor as a glycopeptide. We obtained these data from DDA analyses of wild type and glycosylation-deficient yeast cell wall samples. It would be possible to generate alternative variable window methods based on a precursor mass list generated from samples enriched in glycopeptides, from alternative glycosylation mutants, or alternative species, depending on the characteristics of the samples to be analysed. The gpvwSWATH method we used here showed superior analytical performance especially for large glycopeptides. However, some glycopeptides with smaller oligosaccharides were primarily detected with m/z precursor ions that fell in relatively wide windows, causing suboptimal measurement of these glycopeptides. While this is unfortunately an unavoidable general analytical feature of variable window SWATH methods, several alternative approaches are possible. Increasing the total number of windows will improve precursor specificity, although at the cost of reduced temporal resolution with an increased MS cycle time, potentially leading to less robust LC peak detection. Narrowing the overall m/z measurement range to exclude m/z regions with few target glycopeptides can also partially overcome these problems. The precise experimental question at hand will determine the optimal choice of method.
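The trade-off between window number and cycle time can be made concrete with a short calculation. The sketch below assumes the cycle is simply the sum of the scan times stated in the methods (0.05 s MS-TOF scan plus 0.1 s MS/MS per window) and ignores any instrument overhead. The 30 s chromatographic peak width and the alternative window counts are hypothetical values for illustration only.

```python
def cycle_time(n_windows, ms1_time=0.05, msms_time=0.1):
    """Seconds per cycle: one MS1 scan plus one MS/MS scan per window."""
    return ms1_time + n_windows * msms_time


def points_per_peak(peak_width_s, n_windows, **scan_times):
    """Approximate number of cycles sampled across an LC peak of given width."""
    return peak_width_s / cycle_time(n_windows, **scan_times)


PEAK_WIDTH_S = 30.0  # hypothetical LC peak width, seconds

for n in (34, 50, 68):
    print(f"{n} windows: cycle {cycle_time(n):.2f} s, "
          f"~{points_per_peak(PEAK_WIDTH_S, n):.1f} points per peak")
```

Under these assumptions the 34-window methods used here give a cycle of 3.45 s, and doubling the number of windows roughly halves the number of data points collected across each chromatographic peak.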
In conclusion, glycopeptide variable window SWATH is a sensible method for detection and quantification of glycopeptides with diverse glycan structures. Appropriate instrument methods can be quickly and easily generated and flexibly adjusted depending on the sample types, analytes of interest, and experimental priorities.