Abstract
Electrospray ionization is a powerful and prevalent technique used to ionize analytes in mass spectrometry. The distribution of charges that an analyte receives (charge state distribution, CSD) is an important consideration for interpreting mass spectra. However, because the ionization mechanism is incompletely understood, the analyte properties that influence CSDs remain unclear. Here, we employ a machine learning-based high-throughput approach to analyze the CSDs of hundreds of thousands of peptides. Interestingly, half of the peptides exhibit charges that differ from the naive expectation, namely the number of basic sites. We find that these peptides can be classified into two regimes (undercharging and overcharging) and that these two regimes display markedly different charging characteristics. Strikingly, peptides in the overcharging regime show minimal dependence on basic site count, and more generally, the two regimes exhibit distinct sequence determinants. These findings highlight the rich ionization behavior of peptides and the potential of CSDs for enhancing peptide identification.
Competing Interest Statement
The authors have declared no competing interest.