Abstract
Mass spectrometry (MS) has emerged as a powerful approach for the detection of Chinese hamster ovary (CHO) cell protein impurities in antibody drug products. The incomplete annotation of the Chinese hamster genome, however, limits the coverage of MS-based host cell protein (HCP) analysis. In this study, we performed ribosome footprint profiling (Ribo-seq) of translation initiation and elongation to refine the Chinese hamster genome annotation. Analysis of these data identified thousands of previously uncharacterised non-canonical proteoforms in CHO cells, such as N-terminally extended proteins and short open reading frames (sORFs) predicted to encode microproteins. MS-based HCP analysis of adalimumab with the extended protein sequence database resulted in the detection of CHO cell microprotein impurities in a mAb drug product for the first time. Further analysis revealed that the CHO cell microprotein population is altered over the course of cell culture and in response to a change in cell culture temperature. The annotation of non-canonical Chinese hamster proteoforms permits a more comprehensive characterisation of HCPs in antibody drug products using MS.
Highlights
Analysis of translation initiation and elongation using ribosome footprint profiling provides a refined annotation of the Chinese hamster genome.
7,769 novel Chinese hamster proteoforms were identified, including those initiating at near-cognate start codons.
941 N-terminal extensions of annotated genes were identified.
5,553 short open reading frames (sORFs) predicted to encode microproteins (i.e., proteins < 100 aa) were also characterised.
The annotation of non-canonical proteins increases the coverage of MS-based host-cell protein analysis in monoclonal antibody drug products.
8 microproteins were found in adalimumab drug product.
Transcripts annotated as non-coding can contain sORFs predicted to encode peptides (or microproteins), and these sORFs undergo changes in expression and translational regulation at reduced cell culture temperature.
95 of the novel proteoforms, of which 79 were microproteins, were subsequently identified in a second CHO K1 cell line using LC-MS/MS-based proteomics. A comparison of protein abundance revealed that 13 microproteins were differentially expressed between the exponential growth and stationary phases of cell culture.
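The two definitions used in the highlights (near-cognate start codons, and microproteins as proteins < 100 aa) can be illustrated with a minimal Python sketch. It is not the study's annotation pipeline: the function names, the simple in-frame stop-codon scan, and the choice to count the start codon towards the length are all illustrative assumptions, and real Ribo-seq ORF calling relies on footprint evidence rather than sequence alone.

```python
# Illustrative sketch only; names and logic are assumptions, not the study's method.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MICROPROTEIN_MAX_AA = 100  # highlights define microproteins as proteins < 100 aa


def near_cognate_codons(cognate="ATG"):
    """Return codons differing from the cognate start codon at exactly one position."""
    codons = set()
    for i in range(3):
        for base in "ACGT":
            if base != cognate[i]:
                codons.add(cognate[:i] + base + cognate[i + 1:])
    return codons


def orf_length_aa(seq, start):
    """Count codons from `start` up to the first in-frame stop (stop excluded).

    Returns None when no in-frame stop codon is found.
    """
    for pos in range(start, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return (pos - start) // 3
    return None


def classify_orf(seq, start=0):
    """Classify a candidate ORF by start codon type and predicted length."""
    seq = seq.upper()
    codon = seq[start:start + 3]
    if codon == "ATG":
        start_type = "cognate"
    elif codon in near_cognate_codons():
        start_type = "near-cognate"
    else:
        return None  # not a start codon considered in this sketch
    length = orf_length_aa(seq, start)
    if length is None:
        return None  # no in-frame stop codon, so no complete ORF
    return {
        "start_codon": codon,
        "start_type": start_type,
        "length_aa": length,
        "is_microprotein": length < MICROPROTEIN_MAX_AA,
    }
```

For example, `classify_orf("CTG" + "GCT" * 4 + "TAA")` describes a five-codon ORF with a near-cognate CTG start, which falls under the microprotein length threshold in this sketch.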
Competing Interest Statement
MCR, IT, PK, CT, FG, LS, BLK, MC, NB, JB and CC declare no competing interests. LZ is an employee of Pfizer Inc.
Footnotes
Abbreviations
- CDS: coding sequence
- CHO: Chinese hamster ovary
- CHX: cycloheximide
- Harr: Harringtonine
- HCP: host cell protein
- mAb: monoclonal antibody
- NGS: Next generation sequencing
- NTS: non-temperature shifted
- ORF: open reading frame
- ouORF: overlapping upstream ORF
- PAGE: polyacrylamide gel electrophoresis
- Ribo-seq: Ribosome footprint profiling
- RPF: Ribosome protected fragment
- RPKM: Reads per kilobase per million mapped reads
- sORF: short open reading frame
- TS: temperature shifted
- TE: Translational efficiency
- uORF: upstream open reading frame
- UTR: untranslated region
- BPM: Bins per million
- AGC: Automatic Gain Control
- GO: Gene Ontology
- LFQ: Label Free Quantification
- DDA: Data Dependent Acquisition
- IT: Injection Time