ABSTRACT
Unveiling the complete proteome of viruses is crucial to our understanding of the viral life cycle and of virus-host interactions. We developed Massively Parallel Ribosome Profiling (MPRP) to experimentally determine open reading frames (ORFs) in 20,170 designed oligonucleotides across 679 human-associated viral genomes. We identified 5,381 ORFs, including 4,208 non-canonical ORFs, and show successful detection of both annotated coding sequences (CDSs) and previously reported non-canonical ORFs. By examining immunopeptidome datasets of infected cells, we found class I human leukocyte antigen (HLA-I) peptides originating from non-canonical ORFs identified through MPRP. By inspecting ribosome occupancies on the 5′ UTR and CDS regions of annotated viral genes, we identified hundreds of upstream ORFs (uORFs) that negatively regulate the synthesis of canonical viral proteins. This unprecedented source of viral ORFs, spanning a wide range of viral families including highly pathogenic viruses, expands the repertoire of vaccine targets and exposes new cis-regulatory sequences in viral genomes.
Competing Interest Statement
S.W.-G., M.R.B., A.C.S., and P.C.S. are named co-inventors on a patent application related to this work filed by The Broad Institute. S.K. is now an employee of Genentech. S.A.C. is a member of the scientific advisory boards of Kymera, PTM BioLabs, Seer and PrognomIQ. P.C.S. is a co-founder of and consultant to Sherlock Biosciences and Delve Biosciences and a board member of Danaher Corporation, and holds equity in these companies. The remaining authors declare no competing interests.