New Results

A validated strategy to infer protein biomarkers from RNA-Seq by combining multiple mRNA splice variants and time-delay

Rasmus Magnusson, Olof Rundquist, Min Jung Kim, Sandra Hellberg, Chan Hyun Na, Mikael Benson, David Gomez-Cabrero, Ingrid Kockum, Jesper Tegnér, Fredrik Piehl, Maja Jagodic, Johan Mellergård, Claudio Altafini, Jan Ernerudh, Maria C. Jenmalm, Colm E. Nestor, Min-Sik Kim, Mika Gustafsson
doi: https://doi.org/10.1101/599373
Rasmus Magnusson
1Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
Olof Rundquist
1Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
Min Jung Kim
2Department of Applied Chemistry, College of Applied Sciences, Kyung Hee University, Yong-in 446-701, Republic of Korea
Sandra Hellberg
3Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
Chan Hyun Na
4Department of Neurology, Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Mikael Benson
5Centre for Personalised Medicine, Linköping University, Linköping, Sweden
David Gomez-Cabrero
6Navarrabiomed, Complejo Hospitalario de Navarra, Universidad Pública de Navarra, IdiSNA, 31008 Pamplona, Spain
Ingrid Kockum
7Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institute, 171 77, Stockholm, Sweden
Jesper Tegnér
8Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955–6900, Saudi Arabia
9Unit of Computational Medicine, Department of Medicine, Solna, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
10Science for Life Laboratory, Solna, Sweden
Fredrik Piehl
7Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institute, 171 77, Stockholm, Sweden
Maja Jagodic
7Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institute, 171 77, Stockholm, Sweden
Johan Mellergård
11Department of Neurology, Linköping University, Linköping, Sweden
Claudio Altafini
12Department of Automatic Control, Linköping University, Linköping, Sweden
Jan Ernerudh
13Department of Clinical Immunology and Transfusion Medicine and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
Maria C. Jenmalm
3Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
  • For correspondence: mika.gustafsson@liu maria.jenmalm@liu.se
Colm E. Nestor
3Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
Min-Sik Kim
14Department of New Biology, Daegu Gyeongbuk Institute of Science and Technology, Daegu 711-873, Republic of Korea
Mika Gustafsson
1Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
  • For correspondence: mika.gustafsson@liu maria.jenmalm@liu.se

Abstract

Background Profiling of mRNA expression is an important method for identifying biomarkers, but it is complicated by the limited correlation between mRNA expression and protein abundance. We hypothesised that this correlation could be improved by mathematical models that account for multiple splice variants and for the time delay in protein translation.

Methods We characterised time-series of primary human naïve CD4+ T cells during early T-helper type 1 differentiation with RNA-sequencing and mass-spectrometry proteomics. We then performed computational time-series analysis in this system and in two other key human and murine immune cell types. Linear mathematical mixed time-delayed splice variant models were used to predict protein abundances, and the models were validated using out-of-sample predictions. Lastly, we re-analysed RNA-Seq datasets to evaluate biomarker discovery in five T-cell associated diseases, validating the findings for multiple sclerosis (MS) and asthma.

Results The new models demonstrated median correlations of mRNA-to-protein abundance of 0.79-0.94, significantly outperforming models that did not use multiple splice variants and time delays, as shown in cross-validation tests. Our mathematical models provided more differentially expressed proteins between patients and controls in all five diseases. Moreover, analysis of these proteins in asthma and MS supported their relevance. One marker, sCD27, was clinically validated in MS in two independent cohorts, for treatment response and for prognosis.

Conclusion Our splice variant and time-delay models substantially improved the prediction of protein abundance from mRNA data in three immune cell-types. The models provided valuable biomarker candidates, which were validated in clinical studies of MS and asthma. We propose that our strategy is generally applicable for biomarker discovery.
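
The modelling idea summarised in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model equations, so the sketch assumes that protein abundance is a linear combination of splice-variant expression levels evaluated at an earlier time point, fitted by ordinary least squares and scored with leave-one-out predictions. The function names, the synthetic data, and the grid of candidate delays are all hypothetical.

```python
import numpy as np


def delayed_design(isoforms, times, delay):
    """Design matrix: each splice variant interpolated at (t - delay), plus an intercept.

    Assumption: time points before the first sample are clamped to the first sample.
    """
    shifted = np.clip(times - delay, times[0], times[-1])
    cols = [np.interp(shifted, times, iso) for iso in isoforms]
    return np.column_stack(cols + [np.ones_like(times)])


def loo_predictions(isoforms, protein, times, delay):
    """Out-of-sample (leave-one-out) predictions from a least-squares linear model."""
    X = delayed_design(isoforms, times, delay)
    preds = np.empty_like(protein)
    for i in range(len(times)):
        keep = np.arange(len(times)) != i  # hold out time point i
        coef, *_ = np.linalg.lstsq(X[keep], protein[keep], rcond=None)
        preds[i] = X[i] @ coef
    return preds


# Synthetic, noise-free example data (hypothetical, for illustration only).
times = np.linspace(0.0, 10.0, 21)
isoforms = np.vstack([
    np.sin(times / 2.0) + 2.0,  # splice variant A
    np.cos(times / 3.0) + 2.0,  # splice variant B
])
true_delay = 1.0
protein = delayed_design(isoforms, times, true_delay)[:, :2] @ np.array([0.7, 0.3])

# Scan candidate delays and keep the one with the lowest out-of-sample error.
candidate_delays = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
errors = np.array([
    np.mean((loo_predictions(isoforms, protein, times, d) - protein) ** 2)
    for d in candidate_delays
])
best_delay = float(candidate_delays[np.argmin(errors)])
preds = loo_predictions(isoforms, protein, times, best_delay)
correlation = float(np.corrcoef(preds, protein)[0, 1])
```

Scoring on held-out time points rather than on the fitted ones mirrors the out-of-sample validation mentioned in the Methods; with real, noisy data the choice of delay and the number of splice variants per protein would need regularisation or model selection that this sketch omits.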

Footnotes

  • The format has mainly been changed from that of a Nature letter to that of a traditional research article. As a result, the paper has been significantly lengthened to give room for additional background, extra figures and discussion.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted February 21, 2020.
Subject Area

  • Bioinformatics