bioRxiv
New Results

A Bayesian Mixture Modelling Approach For Spatial Proteomics

Oliver M. Crook*, Claire M. Mulvey, Paul D.W. Kirk, Kathryn S. Lilley, Laurent Gatto†
doi: https://doi.org/10.1101/282269
Oliver M. Crook
1Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Cambridge, UK
2Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
3MRC Biostatistics Unit, Cambridge Institute for Public Health, Cambridge, UK
Claire M. Mulvey
2Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
Paul D.W. Kirk
3MRC Biostatistics Unit, Cambridge Institute for Public Health, Cambridge, UK
Kathryn S. Lilley
2Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
Laurent Gatto
1Computational Proteomics Unit, Department of Biochemistry, University of Cambridge, Cambridge, UK
2Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Cambridge, UK
4Current address: de Duve Institute, UCLouvain, Avenue Hippocrate 75, 1200 Brussels, Belgium

Abstract

Analysis of the spatial sub-cellular distribution of proteins is vital to fully understanding context-specific protein function. Some proteins are found in a single location within a cell, but up to half of proteins may reside in multiple locations, dynamically re-localise, or reside within an unknown functional compartment. These considerations lead to uncertainty in associating a protein with a single location. Currently, mass spectrometry (MS)-based spatial proteomics relies on supervised machine learning algorithms to assign proteins to sub-cellular locations based on common gradient profiles. However, such methods fail to quantify the uncertainty associated with sub-cellular class assignment. Here we reformulate the framework on which we perform statistical analysis. We propose a Bayesian generative classifier based on Gaussian mixture models to assign proteins probabilistically to sub-cellular niches, so that each protein has a probability distribution over sub-cellular locations. Bayesian computation is performed using the expectation-maximisation (EM) algorithm as well as Markov chain Monte Carlo (MCMC). Our methodology allows proteome-wide uncertainty quantification, adding a further layer to the analysis of spatial proteomics. Our framework is flexible, allowing many different systems to be analysed, and reveals new modelling opportunities for spatial proteomics. We find that our methods perform competitively with current state-of-the-art machine learning methods whilst providing more information. We highlight several examples where classification based on the support vector machine is unable to draw any conclusions, while uncertainty quantification using our approach provides biologically intriguing results. To our knowledge, this is the first Bayesian model of MS-based spatial proteomics data.

Author summary

Sub-cellular localisation of proteins provides insights into sub-cellular biological processes. For a protein to carry out its intended function, it must be localised to the correct sub-cellular environment, whether that be an organelle, a vesicle or any other sub-cellular niche. Correct sub-cellular localisation ensures that the biochemical conditions the protein needs to carry out its molecular function are met, and that the protein is near its intended interaction partners. Mis-localisation of proteins therefore alters cell biochemistry and can, for example, disrupt signalling pathways or inhibit the trafficking of material around the cell. The sub-cellular distribution of proteins is complicated by proteins that reside in multiple micro-environments or that move dynamically within the cell. Methods that predict protein sub-cellular localisation often fail to quantify the uncertainty that arises from the complex and dynamic nature of the sub-cellular environment. Here we present a Bayesian methodology to analyse protein sub-cellular localisation. We explicitly model our data and use Bayesian inference to quantify uncertainty in our predictions. We find that our method is competitive with state-of-the-art machine learning methods and additionally provides uncertainty quantification. We show that, with this additional information, we can gain deeper insights into the fundamental biochemistry of the cell.
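
As an illustration of what assigning proteins "probabilistically" to sub-cellular niches can mean, the following is a minimal sketch and not the authors' implementation. It assumes a deliberately simplified setting: one Gaussian with diagonal covariance per compartment, fitted from labelled marker proteins only, followed by Bayes' rule to obtain a probability distribution over compartments for an unlabelled profile. It omits the EM and MCMC inference described in the abstract, and all function names, the `min_var` floor and the toy two-fraction profiles are hypothetical.

```python
import math

def fit_class_gaussians(markers, n_classes, min_var=1e-3):
    """Estimate (weight, mean, variance) per class from labelled markers.

    markers: list of (profile, class_index) pairs; every class is assumed
    to have at least one marker. Variances are per-dimension (diagonal
    covariance) and floored at min_var, a hypothetical safeguard.
    """
    params = []
    for k in range(n_classes):
        rows = [x for x, label in markers if label == k]
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[d] for r in rows) / n for d in range(dim)]
        var = [max(sum((r[d] - mean[d]) ** 2 for r in rows) / n, min_var)
               for d in range(dim)]
        params.append((n / len(markers), mean, var))
    return params

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def posterior(x, params):
    """Posterior probability of each class for profile x via Bayes' rule."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in params]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(value - top) for value in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy marker profiles across two gradient fractions (made-up numbers).
markers = [
    ([0.9, 0.1], 0), ([0.7, 0.3], 0), ([0.8, 0.2], 0),  # compartment 0
    ([0.1, 0.9], 1), ([0.3, 0.7], 1), ([0.2, 0.8], 1),  # compartment 1
]
params = fit_class_gaussians(markers, n_classes=2)

# A profile close to compartment 0 and one lying between both compartments.
print(posterior([0.8, 0.2], params))
print(posterior([0.51, 0.49], params))
```

In this toy setting the first profile is assigned almost entirely to compartment 0, while the second receives a noticeably less decisive distribution. A hard classifier would report only the winning label for both, which is the kind of information loss that uncertainty quantification is meant to avoid.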

Footnotes

  • * omc25@cam.ac.uk

  • † lg390@cam.ac.uk

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 14, 2018.

Subject Area

  • Systems Biology