
A Bayesian mixture modelling approach for spatial proteomics

Analysis of the spatial sub-cellular distribution of proteins is of vital importance to fully understand context-specific protein function. Some proteins can be found at a single location within a cell, but up to half of proteins may reside in multiple locations, can dynamically re-localise, or re...


Bibliographic Details
Main Authors: Crook, Oliver M., Mulvey, Claire M., Kirk, Paul D. W., Lilley, Kathryn S., Gatto, Laurent
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6258510/
https://www.ncbi.nlm.nih.gov/pubmed/30481170
http://dx.doi.org/10.1371/journal.pcbi.1006516
author Crook, Oliver M.
Mulvey, Claire M.
Kirk, Paul D. W.
Lilley, Kathryn S.
Gatto, Laurent
collection PubMed
description Analysis of the spatial sub-cellular distribution of proteins is of vital importance to fully understand context-specific protein function. Some proteins can be found at a single location within a cell, but up to half of proteins may reside in multiple locations, can dynamically re-localise, or reside within an unknown functional compartment. These considerations lead to uncertainty in associating a protein to a single location. Currently, mass spectrometry (MS) based spatial proteomics relies on supervised machine learning algorithms to assign proteins to sub-cellular locations based on common gradient profiles. However, such methods fail to quantify the uncertainty associated with sub-cellular class assignment. Here we reformulate the framework on which we perform statistical analysis. We propose a Bayesian generative classifier based on Gaussian mixture models to assign proteins probabilistically to sub-cellular niches, so that each protein has a probability distribution over sub-cellular locations. Bayesian computation is performed using the expectation-maximisation (EM) algorithm, as well as Markov-chain Monte-Carlo (MCMC). Our methodology allows proteome-wide uncertainty quantification, thus adding a further layer to the analysis of spatial proteomics. Our framework is flexible, allowing many different systems to be analysed, and reveals new modelling opportunities for spatial proteomics. We find that our methods perform competitively with current state-of-the-art machine learning methods, whilst simultaneously providing more information. We highlight several examples where classification based on the support vector machine is unable to draw any conclusions, while uncertainty quantification using our approach provides biologically intriguing results. To our knowledge this is the first Bayesian model of MS-based spatial proteomics data.
format Online
Article
Text
id pubmed-6258510
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6258510 2018-12-06 A Bayesian mixture modelling approach for spatial proteomics Crook, Oliver M. Mulvey, Claire M. Kirk, Paul D. W. Lilley, Kathryn S. Gatto, Laurent PLoS Comput Biol Research Article Public Library of Science 2018-11-27 /pmc/articles/PMC6258510/ /pubmed/30481170 http://dx.doi.org/10.1371/journal.pcbi.1006516 Text en © 2018 Crook et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Bayesian mixture modelling approach for spatial proteomics
topic Research Article