
Bayesian variable selection logistic regression with paired proteomic measurements

We explore the problem of variable selection in a case‐control setting with mass spectrometry proteomic data consisting of paired measurements. Each pair corresponds to a distinct isotope cluster, and each component within a pair represents a summary of isotopic expression based on either the intensity...

Bibliographic Details
Main Authors: Kakourou, Alexia; Mertens, Bart
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects: General Biometry
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6175404/
https://www.ncbi.nlm.nih.gov/pubmed/29943441
http://dx.doi.org/10.1002/bimj.201700182
author Kakourou, Alexia
Mertens, Bart
collection PubMed
description We explore the problem of variable selection in a case‐control setting with mass spectrometry proteomic data consisting of paired measurements. Each pair corresponds to a distinct isotope cluster, and each component within a pair represents a summary of isotopic expression based on either the intensity or the shape of the cluster. Our objective is to identify a collection of isotope clusters associated with the disease outcome and, at the same time, to assess the added predictive value of shape beyond intensity while maintaining predictive performance. We propose a Bayesian model that exploits the paired structure of our data and uses prior information on the relative predictive power of each source by introducing multiple layers of selection. This allows us to make simultaneous inference on which pairs are the most informative and for which of them, and to what extent, shape has complementary value in separating the two groups. We evaluate the Bayesian model on pancreatic cancer data. Results from the fitted model show that most of the predictive potential is achieved with a subset of just six (out of 1289) pairs, while the contribution of the intensity components is much higher than that of the shape components. To demonstrate how the method behaves in a controlled setting, we consider a simulation study. Results from this study indicate that the proposed approach can successfully select the truly predictive pairs and accurately estimate the effects of both components although, in some cases, the model tends to overestimate the inclusion probability of the second component.
format Online
Article
Text
id pubmed-6175404
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6175404 2018-10-19 Bayesian variable selection logistic regression with paired proteomic measurements Kakourou, Alexia Mertens, Bart Biom J General Biometry John Wiley and Sons Inc. 2018-06-25 2018-09 /pmc/articles/PMC6175404/ /pubmed/29943441 http://dx.doi.org/10.1002/bimj.201700182 Text en © 2018 The Authors. Biometrical Journal published by WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim.
This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Bayesian variable selection logistic regression with paired proteomic measurements
topic General Biometry
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6175404/
https://www.ncbi.nlm.nih.gov/pubmed/29943441
http://dx.doi.org/10.1002/bimj.201700182