Integration of Multiple Genomic Data Sources in a Bayesian Cox Model for Variable Selection and Prediction

Bayesian variable selection becomes more and more important in statistical analyses, in particular when performing variable selection in high dimensions. For survival time models and in the presence of genomic data, the state of the art is still quite unexploited. One of the more recent approaches s...

Bibliographic Details
Main authors: Treppmann, Tabea, Ickstadt, Katja, Zucknick, Manuela
Format: Online Article Text
Language: English
Published: Hindawi 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5554576/
https://www.ncbi.nlm.nih.gov/pubmed/28828032
http://dx.doi.org/10.1155/2017/7340565
author Treppmann, Tabea
Ickstadt, Katja
Zucknick, Manuela
collection PubMed
description Bayesian variable selection becomes more and more important in statistical analyses, in particular when performing variable selection in high dimensions. For survival time models and in the presence of genomic data, the state of the art is still quite unexploited. One of the more recent approaches suggests a Bayesian semiparametric proportional hazards model for right censored time-to-event data. We extend this model to directly include variable selection, based on a stochastic search procedure within a Markov chain Monte Carlo sampler for inference. This equips us with an intuitive and flexible approach and provides a way for integrating additional data sources and further extensions. We make use of the possibility of implementing parallel tempering to help improve the mixing of the Markov chains. In our examples, we use this Bayesian approach to integrate copy number variation data into a gene-expression-based survival prediction model. This is achieved by formulating an informed prior based on copy number variation. We perform a simulation study to investigate the model's behavior and prediction performance in different situations before applying it to a dataset of glioblastoma patients and evaluating the biological relevance of the findings.
format Online
Article
Text
id pubmed-5554576
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-5554576 2017-08-21 Comput Math Methods Med Research Article Hindawi 2017 2017-07-30 Text en Copyright © 2017 Tabea Treppmann et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Integration of Multiple Genomic Data Sources in a Bayesian Cox Model for Variable Selection and Prediction
topic Research Article