Integration of Multiple Genomic Data Sources in a Bayesian Cox Model for Variable Selection and Prediction
Bayesian variable selection is becoming increasingly important in statistical analyses, in particular for variable selection in high dimensions. For survival time models in the presence of genomic data, this potential is still largely unexploited. One of the more recent approaches s...
Main authors: | Treppmann, Tabea; Ickstadt, Katja; Zucknick, Manuela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5554576/ https://www.ncbi.nlm.nih.gov/pubmed/28828032 http://dx.doi.org/10.1155/2017/7340565 |
_version_ | 1783256817384554496 |
---|---|
author | Treppmann, Tabea Ickstadt, Katja Zucknick, Manuela |
author_sort | Treppmann, Tabea |
collection | PubMed |
description | Bayesian variable selection is becoming increasingly important in statistical analyses, in particular for variable selection in high dimensions. For survival time models in the presence of genomic data, this potential is still largely unexploited. One of the more recent approaches suggests a Bayesian semiparametric proportional hazards model for right-censored time-to-event data. We extend this model to directly include variable selection, based on a stochastic search procedure within a Markov chain Monte Carlo sampler for inference. This gives us an intuitive and flexible approach and provides a way to integrate additional data sources and further extensions. We use parallel tempering to help improve the mixing of the Markov chains. In our examples, we use this Bayesian approach to integrate copy number variation data into a gene-expression-based survival prediction model. This is achieved by formulating an informed prior based on copy number variation. We perform a simulation study to investigate the model's behavior and prediction performance in different situations before applying it to a dataset of glioblastoma patients and evaluating the biological relevance of the findings. |
format | Online Article Text |
id | pubmed-5554576 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-55545762017-08-21 Integration of Multiple Genomic Data Sources in a Bayesian Cox Model for Variable Selection and Prediction Treppmann, Tabea Ickstadt, Katja Zucknick, Manuela Comput Math Methods Med Research Article Bayesian variable selection becomes more and more important in statistical analyses, in particular when performing variable selection in high dimensions. For survival time models and in the presence of genomic data, the state of the art is still quite unexploited. One of the more recent approaches suggests a Bayesian semiparametric proportional hazards model for right censored time-to-event data. We extend this model to directly include variable selection, based on a stochastic search procedure within a Markov chain Monte Carlo sampler for inference. This equips us with an intuitive and flexible approach and provides a way for integrating additional data sources and further extensions. We make use of the possibility of implementing parallel tempering to help improve the mixing of the Markov chains. In our examples, we use this Bayesian approach to integrate copy number variation data into a gene-expression-based survival prediction model. This is achieved by formulating an informed prior based on copy number variation. We perform a simulation study to investigate the model's behavior and prediction performance in different situations before applying it to a dataset of glioblastoma patients and evaluating the biological relevance of the findings. Hindawi 2017 2017-07-30 /pmc/articles/PMC5554576/ /pubmed/28828032 http://dx.doi.org/10.1155/2017/7340565 Text en Copyright © 2017 Tabea Treppmann et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Treppmann, Tabea Ickstadt, Katja Zucknick, Manuela Integration of Multiple Genomic Data Sources in a Bayesian Cox Model for Variable Selection and Prediction |
title | Integration of Multiple Genomic Data Sources in a Bayesian Cox Model for Variable Selection and Prediction |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5554576/ https://www.ncbi.nlm.nih.gov/pubmed/28828032 http://dx.doi.org/10.1155/2017/7340565 |
work_keys_str_mv | AT treppmanntabea integrationofmultiplegenomicdatasourcesinabayesiancoxmodelforvariableselectionandprediction AT ickstadtkatja integrationofmultiplegenomicdatasourcesinabayesiancoxmodelforvariableselectionandprediction AT zucknickmanuela integrationofmultiplegenomicdatasourcesinabayesiancoxmodelforvariableselectionandprediction |
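
The description above mentions three ingredients: a stochastic search procedure for variable selection, an informed prior based on copy number variation, and parallel tempering. The following is a minimal sketch of how such pieces are commonly written down, not the authors' implementation. It assumes a generic two-component normal mixture (spike-and-slab) prior on each regression coefficient and a standard state-swap move between tempered chains. The names `inclusion_probability`, `swap_acceptance`, `prior_pi_j`, `tau`, and `c` are hypothetical, and how a gene-specific prior probability would be derived from copy number data is not specified in this record.

```python
import math


def normal_pdf(x, sd):
    """Density at x of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def inclusion_probability(beta_j, prior_pi_j, tau=0.1, c=10.0):
    """Conditional probability that variable j is selected, given beta_j.

    Assumed prior (illustrative only):
      spike (not selected): beta_j ~ N(0, tau^2)
      slab  (selected):     beta_j ~ N(0, (c * tau)^2)
    prior_pi_j is the prior inclusion probability of variable j; an
    "informed" prior would let it differ between genes.
    """
    slab = prior_pi_j * normal_pdf(beta_j, c * tau)
    spike = (1.0 - prior_pi_j) * normal_pdf(beta_j, tau)
    return slab / (slab + spike)


def swap_acceptance(log_post_i, log_post_j, inv_temp_i, inv_temp_j):
    """Acceptance probability for swapping the states of two tempered chains.

    Chain k targets the posterior raised to the power inv_temp_k, and
    log_post_k is the untempered log posterior at chain k's current state.
    """
    log_ratio = (inv_temp_i - inv_temp_j) * (log_post_j - log_post_i)
    return math.exp(min(0.0, log_ratio))


if __name__ == "__main__":
    # Same coefficient value, two different prior inclusion probabilities.
    for prior_pi in (0.05, 0.5):
        print(prior_pi, inclusion_probability(0.5, prior_pi))
    # Cold chain (inverse temperature 1.0) holds the worse state.
    print(swap_acceptance(-10.0, -5.0, 1.0, 0.5))
```

In this sketch a larger `prior_pi_j` raises the conditional inclusion probability for the same coefficient value, which is the sense in which an external data source can inform selection. A swap that moves the higher-posterior state to the colder chain is always accepted.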