
Joint network and node selection for pathway-based genomic data analysis

Motivation: By capturing various biochemical interactions, biological pathways provide insight into underlying biological processes. Given high-dimensional microarray or RNA-sequencing data, a critical challenge is how to integrate them with rich information from pathway databases to jointly select...


Bibliographic Details
Main Authors: Zhe, Shandian; Naqvi, Syed A. Z.; Yang, Yifan; Qi, Yuan
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3722525/
https://www.ncbi.nlm.nih.gov/pubmed/23749986
http://dx.doi.org/10.1093/bioinformatics/btt335
author Zhe, Shandian
Naqvi, Syed A. Z.
Yang, Yifan
Qi, Yuan
author_facet Zhe, Shandian
Naqvi, Syed A. Z.
Yang, Yifan
Qi, Yuan
author_sort Zhe, Shandian
collection PubMed
description Motivation: By capturing various biochemical interactions, biological pathways provide insight into underlying biological processes. Given high-dimensional microarray or RNA-sequencing data, a critical challenge is how to integrate them with rich information from pathway databases to jointly select relevant pathways and genes for phenotype prediction or disease prognosis. Addressing this challenge can help us deepen biological understanding of phenotypes and diseases from a systems perspective. Results: In this article, we propose a novel sparse Bayesian model for joint network and node selection. This model integrates information from networks (e.g. pathways) and nodes (e.g. genes) by a hybrid of conditional and generative components. For the conditional component, we propose a sparse prior based on graph Laplacian matrices, each of which encodes detailed correlation structures between network nodes. For the generative component, we use a spike and slab prior over network nodes. The integration of these two components, coupled with efficient variational inference, enables the selection of networks as well as correlated network nodes in the selected networks. Simulation results demonstrate improved predictive performance and selection accuracy of our method over alternative methods. Based on three expression datasets for cancer study and the KEGG pathway database, we selected relevant genes and pathways, many of which are supported by biological literature. In addition to pathway analysis, our method is expected to have a wide range of applications in selecting relevant groups of correlated high-dimensional biomarkers. Availability: The code can be downloaded at www.cs.purdue.edu/homes/szhe/software.html. Contact: alanqi@purdue.edu
format Online
Article
Text
id pubmed-3722525
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3722525 2013-07-25 Joint network and node selection for pathway-based genomic data analysis Zhe, Shandian Naqvi, Syed A. Z. Yang, Yifan Qi, Yuan Bioinformatics Original Papers Oxford University Press 2013-08-15 2013-06-08 /pmc/articles/PMC3722525/ /pubmed/23749986 http://dx.doi.org/10.1093/bioinformatics/btt335 Text en © The Author 2013. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Original Papers
Zhe, Shandian
Naqvi, Syed A. Z.
Yang, Yifan
Qi, Yuan
Joint network and node selection for pathway-based genomic data analysis
title Joint network and node selection for pathway-based genomic data analysis
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3722525/
https://www.ncbi.nlm.nih.gov/pubmed/23749986
http://dx.doi.org/10.1093/bioinformatics/btt335
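
Illustrative note on the description field: the abstract says the conditional component uses a sparse prior based on graph Laplacian matrices that encode correlation structure between network nodes. The sketch below is not the authors' model or code. It only shows, on a hypothetical 4-gene chain pathway and assuming the unnormalized Laplacian L = D - A (the record does not say which form the paper uses), why a Laplacian quadratic form favors similar weights on connected nodes: w^T L w equals the sum of (w_i - w_j)^2 over the edges. All names here (edges, graph_laplacian, quadratic_form) are made up for illustration.

```python
# Illustrative only: a toy pathway, not data or code from the paper.
# Undirected edges between hypothetical gene indices 0..3 (a simple chain).
edges = [(0, 1), (1, 2), (2, 3)]
n = 4


def graph_laplacian(n, edges):
    """Return the unnormalized Laplacian L = D - A as a list of rows."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0  # degree contribution for node i
        L[j][j] += 1.0  # degree contribution for node j
        L[i][j] -= 1.0  # negative adjacency entries
        L[j][i] -= 1.0
    return L


def quadratic_form(L, w):
    """Compute w^T L w, which equals the sum of (w_i - w_j)^2 over edges."""
    size = len(w)
    return sum(w[i] * L[i][j] * w[j] for i in range(size) for j in range(size))


L = graph_laplacian(n, edges)

# Weights that agree across connected genes incur no penalty;
# weights that flip sign along every edge incur a large one.
smooth = [1.0, 1.0, 1.0, 1.0]
rough = [1.0, -1.0, 1.0, -1.0]
smooth_penalty = quadratic_form(L, smooth)
rough_penalty = quadratic_form(L, rough)
print(smooth_penalty, rough_penalty)
```

A prior whose log-density contains a term proportional to -w^T L w therefore places more mass on weight vectors that vary smoothly over the pathway graph. How the paper combines this with sparsity and with the spike and slab prior over nodes is not specified in this record.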