Knowledge-guided multi-scale independent component analysis for biomarker identification

Bibliographic Details
Main Authors: Chen, Li; Xuan, Jianhua; Wang, Chen; Shih, Ie-Ming; Wang, Yue; Zhang, Zhen; Hoffman, Eric; Clarke, Robert
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2576264/
https://www.ncbi.nlm.nih.gov/pubmed/18837990
http://dx.doi.org/10.1186/1471-2105-9-416
author Chen, Li
Xuan, Jianhua
Wang, Chen
Shih, Ie-Ming
Wang, Yue
Zhang, Zhen
Hoffman, Eric
Clarke, Robert
author_sort Chen, Li
collection PubMed
description BACKGROUND: Many statistical methods have been proposed to identify disease biomarkers from gene expression profiles. However, from gene expression profile data alone, statistical methods often fail to identify biologically meaningful biomarkers related to a specific disease under study. In this paper, we develop a novel strategy, namely knowledge-guided multi-scale independent component analysis (ICA), to first infer regulatory signals and then identify biologically relevant biomarkers from microarray data. RESULTS: Since gene expression levels reflect the joint effect of several underlying biological functions, disease-specific biomarkers may be involved in several distinct biological functions. To identify disease-specific biomarkers that provide unique mechanistic insights, a meta-data "knowledge gene pool" (KGP) is first constructed from multiple data sources to provide important information on the likely functions (such as gene ontology information) and regulatory events (such as promoter responsive elements) associated with potential genes of interest. The gene expression and biological meta data associated with the members of the KGP can then be used to guide subsequent analysis. ICA is then applied to multi-scale gene clusters to reveal regulatory modes reflecting the underlying biological mechanisms. Finally disease-specific biomarkers are extracted by their weighted connectivity scores associated with the extracted regulatory modes. A statistical significance test is used to evaluate the significance of transcription factor enrichment for the extracted gene set based on motif information. We applied the proposed method to yeast cell cycle microarray data and Rsf-1-induced ovarian cancer microarray data. The results show that our knowledge-guided ICA approach can extract biologically meaningful regulatory modes and outperform several baseline methods for biomarker identification. 
CONCLUSION: We have proposed a novel method, namely knowledge-guided multi-scale ICA, to identify disease-specific biomarkers. The goal is to infer knowledge-relevant regulatory signals and then identify corresponding biomarkers through a multi-scale strategy. The approach has been successfully applied to two expression profiling experiments to demonstrate its improved performance in extracting biologically meaningful and disease-related biomarkers. More importantly, the proposed approach shows promising results to infer novel biomarkers for ovarian cancer and extend current knowledge.
format Text
id pubmed-2576264
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2576264 2008-10-31 Knowledge-guided multi-scale independent component analysis for biomarker identification. BMC Bioinformatics. Methodology Article. BioMed Central 2008-10-06 /pmc/articles/PMC2576264/ /pubmed/18837990 http://dx.doi.org/10.1186/1471-2105-9-416 Text en Copyright © 2008 Chen et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Knowledge-guided multi-scale independent component analysis for biomarker identification
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2576264/
https://www.ncbi.nlm.nih.gov/pubmed/18837990
http://dx.doi.org/10.1186/1471-2105-9-416