Response projected clustering for direct association with physiological and clinical response data

BACKGROUND: Microarray gene expression data are often analyzed together with corresponding physiological response and clinical metadata of biological subjects, e.g. patients' residual tumor sizes after chemotherapy or glucose levels at various stages of diabetic patients. Current clustering ana...

Bibliographic Details
Main authors: Yi, Sung-Gon, Park, Taesung, Lee, Jae K
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2275250/
https://www.ncbi.nlm.nih.gov/pubmed/18237428
http://dx.doi.org/10.1186/1471-2105-9-76
author Yi, Sung-Gon
Park, Taesung
Lee, Jae K
collection PubMed
description BACKGROUND: Microarray gene expression data are often analyzed together with corresponding physiological response and clinical metadata of biological subjects, e.g. patients' residual tumor sizes after chemotherapy or glucose levels at various stages of diabetic patients. Current clustering analysis cannot directly incorporate such quantitative metadata into the clustering heatmap of gene expression. It will be quite useful if these clinical response data can be effectively summarized in the high-dimensional clustering display so that important groups of genes can be intuitively discovered with different degrees of relevance to target disease phenotypes.

RESULTS: We introduced a novel clustering analysis approach, response projected clustering (RPC), which uses a high-dimensional geometrical projection of response data to the gene expression space. The projected response vector, which becomes the origin in the projected space, is then clustered together with the projected gene vectors based on their different degrees of association with the response vector. A bootstrap-counting based RPC analysis is also performed to evaluate statistical tightness of identified gene clusters. Our RPC analysis was applied to the in vitro growth-inhibition and microarray profiling data on the NCI-60 cancer cell lines and the microarray gene expression study of macrophage differentiation in atherogenesis. These RPC applications enabled us to identify many known and novel gene factors and their potential pathway associations which are highly relevant to the drug's chemosensitivity activities and atherogenesis.

CONCLUSION: We have shown that RPC can effectively discover gene networks with different degrees of association with clinical metadata. Performed on each gene's response projected vector based on its degree of association with the response data, RPC effectively summarizes individual genes' association with metadata as well as their own expression patterns. Thus, RPC greatly enhances the utility of clustering analysis on investigating high-dimensional microarray gene expression data with quantitative metadata.
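
The abstract does not give the projection or the bootstrap-counting procedure in detail, so the following is only a loose illustrative sketch, not the authors' RPC algorithm. It assumes each gene profile and the response are centered and scaled to unit norm, that "the response becomes the origin" can be mimicked by subtracting the standardized response from every gene vector, and that tightness can be approximated by counting how often a gene lands among the `top_k` genes nearest the origin across bootstrap resamples of the samples. All function names, `top_k`, `n_boot`, and the toy data are hypothetical.

```python
import numpy as np


def standardize_rows(a):
    """Center each row and scale it to unit Euclidean norm."""
    a = np.asarray(a, dtype=float)
    centered = a - a.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # constant rows stay as zero vectors
    return centered / norms


def response_centered(expr, response):
    """Shift standardized gene vectors so the response sits at the origin.

    Assumption: a plain translation stands in for the paper's projection.
    """
    genes = standardize_rows(expr)
    resp = standardize_rows(np.asarray(response, dtype=float)[None, :])[0]
    return genes - resp


def distance_to_response(expr, response):
    """Distance of each shifted gene vector from the origin (the response)."""
    return np.linalg.norm(response_centered(expr, response), axis=1)


def bootstrap_counts(expr, response, top_k, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each gene is among the
    top_k genes closest to the response (a hypothetical counting scheme)."""
    expr = np.asarray(expr, dtype=float)
    response = np.asarray(response, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes, n_samples = expr.shape
    counts = np.zeros(n_genes, dtype=int)
    for _ in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)  # resample samples
        dist = distance_to_response(expr[:, idx], response[idx])
        counts[np.argsort(dist)[:top_k]] += 1
    return counts / n_boot


# Toy data (fabricated for illustration only): 50 genes, 30 samples,
# with the first 5 genes constructed to track the response.
toy_rng = np.random.default_rng(1)
response = toy_rng.normal(size=30)
expr = toy_rng.normal(size=(50, 30))
expr[:5] += 3.0 * response

dist = distance_to_response(expr, response)
freq = bootstrap_counts(expr, response, top_k=5)
```

Under these assumptions the squared distance to the origin equals 2 − 2r, where r is the Pearson correlation between a gene and the response, so genes near the origin are the ones most strongly associated with the response. A full analysis in the spirit of the abstract would then cluster the shifted vectors together with the origin rather than only ranking them.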
format Text
id pubmed-2275250
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2275250 2008-03-26 Response projected clustering for direct association with physiological and clinical response data Yi, Sung-Gon Park, Taesung Lee, Jae K BMC Bioinformatics Methodology Article BioMed Central 2008-01-31 /pmc/articles/PMC2275250/ /pubmed/18237428 http://dx.doi.org/10.1186/1471-2105-9-76 Text en Copyright © 2008 Yi et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Response projected clustering for direct association with physiological and clinical response data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2275250/
https://www.ncbi.nlm.nih.gov/pubmed/18237428
http://dx.doi.org/10.1186/1471-2105-9-76