
Cluster stability scores for microarray data in cancer studies

BACKGROUND: A potential benefit of profiling of tissue samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Hierarchical clustering has been the primary analytical tool used to define disease subtypes from microarray experiments in cancer settin...


Bibliographic Details
Main authors: Smolkin, Mark; Ghosh, Debashis
Format: Text
Language: English
Published: BioMed Central 2003
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC200969/
https://www.ncbi.nlm.nih.gov/pubmed/12959646
http://dx.doi.org/10.1186/1471-2105-4-36
_version_ 1782120941510197248
author Smolkin, Mark
Ghosh, Debashis
author_facet Smolkin, Mark
Ghosh, Debashis
author_sort Smolkin, Mark
collection PubMed
description BACKGROUND: A potential benefit of profiling of tissue samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Hierarchical clustering has been the primary analytical tool used to define disease subtypes from microarray experiments in cancer settings. Assessing cluster reliability poses a major complication in analyzing output from clustering procedures. While most work has focused on estimating the number of clusters in a dataset, the question of stability of individual-level clusters has not been addressed. RESULTS: We address this problem by developing cluster stability scores using subsampling techniques. These scores exploit the redundancy in biologically discriminatory information on the chip. Our approach is generic and can be used with any clustering method. We propose procedures for calculating cluster stability scores for situations involving both known and unknown numbers of clusters. We also develop cluster-size adjusted stability scores. The method is illustrated by application to data from three cancer studies: one involving childhood cancers, the second involving B-cell lymphoma, and the third from a malignant melanoma study. AVAILABILITY: Code implementing the proposed analytic method can be obtained at the second author's website.
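The subsampling idea in the abstract can be illustrated with a minimal sketch. It assumes that features (genes) are subsampled, the samples are reclustered on each subsample, and a cluster's score is the fraction of rounds in which it reappears exactly. The single-linkage routine, the 80% subsampling fraction, the toy data, and all function names are illustrative assumptions, not the authors' code; any clustering method could be swapped in.

```python
import random


def distance(a, b, cols):
    """Euclidean distance between rows a and b, restricted to columns cols."""
    return sum((a[j] - b[j]) ** 2 for j in cols) ** 0.5


def single_linkage(data, cols, k):
    """Merge the closest pair of clusters until k remain (single linkage).

    Returns a set of frozensets of row indices.
    """
    clusters = [frozenset([i]) for i in range(len(data))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = min(distance(data[i], data[j], cols)
                        for i in clusters[x] for j in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = clusters[x] | clusters[y]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (x, y)]
        clusters.append(merged)
    return set(clusters)


def stability_scores(data, k, n_rounds=100, frac=0.8, seed=0):
    """Score each full-data cluster by how often it recurs under feature subsampling."""
    rng = random.Random(seed)
    all_cols = list(range(len(data[0])))
    reference = single_linkage(data, all_cols, k)  # clusters from all features
    hits = {c: 0 for c in reference}
    m = max(1, int(frac * len(all_cols)))  # number of features kept per round
    for _ in range(n_rounds):
        cols = rng.sample(all_cols, m)  # subsample features without replacement
        resampled = single_linkage(data, cols, k)
        for c in reference:
            if c in resampled:  # the same set of samples forms a cluster again
                hits[c] += 1
    return {c: hits[c] / n_rounds for c in reference}


# Toy data (hypothetical): 6 samples x 5 features, two well-separated groups.
data = [
    [0.0, 0.1, 0.2, 0.1, 0.0],
    [0.1, 0.0, 0.1, 0.2, 0.1],
    [0.2, 0.1, 0.0, 0.0, 0.2],
    [10.0, 10.1, 10.2, 10.1, 10.0],
    [10.1, 10.0, 10.1, 10.2, 10.1],
    [10.2, 10.1, 10.0, 10.0, 10.2],
]
scores = stability_scores(data, k=2)
for members, score in sorted(scores.items(), key=lambda kv: min(kv[0])):
    print(sorted(members), score)
```

Because the discriminating signal in the toy data is spread redundantly across every feature, both clusters survive any subsample; clusters that depend on a handful of features would score lower.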
format Text
id pubmed-200969
institution National Center for Biotechnology Information
language English
publishDate 2003
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-200969 2003-09-30 Cluster stability scores for microarray data in cancer studies Smolkin, Mark Ghosh, Debashis BMC Bioinformatics Methodology Article BACKGROUND: A potential benefit of profiling of tissue samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Hierarchical clustering has been the primary analytical tool used to define disease subtypes from microarray experiments in cancer settings. Assessing cluster reliability poses a major complication in analyzing output from clustering procedures. While most work has focused on estimating the number of clusters in a dataset, the question of stability of individual-level clusters has not been addressed. RESULTS: We address this problem by developing cluster stability scores using subsampling techniques. These scores exploit the redundancy in biologically discriminatory information on the chip. Our approach is generic and can be used with any clustering method. We propose procedures for calculating cluster stability scores for situations involving both known and unknown numbers of clusters. We also develop cluster-size adjusted stability scores. The method is illustrated by application to data from three cancer studies: one involving childhood cancers, the second involving B-cell lymphoma, and the third from a malignant melanoma study. AVAILABILITY: Code implementing the proposed analytic method can be obtained at the second author's website. BioMed Central 2003-09-06 /pmc/articles/PMC200969/ /pubmed/12959646 http://dx.doi.org/10.1186/1471-2105-4-36 Text en Copyright © 2003 Smolkin and Ghosh; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Methodology Article
Smolkin, Mark
Ghosh, Debashis
Cluster stability scores for microarray data in cancer studies
title Cluster stability scores for microarray data in cancer studies
title_full Cluster stability scores for microarray data in cancer studies
title_fullStr Cluster stability scores for microarray data in cancer studies
title_full_unstemmed Cluster stability scores for microarray data in cancer studies
title_short Cluster stability scores for microarray data in cancer studies
title_sort cluster stability scores for microarray data in cancer studies
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC200969/
https://www.ncbi.nlm.nih.gov/pubmed/12959646
http://dx.doi.org/10.1186/1471-2105-4-36
work_keys_str_mv AT smolkinmark clusterstabilityscoresformicroarraydataincancerstudies
AT ghoshdebashis clusterstabilityscoresformicroarraydataincancerstudies