Cluster stability scores for microarray data in cancer studies
BACKGROUND: A potential benefit of profiling of tissue samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Hierarchical clustering has been the primary analytical tool used to define disease subtypes from microarray experiments in cancer settin...
Main authors: | Smolkin, Mark, Ghosh, Debashis |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2003 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC200969/ https://www.ncbi.nlm.nih.gov/pubmed/12959646 http://dx.doi.org/10.1186/1471-2105-4-36 |
_version_ | 1782120941510197248 |
---|---|
author | Smolkin, Mark Ghosh, Debashis |
author_facet | Smolkin, Mark Ghosh, Debashis |
author_sort | Smolkin, Mark |
collection | PubMed |
description | BACKGROUND: A potential benefit of profiling of tissue samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Hierarchical clustering has been the primary analytical tool used to define disease subtypes from microarray experiments in cancer settings. Assessing cluster reliability poses a major complication in analyzing output from clustering procedures. While most work has focused on estimating the number of clusters in a dataset, the question of stability of individual-level clusters has not been addressed. RESULTS: We address this problem by developing cluster stability scores using subsampling techniques. These scores exploit the redundancy in biologically discriminatory information on the chip. Our approach is generic and can be used with any clustering method. We propose procedures for calculating cluster stability scores for situations involving both known and unknown numbers of clusters. We also develop cluster-size adjusted stability scores. The method is illustrated by application to data from three cancer studies: one involving childhood cancers, the second involving B-cell lymphoma, and the third from a malignant melanoma study. AVAILABILITY: Code implementing the proposed analytic method can be obtained at the second author's website. |
format | Text |
id | pubmed-200969 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2003 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-200969 2003-09-30 Cluster stability scores for microarray data in cancer studies Smolkin, Mark Ghosh, Debashis BMC Bioinformatics Methodology Article BACKGROUND: A potential benefit of profiling of tissue samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Hierarchical clustering has been the primary analytical tool used to define disease subtypes from microarray experiments in cancer settings. Assessing cluster reliability poses a major complication in analyzing output from clustering procedures. While most work has focused on estimating the number of clusters in a dataset, the question of stability of individual-level clusters has not been addressed. RESULTS: We address this problem by developing cluster stability scores using subsampling techniques. These scores exploit the redundancy in biologically discriminatory information on the chip. Our approach is generic and can be used with any clustering method. We propose procedures for calculating cluster stability scores for situations involving both known and unknown numbers of clusters. We also develop cluster-size adjusted stability scores. The method is illustrated by application to data from three cancer studies: one involving childhood cancers, the second involving B-cell lymphoma, and the third from a malignant melanoma study. AVAILABILITY: Code implementing the proposed analytic method can be obtained at the second author's website. BioMed Central 2003-09-06 /pmc/articles/PMC200969/ /pubmed/12959646 http://dx.doi.org/10.1186/1471-2105-4-36 Text en Copyright © 2003 Smolkin and Ghosh; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL. |
spellingShingle | Methodology Article Smolkin, Mark Ghosh, Debashis Cluster stability scores for microarray data in cancer studies |
title | Cluster stability scores for microarray data in cancer studies |
title_full | Cluster stability scores for microarray data in cancer studies |
title_fullStr | Cluster stability scores for microarray data in cancer studies |
title_full_unstemmed | Cluster stability scores for microarray data in cancer studies |
title_short | Cluster stability scores for microarray data in cancer studies |
title_sort | cluster stability scores for microarray data in cancer studies |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC200969/ https://www.ncbi.nlm.nih.gov/pubmed/12959646 http://dx.doi.org/10.1186/1471-2105-4-36 |
work_keys_str_mv | AT smolkinmark clusterstabilityscoresformicroarraydataincancerstudies AT ghoshdebashis clusterstabilityscoresformicroarraydataincancerstudies |
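
The abstract describes cluster stability scores obtained by subsampling and usable with any clustering method. The following is a minimal sketch of that general idea, not the authors' code: it assumes that features (genes) are subsampled, that a cluster counts as recovered only when exactly the same set of samples reappears, and that a small single-linkage routine stands in for the clustering method. The names `single_linkage`, `stability_scores`, `n_sub`, and `frac` are hypothetical, and the cluster-size adjustment mentioned in the abstract is not attempted.

```python
import random
from itertools import combinations

def sq_dist(a, b, cols):
    # Squared Euclidean distance between two samples, restricted to feature columns `cols`.
    return sum((a[j] - b[j]) ** 2 for j in cols)

def single_linkage(data, cols, k):
    # Stand-in clustering method (an assumption): agglomerative single linkage
    # that merges the two closest clusters until k clusters remain.
    clusters = [frozenset([i]) for i in range(len(data))]
    while len(clusters) > k:
        best = min(
            combinations(clusters, 2),
            key=lambda p: min(sq_dist(data[i], data[j], cols) for i in p[0] for j in p[1]),
        )
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return set(clusters)

def stability_scores(data, k, n_sub=100, frac=0.8, seed=0, cluster_fn=single_linkage):
    # Cluster on all features, then recluster on random feature subsets and
    # record how often each reference cluster reappears with identical members.
    rng = random.Random(seed)
    n_features = len(data[0])
    reference = cluster_fn(data, range(n_features), k)
    counts = {c: 0 for c in reference}
    m = max(1, int(frac * n_features))  # number of features kept per subsample
    for _ in range(n_sub):
        cols = rng.sample(range(n_features), m)
        found = cluster_fn(data, cols, k)
        for c in reference:
            if c in found:
                counts[c] += 1
    return {c: counts[c] / n_sub for c in reference}

# Toy data (fabricated for illustration only): two well-separated groups of
# three samples, each measured on ten features.
data = [[g * 10 + 0.1 * i + 0.01 * j for j in range(10)] for g in (0, 1) for i in range(3)]
scores = stability_scores(data, k=2)
for members, score in sorted(scores.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(members), score)
```

Because the two toy groups are far apart on every feature, each reference cluster is recovered in every subsample and receives a score of 1.0. Real expression data would usually give scores below 1 for less reliable clusters. Any other clustering routine with the same `(data, cols, k)` signature can be passed as `cluster_fn`.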