
Selection of informative clusters from hierarchical cluster tree with gene classes

BACKGROUND: A common clustering method in the analysis of gene expression data has been hierarchical clustering. Usually the analysis involves selection of clusters by cutting the tree at a suitable level and/or analysis of a sorted gene list that is obtained with the tree. Cutting of the hierarchic...


Bibliographic Details
Main author: Toronen, Petri
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC407846/
https://www.ncbi.nlm.nih.gov/pubmed/15043761
http://dx.doi.org/10.1186/1471-2105-5-32
author Toronen, Petri
collection PubMed
description BACKGROUND: Hierarchical clustering is a common method in the analysis of gene expression data. The analysis usually involves selecting clusters by cutting the tree at a suitable level and/or analyzing a sorted gene list obtained from the tree. Cutting the hierarchical tree requires choosing a suitable level and loses the information at the other levels. Sorted gene lists depend on the method used to sort the joined clusters. The author proposes that clusters should instead be selected using gene classifications. RESULTS: This article presents a simple method for searching a cluster tree for the clusters with the strongest enrichment of gene classes. The clusters found are presented in their estimated order of importance. The method is demonstrated with a yeast gene expression data set and two database classifications. The clusters obtained showed very strong enrichment of functional classes. They reproduce gene groups similar to those observed in the original analysis of the data set, and also reveal many gene groups that were not reported there. Visualization of the results on top of the cluster tree shows that the method finds informative clusters at several levels of the tree and indicates that these clusters could not have been obtained by simply cutting the tree. The results were also used to compare cluster trees from different clustering methods. CONCLUSION: The presented method should facilitate the exploratory analysis of large data sets when associated categorical data are available.
format Text
id pubmed-407846
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-407846 2004-05-15 Selection of informative clusters from hierarchical cluster tree with gene classes Toronen, Petri BMC Bioinformatics Research Article
BioMed Central 2004-03-25 /pmc/articles/PMC407846/ /pubmed/15043761 http://dx.doi.org/10.1186/1471-2105-5-32 Text en Copyright © 2004 Toronen; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title Selection of informative clusters from hierarchical cluster tree with gene classes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC407846/
https://www.ncbi.nlm.nih.gov/pubmed/15043761
http://dx.doi.org/10.1186/1471-2105-5-32
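The abstract describes scoring every cluster in a hierarchical tree by its enrichment of gene classes and listing the best clusters in order of importance. The following is only a minimal sketch of that idea, not the author's implementation. It assumes a one-sided hypergeometric tail probability as the enrichment score (the record does not say which statistic the article uses), a hypothetical tree encoded as nested 2-tuples of gene names, and made-up gene and class names (`g1`…`g8`, `ribosome`, `stress`).

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the class."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


def leaves(node):
    """Return the set of gene names under a node (a str leaf or a 2-tuple)."""
    if isinstance(node, str):
        return {node}
    left, right = node
    return leaves(left) | leaves(right)


def internal_nodes(node):
    """Yield every internal node (candidate cluster) of the tree."""
    if isinstance(node, str):
        return
    yield node
    for child in node:
        yield from internal_nodes(child)


def rank_clusters(tree, classes):
    """Score every (cluster, class) pair and sort by ascending p-value."""
    all_genes = leaves(tree)
    N = len(all_genes)
    scored = []
    for node in internal_nodes(tree):
        members = leaves(node)
        if len(members) == N:
            continue  # the root contains every gene, so it is never informative
        for name, genes in classes.items():
            class_genes = genes & all_genes
            k = len(members & class_genes)
            if k == 0:
                continue  # no overlap, nothing to score
            p = hypergeom_tail(k, len(members), len(class_genes), N)
            scored.append((p, name, sorted(members)))
    scored.sort()
    return scored


# Hypothetical toy tree and classes, for illustration only.
tree = ((("g1", "g2"), "g3"), (("g4", "g5"), ("g6", ("g7", "g8"))))
classes = {"ribosome": {"g1", "g2", "g3"}, "stress": {"g6", "g7", "g8"}}

ranked = rank_clusters(tree, classes)
for p, name, members in ranked[:3]:
    print(f"{name:10s} p={p:.4f} {members}")
```

In this toy tree the two top-ranked clusters sit at different depths, so no single cut level would return both, which is the situation the abstract points to. A real analysis would also need some correction for testing many clusters against many classes; that is omitted here.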