Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles

BACKGROUND: The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend t...

Bibliographic Details
Main authors: Tien, Yin-Jing; Lee, Yun-Shien; Wu, Han-Ming; Chen, Chun-Houh
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2322988/
https://www.ncbi.nlm.nih.gov/pubmed/18366693
http://dx.doi.org/10.1186/1471-2105-9-155
author Tien, Yin-Jing
Lee, Yun-Shien
Wu, Han-Ming
Chen, Chun-Houh
collection PubMed
description BACKGROUND: The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. RESULTS: This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purposes) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. CONCLUSION: We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties but also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at .
format Text
id pubmed-2322988
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2322988 2008-04-18 Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles Tien, Yin-Jing Lee, Yun-Shien Wu, Han-Ming Chen, Chun-Houh BMC Bioinformatics Methodology Article
BioMed Central 2008-03-20 /pmc/articles/PMC2322988/ /pubmed/18366693 http://dx.doi.org/10.1186/1471-2105-9-155 Text en Copyright © 2008 Tien et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2322988/
https://www.ncbi.nlm.nih.gov/pubmed/18366693
http://dx.doi.org/10.1186/1471-2105-9-155
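
The flipping mechanism summarized in the description field can be illustrated with a small sketch. This is not the authors' implementation: the nested-tuple tree, the `flip_by_reference` helper, the mean-rank comparison rule, and the toy `ref_rank` values are all assumptions made for illustration, and the reference ranking merely stands in for an R2E seriation, which is not computed here. The idea shown is that flipping the two children of a dendrogram node never changes the clusters, so an external ordering can be used to choose among the tree-compatible leaf permutations.

```python
from statistics import mean


def leaves(tree):
    """Return the leaf labels of a nested-tuple binary tree, left to right."""
    if not isinstance(tree, tuple):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def flip_by_reference(tree, ref_rank):
    """Reorder the children of every internal node by reference rank.

    Assumed rule (illustrative only): at each node, the subtree whose
    leaves have the smaller mean reference rank is placed on the left.
    Cluster membership is untouched; only the left/right order changes.
    """
    if not isinstance(tree, tuple):
        return tree
    left = flip_by_reference(tree[0], ref_rank)
    right = flip_by_reference(tree[1], ref_rank)
    left_score = mean([ref_rank[leaf] for leaf in leaves(left)])
    right_score = mean([ref_rank[leaf] for leaf in leaves(right)])
    return (left, right) if left_score <= right_score else (right, left)


# Toy dendrogram over five genes (hypothetical labels).
tree = (("g4", "g3"), (("g1", "g0"), "g2"))

# Hypothetical external reference ranking, standing in for an R2E seriation.
ref_rank = {"g0": 0, "g1": 1, "g2": 2, "g3": 3, "g4": 4}

flipped = flip_by_reference(tree, ref_rank)
print(leaves(tree))     # leaf order before flipping
print(leaves(flipped))  # leaf order after flipping
```

In this toy case the clusters {g3, g4} and {g0, g1, g2} remain intact while the leaf order is brought into agreement with the reference ranking, which is the sense in which local coherence and a global trend can be combined.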