Visualization methods for statistical analysis of microarray clusters

BACKGROUND: The most common method of identifying groups of functionally related genes in microarray data is to apply a clustering algorithm. However, it is impossible to determine which clustering algorithm is most appropriate to apply, and it is difficult to verify the results of any algorithm due...

Bibliographic Details
Main Authors: Hibbs, Matthew A, Dirksen, Nathaniel C, Li, Kai, Troyanskaya, Olga G
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1156867/
https://www.ncbi.nlm.nih.gov/pubmed/15890080
http://dx.doi.org/10.1186/1471-2105-6-115
author Hibbs, Matthew A
Dirksen, Nathaniel C
Li, Kai
Troyanskaya, Olga G
collection PubMed
description BACKGROUND: The most common method of identifying groups of functionally related genes in microarray data is to apply a clustering algorithm. However, it is impossible to determine which clustering algorithm is most appropriate to apply, and it is difficult to verify the results of any algorithm due to the lack of a gold-standard. Appropriate data visualization tools can aid this analysis process, but existing visualization methods do not specifically address this issue. RESULTS: We present several visualization techniques that incorporate meaningful statistics that are noise-robust for the purpose of analyzing the results of clustering algorithms on microarray data. This includes a rank-based visualization method that is more robust to noise, a difference display method to aid assessments of cluster quality and detection of outliers, and a projection of high dimensional data into a three dimensional space in order to examine relationships between clusters. Our methods are interactive and are dynamically linked together for comprehensive analysis. Further, our approach applies to both protein and gene expression microarrays, and our architecture is scalable for use on both desktop/laptop screens and large-scale display devices. This methodology is implemented in GeneVAnD (Genomic Visual ANalysis of Datasets) and is available at . CONCLUSION: Incorporating relevant statistical information into data visualizations is key for analysis of large biological datasets, particularly because of high levels of noise and the lack of a gold-standard for comparisons. We developed several new visualization techniques and demonstrated their effectiveness for evaluating cluster quality and relationships between clusters.
format Text
id pubmed-1156867
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1156867 2005-06-22 Visualization methods for statistical analysis of microarray clusters Hibbs, Matthew A Dirksen, Nathaniel C Li, Kai Troyanskaya, Olga G BMC Bioinformatics Methodology Article BioMed Central 2005-05-12 /pmc/articles/PMC1156867/ /pubmed/15890080 http://dx.doi.org/10.1186/1471-2105-6-115 Text en Copyright © 2005 Hibbs et al; licensee BioMed Central Ltd.
title Visualization methods for statistical analysis of microarray clusters
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1156867/
https://www.ncbi.nlm.nih.gov/pubmed/15890080
http://dx.doi.org/10.1186/1471-2105-6-115