
Comparison of Clustering Methods for Investigation of Genome-Wide Methylation Array Data

The use of genome-wide methylation arrays has proved very informative to investigate both clinical and biological questions in human epigenomics. The use of clustering methods either for exploration of these data or to compare to an a priori grouping, e.g., normal versus disease, allows assessment of...


Bibliographic Details
Main authors:	Clifford, Harry; Wessely, Frank; Pendurthi, Satish; Emes, Richard D.
Format:	Online Article Text
Language:	English
Published:	Frontiers Research Foundation 2011
Subjects:	Genetics
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268382/ https://www.ncbi.nlm.nih.gov/pubmed/22303382 http://dx.doi.org/10.3389/fgene.2011.00088

_version_	1782222372878680064
author	Clifford, Harry; Wessely, Frank; Pendurthi, Satish; Emes, Richard D.
author_facet	Clifford, Harry; Wessely, Frank; Pendurthi, Satish; Emes, Richard D.
author_sort	Clifford, Harry
collection	PubMed
description	The use of genome-wide methylation arrays has proved very informative to investigate both clinical and biological questions in human epigenomics. The use of clustering methods either for exploration of these data or to compare to an a priori grouping, e.g., normal versus disease, allows assessment of groupings of data without user bias. However, no consensus on the methods to use for clustering of methylation array approaches has been reached. To determine the most appropriate clustering method for analysis of Illumina array methylation data, a collection of data sets was simulated and used to compare clustering methods. Both hierarchical clustering and non-hierarchical clustering methods (k-means, k-medoids, and fuzzy clustering algorithms) were compared using a range of distance and linkage methods. As no single method consistently outperformed others across different simulations, we propose a method to capture the best clustering outcome based on an additional measure, the silhouette width. This approach produced a consistently higher cluster accuracy compared to using any one method in isolation.
format	Online Article Text
id	pubmed-3268382
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	Frontiers Research Foundation
record_format	MEDLINE/PubMed
spelling	pubmed-3268382 2012-02-02 Comparison of Clustering Methods for Investigation of Genome-Wide Methylation Array Data Clifford, Harry Wessely, Frank Pendurthi, Satish Emes, Richard D. Front Genet Genetics The use of genome-wide methylation arrays has proved very informative to investigate both clinical and biological questions in human epigenomics. The use of clustering methods either for exploration of these data or to compare to an a priori grouping, e.g., normal versus disease, allows assessment of groupings of data without user bias. However, no consensus on the methods to use for clustering of methylation array approaches has been reached. To determine the most appropriate clustering method for analysis of Illumina array methylation data, a collection of data sets was simulated and used to compare clustering methods. Both hierarchical clustering and non-hierarchical clustering methods (k-means, k-medoids, and fuzzy clustering algorithms) were compared using a range of distance and linkage methods. As no single method consistently outperformed others across different simulations, we propose a method to capture the best clustering outcome based on an additional measure, the silhouette width. This approach produced a consistently higher cluster accuracy compared to using any one method in isolation. Frontiers Research Foundation 2011-12-07 /pmc/articles/PMC3268382/ /pubmed/22303382 http://dx.doi.org/10.3389/fgene.2011.00088 Text en Copyright © 2011 Clifford, Wessely, Pendurthi and Emes. http://www.frontiersin.org/licenseagreement This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.
spellingShingle	Genetics Clifford, Harry Wessely, Frank Pendurthi, Satish Emes, Richard D. Comparison of Clustering Methods for Investigation of Genome-Wide Methylation Array Data
title	Comparison of Clustering Methods for Investigation of Genome-Wide Methylation Array Data
title_sort	comparison of clustering methods for investigation of genome-wide methylation array data
topic	Genetics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268382/ https://www.ncbi.nlm.nih.gov/pubmed/22303382 http://dx.doi.org/10.3389/fgene.2011.00088
