Detection of Significant Groups in Hierarchical Clustering by Resampling

Hierarchical clustering is a simple and reproducible technique to rearrange data of multiple variables and sample units and visualize possible groups in the data. Despite the name, hierarchical clustering does not provide clusters automatically, and “tree-cutting” procedures are often used to identi...
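The abstract describes choosing dendrogram cut-points from a reference distribution of branch-point heights. The sketch below is one simple way to illustrate that idea, not the authors' implementation. The null model (permuting each variable independently), single linkage, the pooled 95% quantile, the toy data, and all function names are assumptions made for this illustration.

```python
import math
import random


def merge_heights(points):
    """Return the n-1 single-linkage merge heights in merge order.

    Naive agglomeration, adequate only for small illustrative data sets.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        heights.append(d)
    return heights


def reference_heights(points, n_resamples, rng):
    """Pool merge heights from data sets whose columns are shuffled independently.

    Assumption: shuffling each variable on its own stands in for "no joint
    group structure"; the paper's resampling scheme may differ.
    """
    columns = [list(col) for col in zip(*points)]
    pooled = []
    for _ in range(n_resamples):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        pooled.extend(merge_heights(list(zip(*shuffled))))
    return sorted(pooled)


def n_significant_groups(points, alpha=0.05, n_resamples=50, seed=0):
    """Cut at the (1 - alpha) quantile of the pooled reference heights."""
    ref = reference_heights(points, n_resamples, random.Random(seed))
    index = min(len(ref) - 1, math.ceil((1 - alpha) * len(ref)) - 1)
    threshold = ref[index]
    observed = merge_heights(points)
    # Each observed merge above the threshold separates one more group.
    return 1 + sum(h > threshold for h in observed), threshold


# Toy data (hypothetical): two tight 4x4 grids of points, far apart.
grid = [(0.2 * i, 0.2 * j) for i in range(4) for j in range(4)]
points = grid + [(x + 10.0, y + 10.0) for x, y in grid]

groups, threshold = n_significant_groups(points)
print(groups, round(threshold, 2))
```

Observed merges higher than the reference threshold are treated as separating distinct groups. A real analysis would need a linkage method, null model, and significance level suited to the data.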

Bibliographic Details
Main authors: Sebastiani, Paola; Perls, Thomas T.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2016
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4976109/
https://www.ncbi.nlm.nih.gov/pubmed/27551289
http://dx.doi.org/10.3389/fgene.2016.00144
_version_ 1782446809761710080
author Sebastiani, Paola
Perls, Thomas T.
author_facet Sebastiani, Paola
Perls, Thomas T.
author_sort Sebastiani, Paola
collection PubMed
description Hierarchical clustering is a simple and reproducible technique to rearrange data of multiple variables and sample units and visualize possible groups in the data. Despite the name, hierarchical clustering does not provide clusters automatically, and “tree-cutting” procedures are often used to identify subgroups in the data by cutting the dendrogram that represents the similarities among groups used in the agglomerative procedure. We introduce a resampling-based technique that can be used to identify cut-points of a dendrogram with a significance level based on a reference distribution for the heights of the branch points. The evaluation on synthetic data shows that the technique is robust in a variety of situations. An example with real biomarker data from the Long Life Family Study shows the usefulness of the method.
format Online
Article
Text
id pubmed-4976109
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-49761092016-08-22 Detection of Significant Groups in Hierarchical Clustering by Resampling Sebastiani, Paola Perls, Thomas T. Front Genet Genetics Hierarchical clustering is a simple and reproducible technique to rearrange data of multiple variables and sample units and visualize possible groups in the data. Despite the name, hierarchical clustering does not provide clusters automatically, and “tree-cutting” procedures are often used to identify subgroups in the data by cutting the dendrogram that represents the similarities among groups used in the agglomerative procedure. We introduce a resampling-based technique that can be used to identify cut-points of a dendrogram with a significance level based on a reference distribution for the heights of the branch points. The evaluation on synthetic data shows that the technique is robust in a variety of situations. An example with real biomarker data from the Long Life Family Study shows the usefulness of the method. Frontiers Media S.A. 2016-08-08 /pmc/articles/PMC4976109/ /pubmed/27551289 http://dx.doi.org/10.3389/fgene.2016.00144 Text en Copyright © 2016 Sebastiani and Perls. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Sebastiani, Paola
Perls, Thomas T.
Detection of Significant Groups in Hierarchical Clustering by Resampling
title Detection of Significant Groups in Hierarchical Clustering by Resampling
title_full Detection of Significant Groups in Hierarchical Clustering by Resampling
title_fullStr Detection of Significant Groups in Hierarchical Clustering by Resampling
title_full_unstemmed Detection of Significant Groups in Hierarchical Clustering by Resampling
title_short Detection of Significant Groups in Hierarchical Clustering by Resampling
title_sort detection of significant groups in hierarchical clustering by resampling
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4976109/
https://www.ncbi.nlm.nih.gov/pubmed/27551289
http://dx.doi.org/10.3389/fgene.2016.00144
work_keys_str_mv AT sebastianipaola detectionofsignificantgroupsinhierarchicalclusteringbyresampling
AT perlsthomast detectionofsignificantgroupsinhierarchicalclusteringbyresampling