Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering
Agglomerative hierarchical clustering becomes infeasible when applied to large datasets due to its O(N²) storage requirements. We present a multi-stage agglomerative hierarchical clustering (MAHC) approach aimed at large datasets of speech segments. The algorithm is based on an iterative divide-a...
Main authors: | Lerato, Lerato; Niesler, Thomas |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627777/ https://www.ncbi.nlm.nih.gov/pubmed/26517376 http://dx.doi.org/10.1371/journal.pone.0141756 |
_version_ | 1782398331478081536 |
---|---|
author | Lerato, Lerato Niesler, Thomas |
author_facet | Lerato, Lerato Niesler, Thomas |
author_sort | Lerato, Lerato |
collection | PubMed |
description | Agglomerative hierarchical clustering becomes infeasible when applied to large datasets due to its O(N²) storage requirements. We present a multi-stage agglomerative hierarchical clustering (MAHC) approach aimed at large datasets of speech segments. The algorithm is based on an iterative divide-and-conquer strategy. The data is first split into independent subsets, each of which is clustered separately. This reduces the storage required for sequential implementations, and allows concurrent computation on parallel computing hardware. The resultant clusters are merged and subsequently re-divided into subsets, which are passed to the following iteration. We show that MAHC can match and even surpass the performance of the exact implementation when applied to datasets of speech segments. |
format | Online Article Text |
id | pubmed-4627777 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4627777 2015-11-06 Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering Lerato, Lerato Niesler, Thomas PLoS One Research Article Agglomerative hierarchical clustering becomes infeasible when applied to large datasets due to its O(N²) storage requirements. We present a multi-stage agglomerative hierarchical clustering (MAHC) approach aimed at large datasets of speech segments. The algorithm is based on an iterative divide-and-conquer strategy. The data is first split into independent subsets, each of which is clustered separately. This reduces the storage required for sequential implementations, and allows concurrent computation on parallel computing hardware. The resultant clusters are merged and subsequently re-divided into subsets, which are passed to the following iteration. We show that MAHC can match and even surpass the performance of the exact implementation when applied to datasets of speech segments. Public Library of Science 2015-10-30 /pmc/articles/PMC4627777/ /pubmed/26517376 http://dx.doi.org/10.1371/journal.pone.0141756 Text en © 2015 Lerato, Niesler http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Lerato, Lerato Niesler, Thomas Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering |
title | Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering |
title_sort | clustering acoustic segments using multi-stage agglomerative hierarchical clustering |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627777/ https://www.ncbi.nlm.nih.gov/pubmed/26517376 http://dx.doi.org/10.1371/journal.pone.0141756 |
work_keys_str_mv | AT leratolerato clusteringacousticsegmentsusingmultistageagglomerativehierarchicalclustering AT nieslerthomas clusteringacousticsegmentsusingmultistageagglomerativehierarchicalclustering |
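
The divide-and-conquer loop summarised in the description field can be illustrated with a small sketch. This is not the authors' implementation: the record gives no algorithmic detail beyond the abstract, so everything specific here is an assumption made for illustration. In particular, the function names (`ahc`, `centroid`, `mahc_sketch`), the parameters (`n_subsets`, `k_local`, `k_final`, `iterations`), the use of Euclidean points instead of speech segments, average linkage, centroids as cluster representatives, and the greedy re-division rule are all hypothetical choices.

```python
import math
import random


def ahc(points, k):
    """Naive average-linkage AHC; returns k clusters as lists of indices.

    Distances are recomputed on the fly for brevity. A practical version
    would hold an n x n distance matrix, which is the O(N^2) storage cost
    that motivates clustering small subsets instead of the full dataset.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(math.dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def centroid(members):
    """Mean point of a cluster (assumed stand-in for a representative)."""
    return tuple(sum(c) / len(members) for c in zip(*members))


def mahc_sketch(points, n_subsets, k_local, k_final, iterations=2, seed=0):
    """Illustrative multi-stage loop: split, cluster, merge, re-divide."""
    items = list(points)
    random.Random(seed).shuffle(items)
    subsets = [items[i::n_subsets] for i in range(n_subsets)]
    merged = []
    for _ in range(iterations):
        # Stage 1: cluster every subset on its own (independent, so this
        # loop could run in parallel).
        local = []
        for subset in subsets:
            if not subset:
                continue
            for idx in ahc(subset, min(k_local, len(subset))):
                local.append([subset[i] for i in idx])
        # Stage 2: merge the local clusters by clustering one
        # representative per local cluster.
        reps = [centroid(c) for c in local]
        merged = [[p for i in idx for p in local[i]]
                  for idx in ahc(reps, min(k_final, len(reps)))]
        # Re-divide (assumed rule): hand whole merged clusters to the
        # currently smallest subset, so similar points share a subset
        # in the next iteration.
        subsets = [[] for _ in range(n_subsets)]
        for cluster in sorted(merged, key=len, reverse=True):
            min(subsets, key=len).extend(cluster)
    return merged


# Toy data: two well-separated groups of 2-D points.
demo_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
demo_clusters = mahc_sketch(demo_points, n_subsets=2, k_local=2, k_final=2)
print(demo_clusters)
```

In this sketch each call to `ahc` only ever sees one subset (or one representative per local cluster), which is the property the abstract relies on for reduced storage and concurrent computation. How subsets are sized and re-divided in the published method is not stated in this record.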