optCluster: An R Package for Determining the Optimal Clustering Algorithm
There exist numerous programs and packages that perform validation for a given clustering solution; however, clustering algorithms fare differently as judged by different validation measures. If more than one performance measure is used to evaluate multiple clustering partitions, an optimal result i...
Main authors: | Sekula, Michael; Datta, Somnath; Datta, Susmita |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Biomedical Informatics 2017 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5450252/ https://www.ncbi.nlm.nih.gov/pubmed/28584451 http://dx.doi.org/10.6026/97320630013101 |
_version_ | 1783239933270425600 |
---|---|
author | Sekula, Michael Datta, Somnath Datta, Susmita |
author_facet | Sekula, Michael Datta, Somnath Datta, Susmita |
author_sort | Sekula, Michael |
collection | PubMed |
description | There exist numerous programs and packages that perform validation for a given clustering solution; however, clustering algorithms fare differently as judged by different validation measures. If more than one performance measure is used to evaluate multiple clustering partitions, an optimal result is often difficult to determine by visual inspection alone. This paper introduces optCluster, an R package that uses a single function to simultaneously compare numerous clustering partitions (created by different algorithms and/or numbers of clusters) and obtain a “best” option for a given dataset. The method of weighted rank aggregation is utilized by this package to objectively aggregate various performance measure scores, thereby taking away the guesswork that often follows a visual inspection of cluster results. The optCluster package contains biological validation measures as well as clustering algorithms developed specifically for RNA sequencing data, making it a useful tool for clustering genomic data. AVAILABILITY: This package is available for free through the Comprehensive R Archive Network (CRAN) at http://cran.r-project.org/web/packages/optCluster/ |
format | Online Article Text |
id | pubmed-5450252 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Biomedical Informatics |
record_format | MEDLINE/PubMed |
spelling | pubmed-54502522017-06-05 optCluster: An R Package for Determining the Optimal Clustering Algorithm Sekula, Michael Datta, Somnath Datta, Susmita Bioinformation Software There exist numerous programs and packages that perform validation for a given clustering solution; however, clustering algorithms fare differently as judged by different validation measures. If more than one performance measure is used to evaluate multiple clustering partitions, an optimal result is often difficult to determine by visual inspection alone. This paper introduces optCluster, an R package that uses a single function to simultaneously compare numerous clustering partitions (created by different algorithms and/or numbers of clusters) and obtain a “best” option for a given dataset. The method of weighted rank aggregation is utilized by this package to objectively aggregate various performance measure scores, thereby taking away the guesswork that often follows a visual inspection of cluster results. The optCluster package contains biological validation measures as well as clustering algorithms developed specifically for RNA sequencing data, making it a useful tool for clustering genomic data. AVAILABILITY: This package is available for free through the Comprehensive R Archive Network (CRAN) at http://cran.rproject.org/web/packages/optCluster/ Biomedical Informatics 2017-03-31 /pmc/articles/PMC5450252/ /pubmed/28584451 http://dx.doi.org/10.6026/97320630013101 Text en © 2017 Biomedical Informatics http://creativecommons.org/licenses/by/3.0/ This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License. |
spellingShingle | Software Sekula, Michael Datta, Somnath Datta, Susmita optCluster: An R Package for Determining the Optimal Clustering Algorithm |
title | optCluster: An R Package for Determining the Optimal Clustering Algorithm |
title_full | optCluster: An R Package for Determining the Optimal Clustering Algorithm |
title_fullStr | optCluster: An R Package for Determining the Optimal Clustering Algorithm |
title_full_unstemmed | optCluster: An R Package for Determining the Optimal Clustering Algorithm |
title_short | optCluster: An R Package for Determining the Optimal Clustering Algorithm |
title_sort | optcluster: an r package for determining the optimal clustering algorithm |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5450252/ https://www.ncbi.nlm.nih.gov/pubmed/28584451 http://dx.doi.org/10.6026/97320630013101 |
work_keys_str_mv | AT sekulamichael optclusteranrpackagefordeterminingtheoptimalclusteringalgorithm AT dattasomnath optclusteranrpackagefordeterminingtheoptimalclusteringalgorithm AT dattasusmita optclusteranrpackagefordeterminingtheoptimalclusteringalgorithm |
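
The description in this record says the package combines the rankings produced by several validation measures through rank aggregation. The Python sketch below illustrates only that general idea and is not the package's implementation: it assumes each measure has already ranked the same candidate partitions (best first), uses the Spearman footrule distance with one optional weight per measure, and searches all orderings by brute force. The function names `footrule` and `aggregate` and the partition labels are hypothetical.

```python
from itertools import permutations


def footrule(candidate, ranking):
    """Spearman footrule distance: sum of absolute position differences."""
    position = {item: i for i, item in enumerate(ranking)}
    return sum(abs(i - position[item]) for i, item in enumerate(candidate))


def aggregate(rankings, weights=None):
    """Return the ordering that minimises the weighted sum of footrule
    distances to the input rankings.

    Brute force over every permutation, so this is only practical for a
    handful of candidate partitions (an assumption of this sketch).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    items = sorted(rankings[0])

    def total_distance(candidate):
        return sum(w * footrule(candidate, r) for w, r in zip(weights, rankings))

    return list(min(permutations(items), key=total_distance))


# Hypothetical rankings (best first) from three validation measures.
rankings = [
    ["kmeans-3", "hierarchical-3", "pam-4", "kmeans-4"],
    ["kmeans-3", "pam-4", "hierarchical-3", "kmeans-4"],
    ["hierarchical-3", "kmeans-3", "pam-4", "kmeans-4"],
]
result = aggregate(rankings)
print(result[0])  # the partition placed first by the aggregated ranking
```

Exhaustive search keeps the example short and exact for four candidates; for realistic numbers of partitions a stochastic search over orderings would be needed, since the number of permutations grows factorially.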