Clustering by fast search and merge of local density peaks for gene expression microarray data
Clustering is an unsupervised approach to classifying elements based on their similarity, and it is used to find the intrinsic patterns in data. Clustering has numerous applications in bioinformatics, pattern recognition, and astronomy. This paper presents a clustering approach based on the id...
Main authors: | Mehmood, Rashid; El-Ashram, Saeed; Bie, Rongfang; Dawood, Hussain; Kos, Anton |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395818/ https://www.ncbi.nlm.nih.gov/pubmed/28422088 http://dx.doi.org/10.1038/srep45602 |
_version_ | 1783229944307908608 |
---|---|
author | Mehmood, Rashid; El-Ashram, Saeed; Bie, Rongfang; Dawood, Hussain; Kos, Anton |
author_sort | Mehmood, Rashid |
collection | PubMed |
description | Clustering is an unsupervised approach to classifying elements based on their similarity, and it is used to find the intrinsic patterns in data. Clustering has numerous applications in bioinformatics, pattern recognition, and astronomy. This paper presents a clustering approach based on the idea that a single density region, or multiple density-connected regions, make up a cluster, in which the density maximum point represents the center of the corresponding density region. More precisely, our approach first finds the local density regions and subsequently merges the density-connected regions to form meaningful clusters. With this procedure, outliers are automatically detected, higher-density regions are determined and merged to form clusters of arbitrary shape, and clusters are identified regardless of the dimensionality of the space in which they are embedded. Extensive experiments are performed on several complex data sets to analyze and compare our approach with state-of-the-art clustering methods. In addition, we benchmarked the algorithm on gene expression microarray data sets for cancer subtyping, for distinguishing normal tissues from tumors, and for classifying multiple-tissue data sets. |
format | Online Article Text |
id | pubmed-5395818 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-5395818 2017-04-20 Sci Rep Article Nature Publishing Group 2017-04-19 /pmc/articles/PMC5395818/ /pubmed/28422088 http://dx.doi.org/10.1038/srep45602 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third-party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
title | Clustering by fast search and merge of local density peaks for gene expression microarray data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395818/ https://www.ncbi.nlm.nih.gov/pubmed/28422088 http://dx.doi.org/10.1038/srep45602 |
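
The abstract describes a two-stage idea: find local density regions around density maxima, then merge density-connected regions into clusters, with outliers detected along the way. The Python sketch below illustrates that general idea only; it is not the authors' algorithm, whose details are not given in this record. The function name `cluster_by_density_peaks`, the parameters `dc` (cutoff radius) and `min_density`, the neighbour-count density, and the merge rule (two regions merge when any pair of their core points lies within `dc`) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cluster_by_density_peaks(points, dc, min_density=2):
    """Toy two-stage density clustering; returns (labels, local_peaks).

    Assumptions (not taken from the paper): density is a neighbour count
    within radius dc, points below min_density are outliers (label -1),
    and two local regions merge when any pair of their core points lies
    within dc of each other.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points within the cutoff radius dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] <= dc)
           for i in range(n)]
    core = [r >= min_density for r in rho]

    def denser(j, i):
        # Strict ordering: higher density wins, ties go to the lower index.
        return (rho[j], -j) > (rho[i], -i)

    # Stage 1: link every core point to its nearest denser core neighbour
    # within dc. Points with no such neighbour are local density peaks.
    parent = list(range(n))
    for i in range(n):
        if not core[i]:
            continue
        cands = [j for j in range(n)
                 if core[j] and dist[i][j] <= dc and denser(j, i)]
        if cands:
            parent[i] = min(cands, key=lambda j: dist[i][j])

    def peak_of(i):
        while parent[i] != i:
            i = parent[i]
        return i

    peak = [peak_of(i) if core[i] else None for i in range(n)]
    local_peaks = sorted({p for p in peak if p is not None})

    # Stage 2: merge local regions that touch, using union-find over peaks.
    root = {p: p for p in local_peaks}

    def find(p):
        while root[p] != p:
            p = root[p]
        return p

    for i, j in combinations(range(n), 2):
        if core[i] and core[j] and dist[i][j] <= dc:
            a, b = find(peak[i]), find(peak[j])
            if a != b:
                root[max(a, b)] = min(a, b)

    # Number the merged clusters in order of first appearance.
    ids = {}
    labels = []
    for i in range(n):
        if core[i]:
            labels.append(ids.setdefault(find(peak[i]), len(ids)))
        else:
            labels.append(-1)
    return labels, local_peaks


# Two dense ends joined by a sparser bridge point at x = 5.
chain = [(0, 0), (1, 0), (2, 0), (5, 0), (8, 0), (9, 0), (10, 0)]
labels, local_peaks = cluster_by_density_peaks(chain, dc=3.5)
print(local_peaks)  # [2, 4]
print(labels)       # [0, 0, 0, 0, 0, 0, 0]
```

In this toy run, stage 1 finds two local peaks (indices 2 and 4), and stage 2 merges their regions into one cluster because the bridge point lies within `dc` of both. With this simple merge rule the final result reduces to connected components of core points; a stricter, density-aware merge criterion would be needed to keep weakly connected regions apart.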