
Clustering by fast search and merge of local density peaks for gene expression microarray data

Clustering is an unsupervised approach to classify elements based on their similarity, and it is used to find the intrinsic patterns of data. There are numerous applications of clustering in bioinformatics, pattern recognition, and astronomy. This paper presents a clustering approach based on the id...


Bibliographic Details
Main Authors: Mehmood, Rashid; El-Ashram, Saeed; Bie, Rongfang; Dawood, Hussain; Kos, Anton
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395818/
https://www.ncbi.nlm.nih.gov/pubmed/28422088
http://dx.doi.org/10.1038/srep45602
_version_ 1783229944307908608
author Mehmood, Rashid
El-Ashram, Saeed
Bie, Rongfang
Dawood, Hussain
Kos, Anton
author_facet Mehmood, Rashid
El-Ashram, Saeed
Bie, Rongfang
Dawood, Hussain
Kos, Anton
author_sort Mehmood, Rashid
collection PubMed
description Clustering is an unsupervised approach to classify elements based on their similarity, and it is used to find the intrinsic patterns of data. There are numerous applications of clustering in bioinformatics, pattern recognition, and astronomy. This paper presents a clustering approach based on the idea that one or more density-wise connected regions make a cluster, in which the density-maximum point represents the center of the corresponding density region. More precisely, our approach first finds the local density regions and subsequently merges the density-connected regions to form meaningful clusters. This idea empowers the clustering procedure, in which outliers are automatically detected, higher-density regions are intuitively determined and merged to form clusters of arbitrary shape, and clusters are identified regardless of the dimensionality of the space in which they are embedded. Extensive experiments are performed on several complex data sets to analyze and compare our approach with state-of-the-art clustering methods. In addition, we benchmarked the algorithm on gene expression microarray data sets for cancer subtyping, for distinguishing normal tissues from tumors, and for classifying multiple-tissue data sets.
format Online
Article
Text
id pubmed-5395818
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5395818
2017-04-20
Clustering by fast search and merge of local density peaks for gene expression microarray data
Mehmood, Rashid
El-Ashram, Saeed
Bie, Rongfang
Dawood, Hussain
Kos, Anton
Sci Rep
Article
Clustering is an unsupervised approach to classify elements based on their similarity, and it is used to find the intrinsic patterns of data. There are numerous applications of clustering in bioinformatics, pattern recognition, and astronomy. This paper presents a clustering approach based on the idea that one or more density-wise connected regions make a cluster, in which the density-maximum point represents the center of the corresponding density region. More precisely, our approach first finds the local density regions and subsequently merges the density-connected regions to form meaningful clusters. This idea empowers the clustering procedure, in which outliers are automatically detected, higher-density regions are intuitively determined and merged to form clusters of arbitrary shape, and clusters are identified regardless of the dimensionality of the space in which they are embedded. Extensive experiments are performed on several complex data sets to analyze and compare our approach with state-of-the-art clustering methods. In addition, we benchmarked the algorithm on gene expression microarray data sets for cancer subtyping, for distinguishing normal tissues from tumors, and for classifying multiple-tissue data sets.
Nature Publishing Group
2017-04-19
/pmc/articles/PMC5395818/
/pubmed/28422088
http://dx.doi.org/10.1038/srep45602
Text
en
Copyright © 2017, The Author(s)
http://creativecommons.org/licenses/by/4.0/
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
Mehmood, Rashid
El-Ashram, Saeed
Bie, Rongfang
Dawood, Hussain
Kos, Anton
Clustering by fast search and merge of local density peaks for gene expression microarray data
title Clustering by fast search and merge of local density peaks for gene expression microarray data
title_full Clustering by fast search and merge of local density peaks for gene expression microarray data
title_fullStr Clustering by fast search and merge of local density peaks for gene expression microarray data
title_full_unstemmed Clustering by fast search and merge of local density peaks for gene expression microarray data
title_short Clustering by fast search and merge of local density peaks for gene expression microarray data
title_sort clustering by fast search and merge of local density peaks for gene expression microarray data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395818/
https://www.ncbi.nlm.nih.gov/pubmed/28422088
http://dx.doi.org/10.1038/srep45602
work_keys_str_mv AT mehmoodrashid clusteringbyfastsearchandmergeoflocaldensitypeaksforgeneexpressionmicroarraydata
AT elashramsaeed clusteringbyfastsearchandmergeoflocaldensitypeaksforgeneexpressionmicroarraydata
AT bierongfang clusteringbyfastsearchandmergeoflocaldensitypeaksforgeneexpressionmicroarraydata
AT dawoodhussain clusteringbyfastsearchandmergeoflocaldensitypeaksforgeneexpressionmicroarraydata
AT kosanton clusteringbyfastsearchandmergeoflocaldensitypeaksforgeneexpressionmicroarraydata
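
Illustrative sketch (not part of the catalog record). The description field above outlines a two-stage procedure: find local density regions around density maxima, then merge density-connected regions into clusters while flagging outliers. The Python below is a minimal sketch of that general idea only; it is not the authors' published algorithm. The function name density_peak_merge, the cutoff d_c, the threshold min_density, the neighbour-count density estimate, and the merge rule (two regions are joined when any pair of their dense points lies within d_c) are all assumptions made for illustration.

```python
from math import dist


def density_peak_merge(points, d_c, min_density=1):
    """Toy density-peak clustering with a merge step (hypothetical sketch).

    points      : list of equal-length coordinate tuples
    d_c         : cutoff distance used for density and connectivity (assumed)
    min_density : points with fewer neighbours are labelled -1 (outliers)
    """
    n = len(points)
    # Neighbours of each point within the cutoff distance.
    neigh = [
        [j for j in range(n) if j != i and dist(points[i], points[j]) <= d_c]
        for i in range(n)
    ]
    rho = [len(nb) for nb in neigh]  # local density = neighbour count
    dense = [i for i in range(n) if rho[i] >= min_density]
    dense_set = set(dense)

    # Stage 1: link each dense point to its nearest denser neighbour.
    # Ties in density are broken by index, so the links cannot form a cycle.
    parent = {}
    for i in dense:
        denser = [
            j for j in neigh[i]
            if j in dense_set and (rho[j], -j) > (rho[i], -i)
        ]
        if denser:
            parent[i] = min(denser, key=lambda j: dist(points[i], points[j]))
        else:
            parent[i] = i  # no denser neighbour nearby: a local density peak

    def peak_of(i):
        while parent[i] != i:
            i = parent[i]
        return i

    region = {i: peak_of(i) for i in dense}

    # Stage 2: merge regions whose dense points touch (union-find on peaks).
    root = {p: p for p in set(region.values())}

    def find(p):
        while root[p] != p:
            root[p] = root[root[p]]
            p = root[p]
        return p

    for i in dense:
        for j in neigh[i]:
            if j in dense_set:
                a, b = find(region[i]), find(region[j])
                if a != b:
                    root[max(a, b)] = min(a, b)

    # Number clusters in order of first appearance; outliers keep -1.
    labels = [-1] * n
    ids = {}
    for i in dense:
        labels[i] = ids.setdefault(find(region[i]), len(ids))
    return labels


if __name__ == "__main__":
    blobs = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (50, 50)]
    print(density_peak_merge(blobs, d_c=1.5))
    # prints [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

With this simple merge rule the final clusters coincide with the connected components of the dense points under d_c, so the sketch behaves much like a basic density-based method. The criteria used in the article itself may differ, and the full text should be consulted for them.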