NIFTI: An evolutionary approach for finding number of clusters in microarray data

BACKGROUND: Clustering techniques are routinely used in gene expression data analysis to organize the massive data. Clustering techniques arrange a large number of genes or assays into a few clusters while maximizing the intra-cluster similarity and inter-cluster separation. While clustering of gene...

Bibliographic Details
Main authors:	Jonnalagadda, Sudhakar; Srinivasan, Rajagopalan
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2669482/ https://www.ncbi.nlm.nih.gov/pubmed/19178750 http://dx.doi.org/10.1186/1471-2105-10-40

author	Jonnalagadda, Sudhakar; Srinivasan, Rajagopalan
collection	PubMed
description	BACKGROUND: Clustering techniques are routinely used in gene expression data analysis to organize the massive data. Clustering techniques arrange a large number of genes or assays into a few clusters while maximizing the intra-cluster similarity and inter-cluster separation. While clustering of genes facilitates learning the functions of un-characterized genes using their association with known genes, clustering of assays reveals the disease stages and subtypes. Many clustering algorithms require the user to specify the number of clusters a priori. A wrong specification of number of clusters generally leads to either failure to detect novel clusters (disease subtypes) or unnecessary splitting of natural clusters. RESULTS: We have developed a novel method to find the number of clusters in gene expression data. Our procedure evaluates different partitions (each with different number of clusters) from the clustering algorithm and finds the partition that best describes the data. In contrast to the existing methods that evaluate the partitions independently, our procedure considers the dynamic rearrangement of cluster members when a new cluster is added. Partition quality is measured based on a new index called Net InFormation Transfer Index (NIFTI) that measures the information change when an additional cluster is introduced. Information content of a partition increases when clusters do not intersect and decreases if they are not clearly separated. A partition with the highest Total Information Content (TIC) is selected as the optimal one. We illustrate our method using four publicly available microarray datasets. CONCLUSION: In all four case studies, the proposed method correctly identified the number of clusters and performs better than other well known methods. Our method also showed invariance to the clustering techniques.
format	Text
id	pubmed-2669482
institution	National Center for Biotechnology Information
language	English
publishDate	2009
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2669482 2009-04-16 NIFTI: An evolutionary approach for finding number of clusters in microarray data Jonnalagadda, Sudhakar; Srinivasan, Rajagopalan BMC Bioinformatics Methodology Article BioMed Central 2009-01-30 /pmc/articles/PMC2669482/ /pubmed/19178750 http://dx.doi.org/10.1186/1471-2105-10-40 Text en Copyright © 2009 Jonnalagadda and Srinivasan; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	NIFTI: An evolutionary approach for finding number of clusters in microarray data
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2669482/ https://www.ncbi.nlm.nih.gov/pubmed/19178750 http://dx.doi.org/10.1186/1471-2105-10-40
