Density based pruning for identification of differentially expressed genes from microarray data

MOTIVATION: Identification of differentially expressed genes from microarray datasets is one of the most important analyses in microarray data mining. Popular algorithms such as the t-test rank genes based on a single statistic. The false positive rate of these methods can be improved by c...

Bibliographic Details
Main Authors:	Hu, Jianjun; Xu, Jia
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2975422/ https://www.ncbi.nlm.nih.gov/pubmed/21047384 http://dx.doi.org/10.1186/1471-2164-11-S2-S3

author	Hu, Jianjun Xu, Jia
collection	PubMed
description	MOTIVATION: Identification of differentially expressed genes from microarray datasets is one of the most important analyses in microarray data mining. Popular algorithms such as the t-test rank genes based on a single statistic. The false positive rate of these methods can be improved by considering other features of differentially expressed genes. RESULTS: We propose a pattern recognition strategy for identifying differentially expressed genes. Genes are mapped to a two-dimensional feature space composed of the average difference in gene expression and the average expression level. A density-based pruning algorithm (DB pruning) is developed to screen for potential differentially expressed genes, which are usually located in the sparse boundary region. Biases of popular algorithms for identifying differentially expressed genes are visually characterized. Experiments on 17 datasets from the Gene Expression Omnibus (GEO) database with experimentally verified differentially expressed genes showed that DB pruning can significantly improve the prediction accuracy of popular identification algorithms such as t-test, rank product, and fold change. CONCLUSIONS: Density-based pruning of non-differentially expressed genes is an effective method for enhancing algorithms based on statistical testing for identifying differentially expressed genes. It increases the number of true differentially expressed genes identified by t-test, rank product, and fold change by 11% to 50%. The source code of DB pruning is freely available on our website http://mleg.cse.sc.edu/degprune
format	Text
id	pubmed-2975422
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2975422 2010-11-09 Density based pruning for identification of differentially expressed genes from microarray data Hu, Jianjun; Xu, Jia BMC Genomics Research BioMed Central 2010-11-02 /pmc/articles/PMC2975422/ /pubmed/21047384 http://dx.doi.org/10.1186/1471-2164-11-S2-S3 Text en Copyright ©2010 Hu and Xu; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Density based pruning for identification of differentially expressed genes from microarray data
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2975422/ https://www.ncbi.nlm.nih.gov/pubmed/21047384 http://dx.doi.org/10.1186/1471-2164-11-S2-S3
