
Evaluation of gene importance in microarray data based upon probability of selection

BACKGROUND: Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced down to the gene level, a long-standing research problem is to identify specific gene expressi...


Bibliographic Details
Main Authors:	Fu, Li M, Fu-Liu, Casey S
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1274261/ https://www.ncbi.nlm.nih.gov/pubmed/15784140 http://dx.doi.org/10.1186/1471-2105-6-67

author	Fu, Li M Fu-Liu, Casey S
collection	PubMed
description	BACKGROUND: Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced down to the gene level, a long-standing research problem is to identify specific gene expression patterns linked to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, recognizing disease-related gene expression patterns embedded in the microarray data has proved challenging. In selecting a small set of biologically significant genes for classifier design, the high data dimensionality inherent in this problem creates a substantial amount of uncertainty. RESULTS: Here we present a model for probability analysis of selected genes in order to determine their importance. Our contribution is that we show how to derive the P value of each selected gene in multiple gene selection trials based on different combinations of data samples and how to conduct a reliability analysis accordingly. The importance of a gene is indicated by its associated P value, in that a smaller value implies higher information content in the information-theoretic sense. On the microarray data concerning the subtype classification of small round blue cell tumors, we demonstrate that the method is capable of finding the smallest set of genes (19 genes) with optimal classification performance, compared with results reported in the literature. CONCLUSION: In classifier design based on microarray data, the probability value derived from gene selection based on multiple combinations of data samples provides an effective mechanism for reducing the tendency to fit local data particularities.
format	Text
id	pubmed-1274261
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1274261 2005-10-29 Evaluation of gene importance in microarray data based upon probability of selection Fu, Li M Fu-Liu, Casey S BMC Bioinformatics Methodology Article BioMed Central 2005-03-22 /pmc/articles/PMC1274261/ /pubmed/15784140 http://dx.doi.org/10.1186/1471-2105-6-67 Text en Copyright © 2005 Fu and Fu-Liu; licensee BioMed Central Ltd.
title	Evaluation of gene importance in microarray data based upon probability of selection
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1274261/ https://www.ncbi.nlm.nih.gov/pubmed/15784140 http://dx.doi.org/10.1186/1471-2105-6-67


Similar Items