Feature selection environment for genomic applications

BACKGROUND: Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answ...

Bibliographic Details
Main Authors: Lopes, Fabrício Martins; Martins, David Corrêa; Cesar, Roberto M
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2655091/
https://www.ncbi.nlm.nih.gov/pubmed/18945362
http://dx.doi.org/10.1186/1471-2105-9-451
_version_ 1782165436577611776
author Lopes, Fabrício Martins
Martins, David Corrêa
Cesar, Roberto M
author_facet Lopes, Fabrício Martins
Martins, David Corrêa
Cesar, Roberto M
author_sort Lopes, Fabrício Martins
collection PubMed
description BACKGROUND: Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answer questions such as selecting signature genes which are informative about some biological state, e.g., normal tissues and several types of cancer; or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to identify the best solution for each application. RESULTS: The intent of this work is to provide an open-source multiplatform graphical environment for bioinformatics problems, which supports many feature selection algorithms, criterion functions and graphic visualization tools such as scatterplots, parallel coordinates and graphs. A feature selection approach for growing genetic networks from seed genes (targets or predictors) is also implemented in the system. CONCLUSION: The proposed feature selection environment allows data analysis using several algorithms, criterion functions and graphic visualization tools. Our experiments have shown the software's effectiveness in two distinct types of biological problems. In addition, the environment can be used in different pattern recognition applications, although its main focus is bioinformatics tasks.
format Text
id pubmed-2655091
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2655091 2009-03-17 Feature selection environment for genomic applications Lopes, Fabrício Martins Martins, David Corrêa Cesar, Roberto M BMC Bioinformatics Software
BioMed Central 2008-10-22 /pmc/articles/PMC2655091/ /pubmed/18945362 http://dx.doi.org/10.1186/1471-2105-9-451 Text en Copyright © 2008 Lopes et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software
Lopes, Fabrício Martins
Martins, David Corrêa
Cesar, Roberto M
Feature selection environment for genomic applications
title Feature selection environment for genomic applications
title_full Feature selection environment for genomic applications
title_fullStr Feature selection environment for genomic applications
title_full_unstemmed Feature selection environment for genomic applications
title_short Feature selection environment for genomic applications
title_sort feature selection environment for genomic applications
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2655091/
https://www.ncbi.nlm.nih.gov/pubmed/18945362
http://dx.doi.org/10.1186/1471-2105-9-451
work_keys_str_mv AT lopesfabriciomartins featureselectionenvironmentforgenomicapplications
AT martinsdavidcorrea featureselectionenvironmentforgenomicapplications
AT cesarrobertom featureselectionenvironmentforgenomicapplications