
DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data

BACKGROUND: Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different...

Bibliographic Details
Main authors: Glez-Peña, Daniel; Álvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2637236/
https://www.ncbi.nlm.nih.gov/pubmed/19178723
http://dx.doi.org/10.1186/1471-2105-10-37
_version_ 1782164339670646784
author Glez-Peña, Daniel
Álvarez, Rodrigo
Díaz, Fernando
Fdez-Riverola, Florentino
author_facet Glez-Peña, Daniel
Álvarez, Rodrigo
Díaz, Fernando
Fdez-Riverola, Florentino
author_sort Glez-Peña, Daniel
collection PubMed
description BACKGROUND: Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. RESULTS: DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. CONCLUSION: DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. 
This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released.
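The pipeline summarized in the description above (fuzzy labelling, per-class fuzzy patterns, then a discriminant pattern) can be sketched roughly as follows. This is an illustrative Python sketch, not the DFP package's R API: the function names, the three labels, the piecewise-linear membership shapes, the fixed thresholds, the `min_share` cutoff, and the rule "keep genes whose label differs between at least two class patterns" are all assumptions made for the example.

```python
from collections import Counter


def memberships(x, lo, mid, hi):
    """Membership of x in three fuzzy sets (assumed piecewise-linear shapes)."""
    low = 1.0 if x <= lo else max(0.0, (mid - x) / (mid - lo))
    high = 1.0 if x >= hi else max(0.0, (x - mid) / (hi - mid))
    medium = max(0.0, 1.0 - low - high)
    return {"Low": low, "Medium": medium, "High": high}


def discretize(x, lo, mid, hi):
    """Assign the linguistic label with the largest membership value."""
    m = memberships(x, lo, mid, hi)
    return max(m, key=m.get)


def fuzzy_pattern(samples, thresholds, min_share=0.75):
    """Genes whose dominant label covers at least min_share of a class's samples."""
    pattern = {}
    for gene, (lo, mid, hi) in thresholds.items():
        counts = Counter(discretize(s[gene], lo, mid, hi) for s in samples)
        label, count = counts.most_common(1)[0]
        if count / len(samples) >= min_share:
            pattern[gene] = label
    return pattern


def discriminant_pattern(patterns):
    """Genes present in two or more class patterns with differing labels (assumed rule)."""
    result = {}
    genes = set().union(*(p.keys() for p in patterns.values()))
    for gene in sorted(genes):
        labels = {cls: p[gene] for cls, p in patterns.items() if gene in p}
        if len(labels) >= 2 and len(set(labels.values())) >= 2:
            result[gene] = labels
    return result


# Hypothetical toy data: two classes, three genes, shared thresholds.
thresholds = {g: (2.0, 5.0, 8.0) for g in ("g1", "g2", "g3")}
classes = {
    "A": [{"g1": 1.0, "g2": 5.0, "g3": 9.0}, {"g1": 1.5, "g2": 5.5, "g3": 2.0}],
    "B": [{"g1": 8.5, "g2": 4.8, "g3": 5.0}, {"g1": 9.0, "g2": 5.2, "g3": 5.1}],
}
patterns = {cls: fuzzy_pattern(s, thresholds) for cls, s in classes.items()}
dfp = discriminant_pattern(patterns)
print(patterns)
print(dfp)
```

In this toy data only `g1` ends up in the discriminant pattern, because it is consistently low in class A and consistently high in class B. `g2` carries the same label in both classes, and `g3` has no dominant label in class A.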
format Text
id pubmed-2637236
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-26372362009-02-07 DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data Glez-Peña, Daniel Álvarez, Rodrigo Díaz, Fernando Fdez-Riverola, Florentino BMC Bioinformatics Software BACKGROUND: Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. RESULTS: DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. CONCLUSION: DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. 
It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released. BioMed Central 2009-01-29 /pmc/articles/PMC2637236/ /pubmed/19178723 http://dx.doi.org/10.1186/1471-2105-10-37 Text en Copyright © 2009 Glez-Peña et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software
Glez-Peña, Daniel
Álvarez, Rodrigo
Díaz, Fernando
Fdez-Riverola, Florentino
DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data
title DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data
title_full DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data
title_fullStr DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data
title_full_unstemmed DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data
title_short DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data
title_sort dfp: a bioconductor package for fuzzy profile identification and gene reduction of microarray data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2637236/
https://www.ncbi.nlm.nih.gov/pubmed/19178723
http://dx.doi.org/10.1186/1471-2105-10-37
work_keys_str_mv AT glezpenadaniel dfpabioconductorpackageforfuzzyprofileidentificationandgenereductionofmicroarraydata
AT alvarezrodrigo dfpabioconductorpackageforfuzzyprofileidentificationandgenereductionofmicroarraydata
AT diazfernando dfpabioconductorpackageforfuzzyprofileidentificationandgenereductionofmicroarraydata
AT fdezriverolaflorentino dfpabioconductorpackageforfuzzyprofileidentificationandgenereductionofmicroarraydata