iRDA: a new filter towards predictive, stable, and enriched candidate genes

BACKGROUND: Gene expression profiling using high-throughput screening (HTS) technologies allows clinical researchers to find prognosis gene signatures that could better discriminate between different phenotypes and serve as potential biological markers in disease diagnoses. In recent years, many fea...

Bibliographic Details
Main authors:	Lai, Hung-Ming; Albrecht, Andreas A.; Steinhöfel, Kathleen K.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2015
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4673793/ https://www.ncbi.nlm.nih.gov/pubmed/26647162 http://dx.doi.org/10.1186/s12864-015-2129-5

author	Lai, Hung-Ming Albrecht, Andreas A. Steinhöfel, Kathleen K.
collection	PubMed
description	BACKGROUND: Gene expression profiling using high-throughput screening (HTS) technologies allows clinical researchers to find prognosis gene signatures that could better discriminate between different phenotypes and serve as potential biological markers in disease diagnoses. In recent years, many feature selection methods have been devised for finding such discriminative genes, and more recently information theoretic filters have also been introduced for capturing feature-to-class relevance and feature-to-feature correlations in microarray-based classification. METHODS: In this paper, we present and fully formulate a new multivariate filter, iRDA, for the discovery of HTS gene-expression candidate genes. The filter constitutes a four-step framework and includes feature relevance, feature redundancy, and feature interdependence in the context of feature-pairs. The method is based upon approximate Markov blankets, information theory, several heuristic search strategies with forward, backward and insertion phases, and the method is aiming at higher order gene interactions. RESULTS: To show the strengths of iRDA, three performance measures, two evaluation schemes, two stability index sets, and the gene set enrichment analysis (GSEA) are all employed in our experimental studies. Its effectiveness has been validated by using seven well-known cancer gene-expression benchmarks and four other disease experiments, including a comparison to three popular information theoretic filters. In terms of classification performance, candidate genes selected by iRDA perform better than the sets discovered by the other three filters. Two stability measures indicate that iRDA is the most robust with the least variance. GSEA shows that iRDA produces more statistically enriched gene sets on five out of the six benchmark datasets. CONCLUSIONS: Through the classification performance, the stability performance, and the enrichment analysis, iRDA is a promising filter to find predictive, stable, and enriched gene-expression candidate genes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-2129-5) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4673793
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4673793 2015-12-10 iRDA: a new filter towards predictive, stable, and enriched candidate genes Lai, Hung-Ming Albrecht, Andreas A. Steinhöfel, Kathleen K. BMC Genomics Research Article BioMed Central 2015-12-09 /pmc/articles/PMC4673793/ /pubmed/26647162 http://dx.doi.org/10.1186/s12864-015-2129-5 Text en © Lai et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	iRDA: a new filter towards predictive, stable, and enriched candidate genes
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4673793/ https://www.ncbi.nlm.nih.gov/pubmed/26647162 http://dx.doi.org/10.1186/s12864-015-2129-5
