A Robust and Accurate Method for Feature Selection and Prioritization from Multi-Class OMICs Data

Selecting relevant features is a common task in most OMICs data analysis, where the aim is to identify a small set of key features to be used as biomarkers. To this end, two alternative but equally valid methods are mainly available, namely the univariate (filter) or the multivariate (wrapper) appro...

Bibliographic Details
Main authors: Fortino, Vittorio; Kinaret, Pia; Fyhrquist, Nanna; Alenius, Harri; Greco, Dario
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4172658/
https://www.ncbi.nlm.nih.gov/pubmed/25247789
http://dx.doi.org/10.1371/journal.pone.0107801
_version_ 1782336056179294208
author Fortino, Vittorio
Kinaret, Pia
Fyhrquist, Nanna
Alenius, Harri
Greco, Dario
collection PubMed
description Selecting relevant features is a common task in most OMICs data analysis, where the aim is to identify a small set of key features to be used as biomarkers. To this end, two alternative but equally valid methods are mainly available, namely the univariate (filter) or the multivariate (wrapper) approach. The stability of the selected lists of features is an often neglected but very important requirement. If the same features are selected in multiple independent iterations, they more likely are reliable biomarkers. In this study, we developed and evaluated the performance of a novel method for feature selection and prioritization, aiming at generating robust and stable sets of features with high predictive power. The proposed method uses the fuzzy logic for a first unbiased feature selection and a Random Forest built from conditional inference trees to prioritize the candidate discriminant features. Analyzing several multi-class gene expression microarray data sets, we demonstrate that our technique provides equal or better classification performance and a greater stability as compared to other Random Forest-based feature selection methods.
format Online
Article
Text
id pubmed-4172658
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4172658 2014-10-02 PLoS One Research Article Public Library of Science 2014-09-23 /pmc/articles/PMC4172658/ /pubmed/25247789 http://dx.doi.org/10.1371/journal.pone.0107801 Text en © 2014 Fortino et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Robust and Accurate Method for Feature Selection and Prioritization from Multi-Class OMICs Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4172658/
https://www.ncbi.nlm.nih.gov/pubmed/25247789
http://dx.doi.org/10.1371/journal.pone.0107801