A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets
We propose a novel method called Partitioning based Adaptive Irrelevant Feature Eliminator (PAIFE) for dimensionality reduction in high-dimensional biomedical datasets. PAIFE evaluates feature-target relationships over not only a whole dataset, but also the partitioned subsets and is extremely effec...
Main authors: | Liu, Guodong; Kong, Lan; Gopalakrishnan, Vanathi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association, 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392052/ https://www.ncbi.nlm.nih.gov/pubmed/22779051 |
_version_ | 1782237586856607744 |
---|---|
author | Liu, Guodong Kong, Lan Gopalakrishnan, Vanathi |
author_facet | Liu, Guodong Kong, Lan Gopalakrishnan, Vanathi |
author_sort | Liu, Guodong |
collection | PubMed |
description | We propose a novel method called Partitioning based Adaptive Irrelevant Feature Eliminator (PAIFE) for dimensionality reduction in high-dimensional biomedical datasets. PAIFE evaluates feature-target relationships over not only a whole dataset, but also the partitioned subsets and is extremely effective in identifying features whose relevancies to the target are conditional on certain other features. PAIFE adaptively employs the most appropriate feature evaluation strategy, statistical test and parameter instantiation. We envision PAIFE to be used as a third-party data pre-processing tool for dimensionality reduction of high-dimensional clinical datasets. Experiments on synthetic datasets showed that PAIFE consistently outperformed state-of-the-art feature selection methods in removing irrelevant features while retaining relevant features. Experiments on genomic and proteomic datasets demonstrated that PAIFE was able to remove significant numbers of irrelevant features in real-world biomedical datasets. Classification models constructed from the retained features either matched or improved the classification performances of the models constructed using all features. |
format | Online Article Text |
id | pubmed-3392052 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-33920522012-07-09 A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets Liu1, Guodong Kong, Lan Gopalakrishnan, Vanathi AMIA Jt Summits Transl Sci Proc Articles We propose a novel method called Partitioning based Adaptive Irrelevant Feature Eliminator (PAIFE) for dimensionality reduction in high-dimensional biomedical datasets. PAIFE evaluates feature-target relationships over not only a whole dataset, but also the partitioned subsets and is extremely effective in identifying features whose relevancies to the target are conditional on certain other features. PAIFE adaptively employs the most appropriate feature evaluation strategy, statistical test and parameter instantiation. We envision PAIFE to be used as a third-party data pre-processing tool for dimensionality reduction of high-dimensional clinical datasets. Experiments on synthetic datasets showed that PAIFE consistently outperformed state-of-the-art feature selection methods in removing irrelevant features while retaining relevant features. Experiments on genomic and proteomic datasets demonstrated that PAIFE was able to remove significant numbers of irrelevant features in real-world biomedical datasets. Classification models constructed from the retained features either matched or improved the classification performances of the models constructed using all features. American Medical Informatics Association 2012-03-19 /pmc/articles/PMC3392052/ /pubmed/22779051 Text en ©2012 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
spellingShingle | Articles Liu1, Guodong Kong, Lan Gopalakrishnan, Vanathi A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets |
title | A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets |
title_full | A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets |
title_fullStr | A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets |
title_full_unstemmed | A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets |
title_short | A Partitioning Based Adaptive Method for Robust Removal of Irrelevant Features from High-dimensional Biomedical Datasets |
title_sort | partitioning based adaptive method for robust removal of irrelevant features from high-dimensional biomedical datasets |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3392052/ https://www.ncbi.nlm.nih.gov/pubmed/22779051 |
work_keys_str_mv | AT liu1guodong apartitioningbasedadaptivemethodforrobustremovalofirrelevantfeaturesfromhighdimensionalbiomedicaldatasets AT konglan apartitioningbasedadaptivemethodforrobustremovalofirrelevantfeaturesfromhighdimensionalbiomedicaldatasets AT gopalakrishnanvanathi apartitioningbasedadaptivemethodforrobustremovalofirrelevantfeaturesfromhighdimensionalbiomedicaldatasets AT liu1guodong partitioningbasedadaptivemethodforrobustremovalofirrelevantfeaturesfromhighdimensionalbiomedicaldatasets AT konglan partitioningbasedadaptivemethodforrobustremovalofirrelevantfeaturesfromhighdimensionalbiomedicaldatasets AT gopalakrishnanvanathi partitioningbasedadaptivemethodforrobustremovalofirrelevantfeaturesfromhighdimensionalbiomedicaldatasets |