Group Based Unsupervised Feature Selection

Unsupervised feature selection is an important task in machine learning applications, yet challenging due to the unavailability of class labels. Although a few unsupervised methods take advantage of external sources of correlations within feature groups in feature selection, they are limited to geno...

Bibliographic Details
Main Authors: Perera, Kushani, Chan, Jeffrey, Karunasekera, Shanika
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206179/
http://dx.doi.org/10.1007/978-3-030-47426-3_62
_version_ 1783530363536015360
author Perera, Kushani
Chan, Jeffrey
Karunasekera, Shanika
author_facet Perera, Kushani
Chan, Jeffrey
Karunasekera, Shanika
author_sort Perera, Kushani
collection PubMed
description Unsupervised feature selection is an important task in machine learning applications, yet challenging due to the unavailability of class labels. Although a few unsupervised methods take advantage of external sources of correlations within feature groups in feature selection, they are limited to genomic data, and suffer from poor accuracy because they ignore input data or encourage features from the same group. We propose a framework that enables unsupervised filter feature selection methods to exploit input data and feature group information simultaneously, encouraging features from different groups. We use this framework to incorporate feature group information into the Laplace Score algorithm. Our method achieves high accuracy compared to other popular unsupervised feature selection methods ([Formula: see text]30% maximum improvement of Normalized Mutual Information (NMI)) with low computational costs ([Formula: see text]50 times lower than embedded methods on average). It has many real-world applications, particularly those that use image, text and genomic data, whose features demonstrate strong group structures.
format Online
Article
Text
id pubmed-7206179
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7206179 2020-05-08 Group Based Unsupervised Feature Selection Perera, Kushani Chan, Jeffrey Karunasekera, Shanika Advances in Knowledge Discovery and Data Mining Article 2020-04-17 /pmc/articles/PMC7206179/ http://dx.doi.org/10.1007/978-3-030-47426-3_62 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Perera, Kushani
Chan, Jeffrey
Karunasekera, Shanika
Group Based Unsupervised Feature Selection
title Group Based Unsupervised Feature Selection
title_full Group Based Unsupervised Feature Selection
title_fullStr Group Based Unsupervised Feature Selection
title_full_unstemmed Group Based Unsupervised Feature Selection
title_short Group Based Unsupervised Feature Selection
title_sort group based unsupervised feature selection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206179/
http://dx.doi.org/10.1007/978-3-030-47426-3_62
work_keys_str_mv AT pererakushani groupbasedunsupervisedfeatureselection
AT chanjeffrey groupbasedunsupervisedfeatureselection
AT karunasekerashanika groupbasedunsupervisedfeatureselection