
Multi-Label Feature Selection Based on High-Order Label Correlation Assumption



Bibliographic Details
Main Authors: Zhang, Ping; Gao, Wanfu; Hu, Juncheng; Li, Yonghao
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517369/
https://www.ncbi.nlm.nih.gov/pubmed/33286568
http://dx.doi.org/10.3390/e22070797
author Zhang, Ping
Gao, Wanfu
Hu, Juncheng
Li, Yonghao
collection PubMed
description Multi-label data often involve high-dimensional features and complicated label correlations, which pose a great challenge for multi-label learning. Feature selection plays an important role in handling such data, and exploring label correlations is crucial for multi-label feature selection. Previous information-theoretic methods evaluate candidate features with a cumulative summation approximation, which considers only low-order label correlations. In fact, high-order label correlations exist in the label set: labels naturally cluster into several groups, with similar labels tending to fall into the same group and dissimilar labels into different groups. However, the cumulative summation approximation tends to select features related to the groups that contain more labels while ignoring the classification information of the groups that contain fewer labels. As a result, many features related to similar labels are selected, which leads to poor classification performance. To this end, a Max-Correlation term that considers high-order label correlations is proposed. Additionally, we combine the Max-Correlation term with a feature redundancy term to ensure that the selected features are relevant to different label groups. Finally, a new method named Multi-label Feature Selection considering Max-Correlation (MCMFS) is proposed. Experimental results demonstrate the classification superiority of MCMFS in comparison to eight state-of-the-art multi-label feature selection methods.
format Online
Article
Text
id pubmed-7517369
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7517369 2020-11-09
Multi-Label Feature Selection Based on High-Order Label Correlation Assumption
Zhang, Ping; Gao, Wanfu; Hu, Juncheng; Li, Yonghao
Entropy (Basel)
Article
MDPI 2020-07-21
/pmc/articles/PMC7517369/
/pubmed/33286568
http://dx.doi.org/10.3390/e22070797
Text en
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Multi-Label Feature Selection Based on High-Order Label Correlation Assumption
topic Article
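The abstract's central argument, that summing relevance over all labels favors features tied to the larger label group, can be illustrated with a small toy example. The sketch below is not the MCMFS criterion (this record does not give its formula) and omits the feature redundancy term. It only contrasts a summed mutual-information score with a max-based one. The function names, the toy data, and the use of plain empirical mutual information are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def sum_relevance(feature, labels):
    """Hypothetical 'cumulative summation' score: add up I(f; y) over all labels."""
    return sum(mutual_information(feature, y) for y in labels)


def max_relevance(feature, labels):
    """Hypothetical max-based score: keep only the strongest I(f; y)."""
    return max(mutual_information(feature, y) for y in labels)


# Toy data (assumed, not from the paper): 16 samples and 4 labels.
# Three identical labels form a large group; one label forms a small group.
y_big = [0] * 8 + [1] * 8
y_small = [i % 2 for i in range(16)]
labels = [y_big, y_big, y_big, y_small]

# f_a is a noisy copy of the large-group label (one error per class).
# f_b is an exact copy of the small-group label.
f_a = [0] * 7 + [1] + [1] * 7 + [0]
f_b = list(y_small)

for name, f in (("f_a", f_a), ("f_b", f_b)):
    print(name,
          "sum:", round(sum_relevance(f, labels), 3),
          "max:", round(max_relevance(f, labels), 3))
```

On this toy data the summed score ranks the noisy large-group feature `f_a` above `f_b`, because `f_a` is counted once for each of the three redundant labels. The max-based score ranks `f_b` first, since it predicts its label perfectly.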