
A new improved maximal relevance and minimal redundancy method based on feature subset

Feature selection plays a significant role in the success of pattern recognition and data mining. Building on the maximal relevance and minimal redundancy (mRMR) method and combining it with feature subsets, this paper proposes an improved maximal relevance and minimal redundancy (ImRMR) feature selectio...


Bibliographic Details
Main authors:	Xie, Shanshan, Zhang, Yan, Lv, Danjv, Chen, Xu, Lu, Jing, Liu, Jiang
Format:	Online Article Text
Language:	English
Published:	Springer US 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9424812/ https://www.ncbi.nlm.nih.gov/pubmed/36060093 http://dx.doi.org/10.1007/s11227-022-04763-2

_version_	1784778306606858240
author	Xie, Shanshan Zhang, Yan Lv, Danjv Chen, Xu Lu, Jing Liu, Jiang
author_sort	Xie, Shanshan
collection	PubMed
description	Feature selection plays a significant role in the success of pattern recognition and data mining. Building on the maximal relevance and minimal redundancy (mRMR) method and combining it with feature subsets, this paper proposes an improved maximal relevance and minimal redundancy (ImRMR) feature selection method based on feature subsets. In ImRMR, the Pearson correlation coefficient and mutual information are first used to measure the relevance of each single feature to the sample category, and a factor is introduced to adjust the weights of the two measurement criteria. An equal grouping method is then used to generate candidate feature subsets from the ranked features. Next, the relevance and redundancy of the candidate feature subsets are calculated, and an ordered sequence of these subsets is obtained by an incremental search method. Finally, the optimal feature subset is obtained from these subsets by combining the sequential forward search method with a classification learning algorithm. Experiments are conducted on seven datasets. The results show that ImRMR effectively removes irrelevant and redundant features, which not only reduces the dimensionality of the sample features and the time needed for model training and prediction but also improves classification performance.
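The first two steps in the description (scoring each feature by a weighted mix of Pearson correlation and mutual information, then splitting the ranked features into equal groups) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The record does not give the exact formulas, so the convex combination `alpha * |r| + (1 - alpha) * MI`, the use of the absolute correlation, the discrete mutual information estimate, and all function names here are assumptions. The later steps (subset-level relevance and redundancy, incremental search, and forward search with a classifier) are not covered.

```python
import math
from collections import Counter


def abs_pearson(x, y):
    """Absolute Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear relationship to measure
    return abs(cov) / math.sqrt(var_x * var_y)


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
    return mi


def relevance(feature, labels, alpha):
    # ASSUMPTION: the "factor" is modelled as a convex weight alpha in [0, 1];
    # the source only says a factor adjusts the weights of the two criteria.
    return (alpha * abs_pearson(feature, labels)
            + (1 - alpha) * mutual_information(feature, labels))


def rank_features(columns, labels, alpha=0.5):
    """Return feature indices sorted by decreasing relevance score."""
    scores = [relevance(col, labels, alpha) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)


def equal_groups(ranked, group_size):
    """Split the ranked indices into consecutive groups of equal size.

    The last group is shorter when the count is not a multiple of group_size.
    """
    return [ranked[i:i + group_size] for i in range(0, len(ranked), group_size)]


if __name__ == "__main__":
    # Toy data (hypothetical): four discrete feature columns and binary labels.
    labels = [0, 0, 1, 1, 0, 1]
    columns = [
        [0, 0, 1, 1, 0, 1],  # identical to the labels
        [1, 1, 1, 1, 1, 1],  # constant, hence irrelevant
        [0, 1, 0, 1, 0, 1],  # weakly related to the labels
        [1, 1, 0, 0, 1, 0],  # inverted copy of the labels
    ]
    ranked = rank_features(columns, labels, alpha=0.5)
    print(equal_groups(ranked, group_size=2))
```

Continuous features would need to be discretized (or a different mutual information estimator used) before this scoring applies; that choice is left open here because the record does not specify it.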
format	Online Article Text
id	pubmed-9424812
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-9424812 2022-08-30 A new improved maximal relevance and minimal redundancy method based on feature subset Xie, Shanshan Zhang, Yan Lv, Danjv Chen, Xu Lu, Jing Liu, Jiang J Supercomput Article
Springer US 2022-08-30 2023 /pmc/articles/PMC9424812/ /pubmed/36060093 http://dx.doi.org/10.1007/s11227-022-04763-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title	A new improved maximal relevance and minimal redundancy method based on feature subset
title_sort	new improved maximal relevance and minimal redundancy method based on feature subset
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9424812/ https://www.ncbi.nlm.nih.gov/pubmed/36060093 http://dx.doi.org/10.1007/s11227-022-04763-2


Similar items