
A multistart tabu search-based method for feature selection in medical applications

In the design of classification models, irrelevant or noisy features are often generated. In some cases, there may even be negative interactions among features. These weaknesses can degrade the performance of the models. Feature selection is a task that searches for a small subset of relevant featur...


Bibliographic Details
Main authors: Pacheco, Joaquín; Saiz, Olalla; Casado, Silvia; Ubillos, Silvia
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10564765/
https://www.ncbi.nlm.nih.gov/pubmed/37816874
http://dx.doi.org/10.1038/s41598-023-44437-4
author Pacheco, Joaquín
Saiz, Olalla
Casado, Silvia
Ubillos, Silvia
collection PubMed
description In the design of classification models, irrelevant or noisy features are often generated. In some cases, there may even be negative interactions among features. These weaknesses can degrade the performance of the models. Feature selection is a task that searches for a small subset of relevant features from the original set that generate the most efficient models possible. In addition to improving the efficiency of the models, feature selection confers other advantages, such as greater ease in the generation of the necessary data as well as clearer and more interpretable models. In the case of medical applications, feature selection may help to distinguish which characteristics, habits, and factors have the greatest impact on the onset of diseases. However, feature selection is a complex task due to the large number of possible solutions. In the last few years, methods based on different metaheuristic strategies, mainly evolutionary algorithms, have been proposed. The motivation of this work is to develop a method that outperforms previous methods, with the benefits that this implies, especially in the medical field. More precisely, the present study proposes a simple method based on tabu search and multistart techniques. The proposed method was analyzed and compared to other methods by testing their performance on several medical databases. Specifically, eight databases belonging to the well-known repository of the University of California, Irvine, and one of our own design were used. In these computational tests, the proposed method outperformed other recent methods as gauged by various metrics and classifiers. The analyses were accompanied by statistical tests, the results of which showed that the superiority of our method is significant and therefore strengthened these conclusions. In short, the contribution of this work is the development of a method that, on the one hand, is based on different strategies than those used in recent methods and, on the other hand, improves on the performance of these methods.
format Online
Article
Text
id pubmed-10564765
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10564765 2023-10-12 A multistart tabu search-based method for feature selection in medical applications Pacheco, Joaquín Saiz, Olalla Casado, Silvia Ubillos, Silvia Sci Rep Article Nature Publishing Group UK 2023-10-10 /pmc/articles/PMC10564765/ /pubmed/37816874 http://dx.doi.org/10.1038/s41598-023-44437-4 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A multistart tabu search-based method for feature selection in medical applications
topic Article