
A novel bio-inspired hybrid multi-filter wrapper gene selection method with ensemble classifier for microarray data

Microarray technology is known as one of the most important tools for collecting DNA expression data. This technology allows researchers to investigate and examine types of diseases and their origins. However, microarray data are often associated with a small sample size, a significant number of gen...


Bibliographic Details
Main Authors: Nouri-Moghaddam, Babak; Ghazanfari, Mehdi; Fathian, Mohammad
Format: Online Article Text
Language: English
Published: Springer London 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8435304/
https://www.ncbi.nlm.nih.gov/pubmed/34539088
http://dx.doi.org/10.1007/s00521-021-06459-9
author Nouri-Moghaddam, Babak
Ghazanfari, Mehdi
Fathian, Mohammad
collection PubMed
description Microarray technology is one of the most important tools for collecting DNA expression data, allowing researchers to investigate diseases and their origins. However, microarray data often combine a small sample size, a very large number of genes, and class imbalance, among other issues, which makes classification models inefficient. To address this, a new hybrid solution based on a multi-filter and an adaptive chaotic multi-objective forest optimization algorithm (AC-MOFOA) is presented to solve the gene selection problem and construct an ensemble classifier. In the proposed solution, a multi-filter model (i.e., an ensemble filter) serves as a preprocessing step that reduces the dataset's dimensions, using a combination of five filter methods to remove redundant and irrelevant genes. The results of the five filter methods are combined using a voting-based function. The results of the proposed multi-filter indicate that it is effective at reducing the gene subset size and selecting relevant genes. Then, AC-MOFOA, which is based on the concepts of non-dominated sorting, crowding distance, chaos theory, and adaptive operators, is presented. As a wrapper method, AC-MOFOA aims to simultaneously reduce dataset dimensions, optimize KELM, and increase classification accuracy. Next, an ensemble classifier model is built from the AC-MOFOA results to classify microarray data. The performance of the proposed algorithm was evaluated on nine public microarray datasets, and its results were compared with five hybrid multi-objective methods and three hybrid single-objective methods in terms of the number of selected genes, classification efficiency, execution time, time complexity, hypervolume indicator, and spacing metric. According to the results, the proposed hybrid method increased the accuracy of KELM on most datasets by reducing the dataset's dimensions, and achieved similar or superior performance compared to the other multi-objective methods. Furthermore, the proposed ensemble classifier model provided better classification accuracy and generalizability than conventional ensemble methods on seven of the nine microarray datasets. Moreover, a comparison of the ensemble classifier model with three state-of-the-art ensemble generation methods indicates competitive performance: the proposed ensemble model achieved better results on five of the nine datasets. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s00521-021-06459-9.
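
The abstract states that the outputs of five filter methods are combined with a voting-based function, but this record does not name the filters or the voting rule. The following is only a minimal sketch of one plausible rule, assuming each filter votes for its top-k genes and a gene is kept when it collects at least `min_votes` votes. The names `top_k`, `vote_select`, `filter_a`/`filter_b`/`filter_c`, and the toy scores are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring genes (ties go to the lower index)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def vote_select(filter_scores: Dict[str, Sequence[float]], k: int, min_votes: int) -> List[int]:
    """Keep genes that at least `min_votes` filters rank within their top k.

    Assumed voting rule for illustration only; the paper's function may differ.
    """
    votes: Counter = Counter()
    for scores in filter_scores.values():
        votes.update(top_k(scores, k))  # each filter casts one vote per top-k gene
    return sorted(gene for gene, n in votes.items() if n >= min_votes)


# Toy relevance scores for 6 genes from three hypothetical filters (higher = more relevant).
toy_scores = {
    "filter_a": [0.9, 0.1, 0.8, 0.2, 0.7, 0.0],
    "filter_b": [0.8, 0.7, 0.1, 0.2, 0.9, 0.3],
    "filter_c": [0.2, 0.1, 0.9, 0.3, 0.8, 0.7],
}
selected = vote_select(toy_scores, k=3, min_votes=2)
print(selected)  # prints [0, 2, 4]
```

In this toy run, gene 4 is in the top three of all three filters and genes 0 and 2 are each in the top three of two filters, so those three survive a two-vote threshold. Raising `min_votes` makes the preprocessing step stricter and shrinks the gene subset passed on to a wrapper stage.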
format Online
Article
Text
id pubmed-8435304
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer London
record_format MEDLINE/PubMed
spelling pubmed-8435304 2021-09-13 Neural Comput Appl S.I. : ‘Babel Fish’ for Feature-driven Machine Learning to Maximise Societal Value Springer London 2021-09-12 2023 /pmc/articles/PMC8435304/ /pubmed/34539088 http://dx.doi.org/10.1007/s00521-021-06459-9 Text en © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title A novel bio-inspired hybrid multi-filter wrapper gene selection method with ensemble classifier for microarray data
topic S.I. : ‘Babel Fish’ for Feature-driven Machine Learning to Maximise Societal Value