
Classification of Microarray Gene Expression Data Using an Infiltration Tactics Optimization (ITO) Algorithm

A number of feature selection and classification techniques have been proposed in the literature, including parameter-free and parameter-based algorithms. The former are quick but may converge to local maxima, while the latter use dataset-specific parameter tuning for higher accuracy. However, hig...


Bibliographic Details
Main authors: Zahoor, Javed; Zafar, Kashif
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7397166/
https://www.ncbi.nlm.nih.gov/pubmed/32708429
http://dx.doi.org/10.3390/genes11070819
_version_ 1783565718133932032
author Zahoor, Javed
Zafar, Kashif
author_facet Zahoor, Javed
Zafar, Kashif
author_sort Zahoor, Javed
collection PubMed
description A number of feature selection and classification techniques have been proposed in the literature, including parameter-free and parameter-based algorithms. The former are quick but may converge to local maxima, while the latter use dataset-specific parameter tuning for higher accuracy. However, higher accuracy does not necessarily mean higher reliability of the model. Thus, generalized optimization remains a challenge open for further research. This paper presents a warzone-inspired “infiltration tactics” based optimization algorithm (ITO), not to be confused with the ITO algorithm based on the Itô process in the field of stochastic calculus. The proposed ITO algorithm combines parameter-free and parameter-based classifiers to produce a high-accuracy-high-reliability (HAHR) binary classifier. The algorithm produces results in two phases: (i) the Lightweight Infantry Group (LIG) converges quickly to find non-local maxima and produces comparable results (i.e., 70 to 88% accuracy); (ii) the Followup Team (FT) uses advanced tuning to enhance the baseline performance (i.e., 75 to 99%). Every soldier of the ITO army is a base model with its own independently chosen subset selection method, pre-processing, validation method, and classifier. The successful soldiers are combined through heterogeneous ensembles for optimal results. The proposed approach addresses a data scarcity problem, is flexible in the choice of heterogeneous base classifiers, and is able to produce HAHR models comparable to the established MAQC-II results.
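The two-phase idea in the description can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so everything below is an assumption made for illustration, including the synthetic data (`toy_data`), the choice of nearest centroid as the parameter-free model and k-nearest neighbours as the tuned model, random feature subsets as the "subset selection", the 0.6 validation threshold, and majority voting as the ensemble rule. All function names are hypothetical.

```python
import random
from statistics import mean

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest_centroid(X, y):
    """Parameter-free base model: predict the class whose mean vector is closest."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [mean(col) for col in zip(*rows)]
    return lambda x: min(cents, key=lambda c: sq_dist(x, cents[c]))

def knn(X, y, k=3):
    """Parameter-based base model: k-nearest neighbours, with k tuned in phase 2."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda p: sq_dist(x, p[0]))[:k]
        votes = [t for _, t in nearest]
        return max(set(votes), key=votes.count)
    return predict

def accuracy(predict, X, y):
    return mean(1.0 if predict(x) == t else 0.0 for x, t in zip(X, y))

def make_soldier(fit, cols, train, valid, **params):
    """One 'soldier': a base model trained on its own feature subset (assumed design)."""
    (Xtr, ytr), (Xva, yva) = train, valid
    model = fit([[row[c] for c in cols] for row in Xtr], ytr, **params)
    predict = lambda x: model([x[c] for c in cols])
    return predict, accuracy(predict, Xva, yva)

def ensemble(soldiers):
    """Heterogeneous ensemble by simple majority vote over the surviving soldiers."""
    def predict(x):
        votes = [p(x) for p, _ in soldiers]
        return max(set(votes), key=votes.count)
    return predict

def toy_data(n, n_genes=20, seed=0):
    """Synthetic stand-in for expression data: only the first 3 'genes' are informative."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        for g in range(3):
            row[g] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

def run_ito_sketch(seed=0, n_soldiers=8, threshold=0.6):
    rng = random.Random(seed)
    train, valid, test = toy_data(40, seed=1), toy_data(30, seed=2), toy_data(30, seed=3)
    n_genes = len(train[0][0])
    # Phase 1 (LIG-like): quick, parameter-free soldiers on random feature subsets.
    phase1 = [make_soldier(nearest_centroid, rng.sample(range(n_genes), 5), train, valid)
              for _ in range(n_soldiers)]
    # Phase 2 (FT-like): tuned soldiers; keep the best k per feature subset.
    phase2 = []
    for _ in range(n_soldiers):
        cols = rng.sample(range(n_genes), 5)
        candidates = [make_soldier(knn, cols, train, valid, k=k) for k in (1, 3, 5)]
        phase2.append(max(candidates, key=lambda s: s[1]))
    # Keep only 'successful' soldiers, falling back to the single best one.
    everyone = phase1 + phase2
    survivors = [s for s in everyone if s[1] >= threshold] or [max(everyone, key=lambda s: s[1])]
    final = ensemble(survivors)
    return {"n_survivors": len(survivors), "test_accuracy": accuracy(final, *test)}
```

Calling `run_ito_sketch()` returns a dictionary with the number of surviving base models and the held-out accuracy of their vote. The figures it produces on synthetic data say nothing about the accuracy ranges reported in the abstract.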
format Online
Article
Text
id pubmed-7397166
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7397166 2020-08-16 Classification of Microarray Gene Expression Data Using an Infiltration Tactics Optimization (ITO) Algorithm Zahoor, Javed Zafar, Kashif Genes (Basel) Article MDPI 2020-07-18 /pmc/articles/PMC7397166/ /pubmed/32708429 http://dx.doi.org/10.3390/genes11070819 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Classification of Microarray Gene Expression Data Using an Infiltration Tactics Optimization (ITO) Algorithm
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7397166/
https://www.ncbi.nlm.nih.gov/pubmed/32708429
http://dx.doi.org/10.3390/genes11070819