
Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine

BACKGROUND: Gene expression data are characteristically high dimensional, with a small sample size relative to the number of features, and the variability inherent in biological processes adds to the difficulty of analysis. Selecting highly discriminative features decreases the computational cost...


Bibliographic Details
Main Authors: Moteghaed, Niloofar Yousefi, Maghooli, Keivan, Garshasbi, Masoud
Format: Online Article Text
Language: English
Published: Medknow Publications & Media Pvt Ltd 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5840891/
https://www.ncbi.nlm.nih.gov/pubmed/29535919
_version_ 1783304661025947648
author Moteghaed, Niloofar Yousefi
Maghooli, Keivan
Garshasbi, Masoud
collection PubMed
description BACKGROUND: Gene expression data are characteristically high dimensional, with a small sample size relative to the number of features, and the variability inherent in biological processes adds to the difficulty of analysis. Selecting highly discriminative features decreases the computational cost and complexity of the classifier and improves its reliability when predicting the class of new samples. METHODS: The present study used hybrid particle swarm optimization and genetic algorithms for gene selection and a fuzzy support vector machine (SVM) as the classifier. Fuzzy logic is used to infer the importance of each sample in the training phase and to decrease the outlier sensitivity of the system, which increases the classifier's ability to generalize. A decision-tree algorithm was applied to the most frequent genes to develop a set of rules for each type of cancer. This improved the abilities of the algorithm by finding the best parameters for the classifier during the training phase without the need for trial-and-error by the user. The proposed approach was tested on four benchmark gene expression profiles. RESULTS: Good results have been demonstrated for the proposed algorithm. The classification accuracy is 100% for the leukemia data, 96.67% for colon cancer, and 98% for breast cancer. The results show that the best kernel for training the SVM classifier is the radial basis function. CONCLUSIONS: The experimental results show that the proposed algorithm can decrease the dimensionality of the dataset, determine the most informative gene subset, and improve classification accuracy using the optimal parameters of the classifier without user intervention.
format Online
Article
Text
id pubmed-5840891
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Medknow Publications & Media Pvt Ltd
record_format MEDLINE/PubMed
spelling pubmed-5840891 2018-03-13 Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine Moteghaed, Niloofar Yousefi Maghooli, Keivan Garshasbi, Masoud J Med Signals Sens Original Article
Medknow Publications & Media Pvt Ltd 2018 /pmc/articles/PMC5840891/ /pubmed/29535919 Text en Copyright: © 2018 Journal of Medical Signals & Sensors http://creativecommons.org/licenses/by-nc-sa/3.0 This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms.
title Improving Classification of Cancer and Mining Biomarkers from Gene Expression Profiles Using Hybrid Optimization Algorithms and Fuzzy Support Vector Machine
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5840891/
https://www.ncbi.nlm.nih.gov/pubmed/29535919
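
Illustrative note on the abstract's fuzzy SVM step: the record says only that fuzzy logic is used to infer the importance of each training sample and to reduce outlier sensitivity. It does not state which membership function the authors used. The sketch below assumes one common choice, a distance-to-class-centroid membership, purely for illustration; the function name `fuzzy_memberships`, the parameter `delta`, and the toy data are hypothetical and do not come from the article.

```python
import math


def fuzzy_memberships(samples, labels, delta=1e-6):
    """Assumed membership rule (not taken from the article):
    weight = 1 - distance_to_class_centroid / (class_radius + delta).
    Samples near their class centroid get weights near 1; the sample
    farthest from the centroid in each class gets a weight near 0."""
    # Group sample indices by class label.
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    weights = [0.0] * len(samples)
    for idx in by_class.values():
        dim = len(samples[idx[0]])
        # Class centroid: per-feature mean over the class members.
        centroid = [sum(samples[i][k] for i in idx) / len(idx) for k in range(dim)]
        # Euclidean distance of each member to its class centroid.
        dist = {i: math.dist(samples[i], centroid) for i in idx}
        # Class radius: the largest member-to-centroid distance.
        radius = max(dist.values())
        for i in idx:
            weights[i] = 1.0 - dist[i] / (radius + delta)
    return weights


# Toy data (hypothetical): class 0 contains one obvious outlier at index 3.
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [5.0, 5.0],
     [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
y = [0, 0, 0, 1 - 1, 1, 1, 1]
weights = fuzzy_memberships(X, y)
```

Weights of this kind could then be supplied as per-sample weights to an SVM trainer that accepts them, for example the `sample_weight` argument of scikit-learn's `SVC.fit`, so that low-membership samples contribute less to the margin penalty. Note that with a small `delta` the farthest sample in every class is weighted close to zero whether or not it is a true outlier; a larger `delta` raises that floor.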