
Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data

BACKGROUND: Extracting relevant information from microarray data is a very complex task due to the characteristics of the data sets, as they comprise a large number of features while few samples are generally available. In this sense, feature selection is a very important aspect of the analysis help...


Bibliographic Details
Main Authors: Luque-Baena, Rafael Marcos; Urda, Daniel; Subirats, Jose Luis; Franco, Leonardo; Jerez, Jose M
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4108856/
https://www.ncbi.nlm.nih.gov/pubmed/25077572
http://dx.doi.org/10.1186/1742-4682-11-S1-S7
_version_ 1782327797700624384
author Luque-Baena, Rafael Marcos
Urda, Daniel
Subirats, Jose Luis
Franco, Leonardo
Jerez, Jose M
author_facet Luque-Baena, Rafael Marcos
Urda, Daniel
Subirats, Jose Luis
Franco, Leonardo
Jerez, Jose M
author_sort Luque-Baena, Rafael Marcos
collection PubMed
description BACKGROUND: Extracting relevant information from microarray data is a very complex task due to the characteristics of the data sets, as they comprise a large number of features while few samples are generally available. In this sense, feature selection is a very important aspect of the analysis, helping to identify relevant genes and to maximize predictive information. METHODS: Due to its simplicity and speed, Stepwise Forward Selection (SFS) is a widely used feature selection technique. In this work, we carry out a comparative study of SFS and Genetic Algorithms (GA) as general frameworks for the analysis of microarray data, with the aim of identifying groups of genes with high predictive capability and biological relevance. Six standard and machine learning-based techniques (Linear Discriminant Analysis (LDA), Support Vector Machines (SVM), Naive Bayes (NB), C-MANTEC Constructive Neural Network, K-Nearest Neighbors (kNN) and Multilayer Perceptron (MLP)) are used within both frameworks, using six freely available public datasets for the task of predicting cancer outcome. RESULTS: Better cancer outcome prediction results were obtained using the GA framework, noting that this approach, in comparison to the SFS one, leads to a larger selection set, uses a large number of comparisons between genetic profiles and is thus computationally more intensive. The GA framework also yielded a set of genes that can be considered more biologically relevant. Regarding the different classifiers used, standard feedforward neural networks (MLP), LDA and SVM led to similar results, which were the best obtained, while C-MANTEC and kNN followed closely but with lower accuracy. Further, C-MANTEC, MLP and LDA yielded a smaller set of genes in comparison to SVM, NB and kNN, and C-MANTEC in particular proved to be the most robust classifier with respect to changes in the parameter settings.
CONCLUSIONS: This study shows that if prediction accuracy is the objective, the GA-based approach leads to better results than the SFS approach, independently of the classifier used. Regarding classifiers, even though C-MANTEC did not achieve the best overall results, its performance was competitive, with very robust behaviour with respect to the parameters of the algorithm, and thus it can be considered a candidate technique for future studies.
format Online
Article
Text
id pubmed-4108856
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4108856 2014-08-04 Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data Luque-Baena, Rafael Marcos Urda, Daniel Subirats, Jose Luis Franco, Leonardo Jerez, Jose M Theor Biol Med Model Research BACKGROUND: Extracting relevant information from microarray data is a very complex task due to the characteristics of the data sets, as they comprise a large number of features while few samples are generally available. In this sense, feature selection is a very important aspect of the analysis, helping to identify relevant genes and to maximize predictive information. METHODS: Due to its simplicity and speed, Stepwise Forward Selection (SFS) is a widely used feature selection technique. In this work, we carry out a comparative study of SFS and Genetic Algorithms (GA) as general frameworks for the analysis of microarray data, with the aim of identifying groups of genes with high predictive capability and biological relevance. Six standard and machine learning-based techniques (Linear Discriminant Analysis (LDA), Support Vector Machines (SVM), Naive Bayes (NB), C-MANTEC Constructive Neural Network, K-Nearest Neighbors (kNN) and Multilayer Perceptron (MLP)) are used within both frameworks, using six freely available public datasets for the task of predicting cancer outcome. RESULTS: Better cancer outcome prediction results were obtained using the GA framework, noting that this approach, in comparison to the SFS one, leads to a larger selection set, uses a large number of comparisons between genetic profiles and is thus computationally more intensive. The GA framework also yielded a set of genes that can be considered more biologically relevant. Regarding the different classifiers used, standard feedforward neural networks (MLP), LDA and SVM led to similar results, which were the best obtained, while C-MANTEC and kNN followed closely but with lower accuracy.
Further, C-MANTEC, MLP and LDA yielded a smaller set of genes in comparison to SVM, NB and kNN, and C-MANTEC in particular proved to be the most robust classifier with respect to changes in the parameter settings. CONCLUSIONS: This study shows that if prediction accuracy is the objective, the GA-based approach leads to better results than the SFS approach, independently of the classifier used. Regarding classifiers, even though C-MANTEC did not achieve the best overall results, its performance was competitive, with very robust behaviour with respect to the parameters of the algorithm, and thus it can be considered a candidate technique for future studies. BioMed Central 2014-05-07 /pmc/articles/PMC4108856/ /pubmed/25077572 http://dx.doi.org/10.1186/1742-4682-11-S1-S7 Text en Copyright © 2014 Luque-Baena et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Luque-Baena, Rafael Marcos
Urda, Daniel
Subirats, Jose Luis
Franco, Leonardo
Jerez, Jose M
Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data
title Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data
title_full Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data
title_fullStr Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data
title_full_unstemmed Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data
title_short Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data
title_sort application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4108856/
https://www.ncbi.nlm.nih.gov/pubmed/25077572
http://dx.doi.org/10.1186/1742-4682-11-S1-S7
work_keys_str_mv AT luquebaenarafaelmarcos applicationofgeneticalgorithmsandconstructiveneuralnetworksfortheanalysisofmicroarraycancerdata
AT urdadaniel applicationofgeneticalgorithmsandconstructiveneuralnetworksfortheanalysisofmicroarraycancerdata
AT subiratsjoseluis applicationofgeneticalgorithmsandconstructiveneuralnetworksfortheanalysisofmicroarraycancerdata
AT francoleonardo applicationofgeneticalgorithmsandconstructiveneuralnetworksfortheanalysisofmicroarraycancerdata
AT jerezjosem applicationofgeneticalgorithmsandconstructiveneuralnetworksfortheanalysisofmicroarraycancerdata