Prediction of lung tumor types based on protein attributes by machine learning algorithms

Early diagnosis of lung cancer and distinction between the tumor types, Small Cell Lung Cancer (SCLC) and Non-Small Cell Lung Cancer (NSCLC), are very important for increasing patient survival. Herein, we propose a diagnostic system based on sequence-derived structural and physicochemical...

Bibliographic Details
Main authors: Hosseinzadeh, Faezeh, KayvanJoo, Amir Hossein, Ebrahimi, Mansuor, Goliaei, Bahram
Format: Online Article Text
Language: English
Published: Springer International Publishing 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3710575/
https://www.ncbi.nlm.nih.gov/pubmed/23888262
http://dx.doi.org/10.1186/2193-1801-2-238
author Hosseinzadeh, Faezeh
KayvanJoo, Amir Hossein
Ebrahimi, Mansuor
Goliaei, Bahram
collection PubMed
description Early diagnosis of lung cancer and distinction between the tumor types, Small Cell Lung Cancer (SCLC) and Non-Small Cell Lung Cancer (NSCLC), are very important for increasing patient survival. Herein, we propose a diagnostic system based on sequence-derived structural and physicochemical attributes of the proteins involved in both types of tumors, using feature extraction, feature selection and prediction models. A total of 1497 protein attributes were computed, and important features were selected by 12 attribute weighting models. Machine learning models, consisting of seven SVM models, three ANN models and two NB models, were then applied to the original database and to the new datasets created by the attribute weighting models; model accuracies were calculated through 10-fold cross-validation and wrapper validation (the latter for SVM algorithms only). In line with our previous findings, dipeptide composition, autocorrelation and distribution descriptors were the most important protein features selected by bioinformatics tools. The algorithms' performance in lung cancer tumor type prediction increased when they were applied to datasets created by attribute weighting models rather than to the original dataset. Wrapper-Validation performed better than X-Validation; the best cancer type prediction resulted from the SVM and SVM Linear models (82%). The best ANN accuracy was obtained when the Neural Net model was applied to the SVM dataset (88%). This is the first report suggesting that the combination of protein features and attribute weighting models with machine learning algorithms can be effectively used to predict the type of lung cancer tumors (SCLC and NSCLC). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2193-1801-2-238) contains supplementary material, which is available to authorized users.
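Note: the description names dipeptide composition as one of the most important protein features. The following is a minimal sketch of how such a feature vector is commonly computed, not the authors' implementation. It assumes the usual definition (the fraction of each of the 400 ordered pairs of the 20 standard amino acids among the overlapping residue pairs of a sequence); the function name `dipeptide_composition` and the example sequence are hypothetical, and pairs containing non-standard residues are skipped as a simplifying assumption.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_composition(sequence):
    """Return a dict mapping each of the 400 dipeptides to its fraction
    among the overlapping residue pairs of `sequence`.

    Assumption: pairs containing a non-standard residue are ignored.
    """
    seq = sequence.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    # Slide a window of width 2 over the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Empty or too-short sequence: return an all-zero vector.
        return {pair: 0.0 for pair in counts}
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical 10-residue sequence, used only for illustration.
comp = dipeptide_composition("MKVLAAGLLK")
```

Each protein then contributes a fixed-length vector of 400 values, which is the kind of input that attribute weighting and classifiers such as SVMs operate on.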
format Online
Article
Text
id pubmed-3710575
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-3710575 2013-07-23 Prediction of lung tumor types based on protein attributes by machine learning algorithms Hosseinzadeh, Faezeh KayvanJoo, Amir Hossein Ebrahimi, Mansuor Goliaei, Bahram Springerplus Research Early diagnosis of lung cancer and distinction between the tumor types, Small Cell Lung Cancer (SCLC) and Non-Small Cell Lung Cancer (NSCLC), are very important for increasing patient survival. Herein, we propose a diagnostic system based on sequence-derived structural and physicochemical attributes of the proteins involved in both types of tumors, using feature extraction, feature selection and prediction models. A total of 1497 protein attributes were computed, and important features were selected by 12 attribute weighting models. Machine learning models, consisting of seven SVM models, three ANN models and two NB models, were then applied to the original database and to the new datasets created by the attribute weighting models; model accuracies were calculated through 10-fold cross-validation and wrapper validation (the latter for SVM algorithms only). In line with our previous findings, dipeptide composition, autocorrelation and distribution descriptors were the most important protein features selected by bioinformatics tools. The algorithms' performance in lung cancer tumor type prediction increased when they were applied to datasets created by attribute weighting models rather than to the original dataset. Wrapper-Validation performed better than X-Validation; the best cancer type prediction resulted from the SVM and SVM Linear models (82%). The best ANN accuracy was obtained when the Neural Net model was applied to the SVM dataset (88%). This is the first report suggesting that the combination of protein features and attribute weighting models with machine learning algorithms can be effectively used to predict the type of lung cancer tumors (SCLC and NSCLC).
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2193-1801-2-238) contains supplementary material, which is available to authorized users. Springer International Publishing 2013-05-24 /pmc/articles/PMC3710575/ /pubmed/23888262 http://dx.doi.org/10.1186/2193-1801-2-238 Text en © Hosseinzadeh et al.; licensee Springer. 2013 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Prediction of lung tumor types based on protein attributes by machine learning algorithms
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3710575/
https://www.ncbi.nlm.nih.gov/pubmed/23888262
http://dx.doi.org/10.1186/2193-1801-2-238