Using Protein Interaction Database and Support Vector Machines to Improve Gene Signatures for Prediction of Breast Cancer Recurrence

Numerous studies used microarray gene expression data to extract metastasis-driving gene signatures for the prediction of breast cancer relapse. However, the accuracy and generality of the previously introduced biomarkers are not acceptable for reliable usage in independent datasets. This inadequacy...

Bibliographic Details
Main Authors: Sehhati, Mohammad Reza, Dehnavi, Alireza Mehri, Rabbani, Hossein, Javanmard, Shaghayegh Haghjoo
Format: Online Article Text
Language: English
Published: Medknow Publications & Media Pvt Ltd 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3788198/
https://www.ncbi.nlm.nih.gov/pubmed/24098862
author Sehhati, Mohammad Reza
Dehnavi, Alireza Mehri
Rabbani, Hossein
Javanmard, Shaghayegh Haghjoo
collection PubMed
description Numerous studies used microarray gene expression data to extract metastasis-driving gene signatures for the prediction of breast cancer relapse. However, the accuracy and generality of the previously introduced biomarkers are not acceptable for reliable usage in independent datasets. This inadequacy is attributed to ignoring gene interactions by simple feature selection methods, due to their computational burden. In this study, an integrated approach with low computational cost was proposed for identifying a more predictive gene signature, for prediction of breast cancer recurrence. First, a small set of genes was primarily selected as signature by an appropriate filter feature selection (FFS) method. Then, a binary sub-class of protein-protein interaction (PPI) network was used to expand the primary set by adding adjacent proteins of each gene signature from the PPI-network. Subsequently, the support vector machine-based recursive feature elimination (SVMRFE) method was applied to the expression level of all the genes in the expanded set. Finally, the genes with the highest score by SVMRFE were selected as the new biomarkers. Accuracy of the final selected biomarkers was evaluated to classify four datasets on breast cancer patients, including 800 cases, into two cohorts of poor and good prognosis. The results of the five-fold cross validation test, using the support vector machine as a classifier, showed more than 13% improvement in the average accuracy, after modifying the primary selected signatures. Moreover, the method used in this study showed a lower computational cost compared to the other PPI-based methods. The proposed method demonstrated more robust and accurate biomarkers using the PPI network, at a low computational cost. This approach could be used as a supplementary procedure in microarray studies after applying various gene selection methods.
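The expansion step described above (adding the PPI-network neighbours of each primary signature gene before SVMRFE ranking) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `build_adjacency` and `expand_signature`, the edge-list input format, the optional `measured` filter, and the toy gene names are all assumptions made for the example.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def expand_signature(primary, edges, measured=None):
    """Return the primary genes plus their direct PPI neighbours, sorted.

    primary  -- genes chosen by a filter feature selection step (assumed input)
    edges    -- binary interactions as (protein_a, protein_b) pairs (assumed format)
    measured -- optional collection of genes that have expression data; when
                given, genes outside it are dropped (an assumption of this sketch)
    """
    adjacency = build_adjacency(edges)
    expanded = set(primary)
    for gene in primary:
        # .get avoids creating empty entries for genes absent from the network
        expanded |= adjacency.get(gene, set())
    if measured is not None:
        measured = set(measured)
        expanded = {g for g in expanded if g in measured}
    return sorted(expanded)


# Toy data with hypothetical gene names, for illustration only.
toy_edges = [("G1", "G3"), ("G2", "G4"), ("G4", "G5"), ("G6", "G7")]
toy_primary = ["G1", "G2"]
print(expand_signature(toy_primary, toy_edges))
```

Only direct neighbours are added, so `G5` (two steps from `G2`) and the unrelated pair `G6`/`G7` stay out of the expanded set. The expanded list would then be the feature set handed to a recursive feature elimination step.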
format Online
Article
Text
id pubmed-3788198
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Medknow Publications & Media Pvt Ltd
record_format MEDLINE/PubMed
spelling pubmed-3788198 2013-10-04 J Med Signals Sens Original Article Medknow Publications & Media Pvt Ltd 2013 /pmc/articles/PMC3788198/ /pubmed/24098862 Text en Copyright: © Journal of Medical Signals and Sensors http://creativecommons.org/licenses/by-nc-sa/3.0 This is an open-access article distributed under the terms of the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Using Protein Interaction Database and Support Vector Machines to Improve Gene Signatures for Prediction of Breast Cancer Recurrence
topic Original Article