
Establishment of a SVM classifier to predict recurrence of ovarian cancer

Gene expression data from retrieved ovarian cancer (OC) samples were used to identify genes of interest, and a support vector machine (SVM) classifier was subsequently established to predict the recurrence of OC. Three datasets (GSE17260, GSE44104 and GSE51088) investigating OC gene expression were...


Bibliographic Details
Main Authors: Zhou, Jinting, Li, Lin, Wang, Liling, Li, Xiaofang, Xing, Hui, Cheng, Li
Format: Online Article Text
Language: English
Published: D.A. Spandidos 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131358/
https://www.ncbi.nlm.nih.gov/pubmed/30106117
http://dx.doi.org/10.3892/mmr.2018.9362
author Zhou, Jinting
Li, Lin
Wang, Liling
Li, Xiaofang
Xing, Hui
Cheng, Li
author_sort Zhou, Jinting
collection PubMed
description Gene expression data from retrieved ovarian cancer (OC) samples were used to identify genes of interest, and a support vector machine (SVM) classifier was subsequently established to predict the recurrence of OC. Three datasets (GSE17260, GSE44104 and GSE51088) investigating OC gene expression were downloaded from the Gene Expression Omnibus. Differentially expressed genes (DEGs) between samples from patients with non-recurrent and recurrent OC were identified via a homogeneity test and quality control analysis. A protein-protein interaction (PPI) network was subsequently established for the DEGs using data from the Biological General Repository for Interaction Datasets, the Human Protein Reference Database and the Database of Interacting Proteins. Degrees of interaction and betweenness centrality (BC) scores were calculated for each node in the PPI network. The top 100 genes ranked by BC score were selected, and feature genes were identified among them via recursive feature elimination using the GSE17260 dataset. Following this, an SVM classifier was constructed and further validated using the GSE44104 and GSE51088 datasets and independent gene expression data obtained from The Cancer Genome Atlas (TCGA). A total of 639 DEGs were identified from the three gene expression datasets, and a PPI network including 249 nodes and 354 edges was constructed. An SVM classifier consisting of 39 feature genes (including cullin 3, mouse double minute 2 homolog, aurora kinase A, WW domain containing oxidoreductase, large tumor suppressor kinase 2, sirtuin 6, staphylococcal nuclease and tudor domain containing 1, leucine rich repeats and immunoglobulin like domains 1 and aurora kinase 1 interacting protein 1) was subsequently constructed. The prediction accuracies of the SVM classifier for the GSE17260, GSE44104 and GSE51088 datasets and the TCGA data were 92.7, 93.3, 96.6 and 90.4%, respectively. Furthermore, patients with predicted non-recurrent OC survived significantly longer than patients with predicted recurrent OC (P=6.598×10⁻⁶). An SVM classifier consisting of 39 feature genes was thus established for predicting the recurrence and prognosis of OC. The results of the present study therefore suggest that the 39 feature genes may serve important roles in the development of OC and may represent therapeutic biomarkers of OC.
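The node-ranking step described in the abstract (degree and betweenness centrality for each node of the PPI network, then keeping the top-ranked genes) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the software used, the function names (`build_graph`, `betweenness_centrality`, `top_by_betweenness`) are hypothetical, and the five-node network uses placeholder labels (`G1` to `G5`) rather than real interaction data. Betweenness is computed with Brandes' algorithm for an unweighted, undirected graph.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness_centrality(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every node pair is visited from both ends.
    return {v: score / 2.0 for v, score in bc.items()}


def top_by_betweenness(edges, k):
    """Return the k nodes with the highest BC as (node, bc, degree) tuples."""
    adj = build_graph(edges)
    bc = betweenness_centrality(adj)
    ranked = sorted(bc, key=lambda v: (-bc[v], v))
    return [(v, bc[v], len(adj[v])) for v in ranked[:k]]


# Toy chain G1 - G2 - G3 - G4 - G5 (placeholder nodes, not real genes).
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5")]
for node, score, degree in top_by_betweenness(toy_edges, 3):
    print(node, score, degree)
```

In the study, the equivalent of `k` was 100, and the selected genes were then passed to recursive feature elimination; that later step is not shown here.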
format Online
Article
Text
id pubmed-6131358
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher D.A. Spandidos
record_format MEDLINE/PubMed
spelling pubmed-6131358 2018-09-14 Establishment of a SVM classifier to predict recurrence of ovarian cancer Zhou, Jinting Li, Lin Wang, Liling Li, Xiaofang Xing, Hui Cheng, Li Mol Med Rep Articles D.A. Spandidos 2018-10 2018-08-08 /pmc/articles/PMC6131358/ /pubmed/30106117 http://dx.doi.org/10.3892/mmr.2018.9362 Text en Copyright: © Zhou et al. This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
title Establishment of a SVM classifier to predict recurrence of ovarian cancer
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131358/
https://www.ncbi.nlm.nih.gov/pubmed/30106117
http://dx.doi.org/10.3892/mmr.2018.9362