Establishment of a SVM classifier to predict recurrence of ovarian cancer
Gene expression data using retrieved ovarian cancer (OC) samples were used to identify genes of interest and a support vector machine (SVM) classifier was subsequently established to predict the recurrence of OC. Three datasets (GSE17260, GSE44104 and GSE51088) investigating OC gene expression were...
Main authors: | Zhou, Jinting; Li, Lin; Wang, Liling; Li, Xiaofang; Xing, Hui; Cheng, Li |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | D.A. Spandidos 2018 |
Subjects: | Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131358/ https://www.ncbi.nlm.nih.gov/pubmed/30106117 http://dx.doi.org/10.3892/mmr.2018.9362 |
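
The description for this record mentions a support vector machine classifier built on feature genes chosen by recursive feature elimination. As a rough illustration only, and not the authors' implementation, the sketch below trains a small linear SVM by sub-gradient descent on the hinge loss and repeatedly drops the feature with the smallest absolute weight. The function names, the hyperparameters (`lam`, `lr`, `epochs`) and the four-sample toy matrix are all assumptions made up for this example.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch sub-gradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # only margin violators contribute
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def rfe(X, y, n_keep):
    """Recursive feature elimination: retrain, drop the smallest-|weight| feature."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_active, y)
        worst = min(range(len(active)), key=lambda i: abs(w[i]))
        del active[worst]
    return active


# Hypothetical toy data: column 0 separates the classes, column 1 is
# constant, column 2 is noise balanced across both classes.
X = [[2.0, 0.0, 0.1], [3.0, 0.0, -0.1], [-2.0, 0.0, 0.1], [-3.0, 0.0, -0.1]]
y = [1, 1, -1, -1]  # +1 = recurrent, -1 = non-recurrent (labels assumed)

selected = rfe(X, y, n_keep=1)
X_sel = [[row[j] for j in selected] for row in X]
w, b = train_linear_svm(X_sel, y)
# Accuracy on the training samples themselves, not on held-out data.
accuracy = sum(predict(w, b, x) == t for x, t in zip(X_sel, y)) / len(y)
print(selected, accuracy)
```

On real expression data one would normally use an established SVM library and evaluate on independent datasets, as the record describes for GSE44104, GSE51088 and TCGA; the toy accuracy here is measured on the training samples and says nothing about generalisation.
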
_version_ | 1783354087448772608 |
---|---|
author | Zhou, Jinting Li, Lin Wang, Liling Li, Xiaofang Xing, Hui Cheng, Li |
author_sort | Zhou, Jinting |
collection | PubMed |
description | Gene expression data using retrieved ovarian cancer (OC) samples were used to identify genes of interest and a support vector machine (SVM) classifier was subsequently established to predict the recurrence of OC. Three datasets (GSE17260, GSE44104 and GSE51088) investigating OC gene expression were downloaded from the Gene Expression Omnibus. Differentially expressed genes (DEGs) in samples from patients with non-recurrent and recurrent OC were revealed via a homogeneity test and quality control analysis. A protein-protein interaction (PPI) network was subsequently established for the DEGs using data from Biological General Repository for Interaction Datasets, Human Protein Reference Database and Database of Interacting Proteins. Degrees of interaction and betweenness centrality (BC) scores were calculated for each node in the PPI network. The top 100 genes ranked by BC scores were selected to identify feature genes via recursive feature elimination using the GSE17260 dataset. Following this, a SVM classifier was constructed and further validated using the GSE44104 and GSE51088 datasets and independent gene expression data obtained from the Cancer Genome Atlas (TCGA). A total of 639 DEGs were identified from the three gene expression datasets, and a PPI network including 249 nodes and 354 edges was constructed. A SVM classifier consisting of 39 feature genes (including cullin 3, mouse double minute 2 homolog, aurora kinase A, WW domain containing oxidoreductase, large tumor suppressor kinase 2, sirtuin 6, staphylococcal nuclease and tudor domain containing 1, leucine rich repeats and immunoglobulin like domains 1 and aurora kinase 1 interacting protein 1) was subsequently constructed. The prediction accuracies of the SVM classifier for GSE17260, GSE44104 and GSE51088 datasets as well as data downloaded from TCGA were revealed to be 92.7, 93.3, 96.6 and 90.4%, respectively. Furthermore, the results of the present study revealed that patients with predicted non-recurrent OC survived significantly longer compared with the patients with predicted recurrent OC (P=6.598×10^−6). A SVM classifier consisting of 39 feature genes was established for predicting the recurrence and prognosis of OC. Therefore, the results of the present study suggested that the 39 feature genes may serve important roles in the development of OC and may represent therapeutic biomarkers of OC. |
format | Online Article Text |
id | pubmed-6131358 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | D.A. Spandidos |
record_format | MEDLINE/PubMed |
spelling | pubmed-6131358 2018-09-14 Establishment of a SVM classifier to predict recurrence of ovarian cancer Zhou, Jinting Li, Lin Wang, Liling Li, Xiaofang Xing, Hui Cheng, Li Mol Med Rep Articles D.A. Spandidos 2018-10 2018-08-08 /pmc/articles/PMC6131358/ /pubmed/30106117 http://dx.doi.org/10.3892/mmr.2018.9362 Text en Copyright: © Zhou et al. This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made. |
spellingShingle | Articles Zhou, Jinting Li, Lin Wang, Liling Li, Xiaofang Xing, Hui Cheng, Li Establishment of a SVM classifier to predict recurrence of ovarian cancer |
title | Establishment of a SVM classifier to predict recurrence of ovarian cancer |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6131358/ https://www.ncbi.nlm.nih.gov/pubmed/30106117 http://dx.doi.org/10.3892/mmr.2018.9362 |
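
The description field above states that degree and betweenness centrality (BC) scores were calculated for each node of the PPI network and that the top-ranked genes were carried forward. A minimal sketch of that ranking step follows, assuming an undirected, unweighted network and using Brandes' algorithm for BC; the record does not say which tool the authors used, and the five-node network and the helper name `top_k_by_bc` are hypothetical.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalised betweenness for an undirected, unweighted graph (Brandes)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s: count shortest paths (sigma) and
        # record each node's predecessors on those paths.
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was counted once from each endpoint.
    return {v: score / 2.0 for v, score in bc.items()}


def top_k_by_bc(adj, k):
    """Rank nodes by descending BC (ties broken by name) and keep the first k."""
    scores = betweenness_centrality(adj)
    degrees = {v: len(neighbours) for v, neighbours in adj.items()}
    ranked = sorted(adj, key=lambda v: (-scores[v], v))
    return ranked[:k], scores, degrees


# Hypothetical toy network; node labels are placeholders, not real genes.
toy_ppi = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
top2, bc_scores, degree = top_k_by_bc(toy_ppi, 2)
print(top2, bc_scores, degree)
```

In the study described by this record the same idea would apply to the 249-node network with k = 100; the tie-breaking rule used here is an arbitrary choice for reproducibility.
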