Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods

The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R la...

Bibliographic Details
Main authors: Tuo, Youlin; An, Ning; Zhang, Ming
Format: Online Article Text
Language: English
Published: D.A. Spandidos 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5802200/
https://www.ncbi.nlm.nih.gov/pubmed/29328377
http://dx.doi.org/10.3892/mmr.2018.8398
_version_ 1783298499018752000
author Tuo, Youlin
An, Ning
Zhang, Ming
author_facet Tuo, Youlin
An, Ning
Zhang, Ming
author_sort Tuo, Youlin
collection PubMed
description The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database and analyzed using the MetaQC and MetaDE packages in the R language. Feature genes differing between metastatic and non-metastatic samples were screened at a threshold of P<0.05. Based on the protein-protein interactions (PPIs) in the Biological General Repository for Interaction Datasets, Human Protein Reference Database and Biomolecular Interaction Network Database, the PPI network of the feature genes was constructed. The feature genes identified by topological characteristics were then used for support vector machine (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non-metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin-dependent kinase 2 (CDK2), myelocytomatosis proto-oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non-ATPase 2 and telomeric repeat binding factor 2. The cyclin-dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed from the top 30 feature genes was able to distinguish metastatic samples from non-metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non-metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle-associated functions and pathways were the most significant terms of the 30 feature genes. An SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several independent datasets. CDK2, CDKN1A, E2F1 and MYC were indicated as the potential feature genes in metastatic breast cancer.
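As a rough illustration of how the reported performance measures relate to one another, the following sketch computes accuracy, sensitivity, specificity and the predictive values from confusion-matrix counts. The function name and the per-class split of errors are assumptions made for illustration only; the abstract gives just the class sizes (35 metastatic, 143 non-metastatic) and the overall accuracy, so the counts below are hypothetical values chosen to be consistent with 168 of 178 samples classified correctly.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return common binary-classification metrics from confusion-matrix counts.

    tp/fn: metastatic samples classified correctly/incorrectly.
    tn/fp: non-metastatic samples classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical split of the 10 errors (NOT reported in the abstract):
# 35 metastatic = 30 correct + 5 missed; 143 non-metastatic = 138 correct + 5 wrong.
metrics = classification_metrics(tp=30, fn=5, tn=138, fp=5)
print(round(metrics["accuracy"] * 100, 2))  # 94.38
```

Any other split of the 10 misclassified samples gives the same accuracy but different sensitivity and specificity, which is why those measures are reported separately.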
format Online
Article
Text
id pubmed-5802200
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher D.A. Spandidos
record_format MEDLINE/PubMed
spelling pubmed-5802200 2018-02-26 Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods Tuo, Youlin An, Ning Zhang, Ming Mol Med Rep Articles The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database and analyzed using the MetaQC and MetaDE packages in the R language. Feature genes differing between metastatic and non-metastatic samples were screened at a threshold of P<0.05. Based on the protein-protein interactions (PPIs) in the Biological General Repository for Interaction Datasets, Human Protein Reference Database and Biomolecular Interaction Network Database, the PPI network of the feature genes was constructed. The feature genes identified by topological characteristics were then used for support vector machine (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non-metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin-dependent kinase 2 (CDK2), myelocytomatosis proto-oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non-ATPase 2 and telomeric repeat binding factor 2. The cyclin-dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed from the top 30 feature genes was able to distinguish metastatic samples from non-metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non-metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle-associated functions and pathways were the most significant terms of the 30 feature genes. An SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several independent datasets. CDK2, CDKN1A, E2F1 and MYC were indicated as the potential feature genes in metastatic breast cancer. D.A. Spandidos 2018-03 2018-01-09 /pmc/articles/PMC5802200/ /pubmed/29328377 http://dx.doi.org/10.3892/mmr.2018.8398 Text en Copyright: © Tuo et al. This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
spellingShingle Articles
Tuo, Youlin
An, Ning
Zhang, Ming
Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods
title Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods
title_full Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods
title_fullStr Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods
title_full_unstemmed Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods
title_short Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods
title_sort feature genes in metastatic breast cancer identified by metade and svm classifier methods
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5802200/
https://www.ncbi.nlm.nih.gov/pubmed/29328377
http://dx.doi.org/10.3892/mmr.2018.8398
work_keys_str_mv AT tuoyoulin featuregenesinmetastaticbreastcanceridentifiedbymetadeandsvmclassifiermethods
AT anning featuregenesinmetastaticbreastcanceridentifiedbymetadeandsvmclassifiermethods
AT zhangming featuregenesinmetastaticbreastcanceridentifiedbymetadeandsvmclassifiermethods