Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods
The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R la...
Main Authors: | Tuo, Youlin; An, Ning; Zhang, Ming |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | D.A. Spandidos 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5802200/ https://www.ncbi.nlm.nih.gov/pubmed/29328377 http://dx.doi.org/10.3892/mmr.2018.8398 |
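The abstract ranks genes in the protein-protein interaction network by betweenness centrality. As a minimal sketch of that idea (not the authors' pipeline, which the record does not describe in detail), the following computes unnormalized betweenness with Brandes' algorithm on a toy undirected graph. The only edges used are the three CDK2 interactions named in the abstract; the function names `build_adjacency` and `betweenness_centrality` are illustrative assumptions.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an undirected, unweighted graph."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from the source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            order.append(current)
            for neighbor in adjacency[current]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[current] + 1:
                    path_count[neighbor] += path_count[current]
                    predecessors[neighbor].append(current)
        # Accumulate pair dependencies from the farthest nodes inward.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {node: value / 2.0 for node, value in scores.items()}


# Toy network: only the CDK2 interactions stated in the abstract.
edges = [("CDK2", "CDKN1A"), ("CDK2", "E2F1"), ("CDK2", "MYC")]
scores = betweenness_centrality(build_adjacency(edges))
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], scores[ranked[0]])
```

In this toy graph CDK2 lies on every shortest path between the other three genes, so it ranks first. The real network in the study covers 541 feature genes, and the values there would differ.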
_version_ | 1783298499018752000 |
---|---|
author | Tuo, Youlin An, Ning Zhang, Ming |
author_facet | Tuo, Youlin An, Ning Zhang, Ming |
author_sort | Tuo, Youlin |
collection | PubMed |
description | The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R language. The feature genes between metastasis and non-metastasis samples were screened under the threshold of P<0.05. Based on the protein-protein interactions (PPIs) in the Biological General Repository for Interaction Datasets, Human Protein Reference Database and Biomolecular Interaction Network Database, the PPI network of the feature genes was constructed. The feature genes identified by topological characteristics were then used for support vector machine (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non-metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin-dependent kinase 2 (CDK2), myelocytomatosis proto-oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non-ATPase 2 and telomeric repeat binding factor 2. The cyclin-dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed by the top 30 feature genes was able to distinguish metastatic samples from non-metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non-metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle associated functions and pathways were the most significant terms of the 30 feature genes. A SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several independent datasets. CDK2, CDKN1A, E2F1 and MYC were indicated as the potential feature genes in metastatic breast cancer. |
format | Online Article Text |
id | pubmed-5802200 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | D.A. Spandidos |
record_format | MEDLINE/PubMed |
spelling | pubmed-5802200 2018-02-26 Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods Tuo, Youlin An, Ning Zhang, Ming Mol Med Rep Articles The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R language. The feature genes between metastasis and non-metastasis samples were screened under the threshold of P<0.05. Based on the protein-protein interactions (PPIs) in the Biological General Repository for Interaction Datasets, Human Protein Reference Database and Biomolecular Interaction Network Database, the PPI network of the feature genes was constructed. The feature genes identified by topological characteristics were then used for support vector machine (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non-metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin-dependent kinase 2 (CDK2), myelocytomatosis proto-oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non-ATPase 2 and telomeric repeat binding factor 2. The cyclin-dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed by the top 30 feature genes was able to distinguish metastatic samples from non-metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. The verification of the SVM classifier in an independent dataset (35 metastatic samples and 143 non-metastatic samples) revealed an accuracy of 94.38% and AUROC of 0.958. Cell cycle associated functions and pathways were the most significant terms of the 30 feature genes. A SVM classifier was constructed to assess the possibility of breast cancer metastasis, which presented high accuracy in several independent datasets. CDK2, CDKN1A, E2F1 and MYC were indicated as the potential feature genes in metastatic breast cancer. D.A. Spandidos 2018-03 2018-01-09 /pmc/articles/PMC5802200/ /pubmed/29328377 http://dx.doi.org/10.3892/mmr.2018.8398 Text en Copyright: © Tuo et al. This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made. |
spellingShingle | Articles Tuo, Youlin An, Ning Zhang, Ming Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods |
title | Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5802200/ https://www.ncbi.nlm.nih.gov/pubmed/29328377 http://dx.doi.org/10.3892/mmr.2018.8398 |
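The description field reports accuracy, sensitivity, specificity, positive predictive value and negative predictive value for the SVM classifier. The sketch below shows how these follow from a binary confusion matrix. The class sizes (35 metastatic, 143 non-metastatic) come from the abstract, and an accuracy of 94.38% on 178 samples corresponds to 168 correct calls. How the 10 errors split between the two classes is not reported, so the counts used here are hypothetical and chosen only to match those totals; `classifier_metrics` is an illustrative name.

```python
def classifier_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical split of the 10 errors: 5 false negatives and 5 false positives.
# Only the class totals (35 and 143) and the overall accuracy are from the abstract.
metrics = classifier_metrics(tp=30, fn=5, tn=138, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With any split of the 10 errors the accuracy stays at 168/178; the other four metrics depend on the split and should not be read as the study's values.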