Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary
Metastatic cervical carcinoma from unknown primary (MCCUP) accounts for 1–4% of all head and neck tumors, and identifying the primary site in MCCUP is challenging. The most common histopathological type of MCCUP is squamous cell carcinoma (SCC), and it remains difficult to identify the primary site...
Main authors: | Lu, Di; Jiang, Jianjun; Liu, Xiguang; Wang, He; Feng, Siyang; Shi, Xiaoshun; Wang, Zhizhi; Chen, Zhiming; Yan, Xuebin; Wu, Hua; Cai, Kaican |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7779672/ https://www.ncbi.nlm.nih.gov/pubmed/33408743 http://dx.doi.org/10.3389/fgene.2020.614823 |
_version_ | 1783631380450639872 |
---|---|
author | Lu, Di; Jiang, Jianjun; Liu, Xiguang; Wang, He; Feng, Siyang; Shi, Xiaoshun; Wang, Zhizhi; Chen, Zhiming; Yan, Xuebin; Wu, Hua; Cai, Kaican |
author_sort | Lu, Di |
collection | PubMed |
description | Metastatic cervical carcinoma from unknown primary (MCCUP) accounts for 1–4% of all head and neck tumors, and identifying its primary site is challenging. The most common histopathological type of MCCUP is squamous cell carcinoma (SCC), whose primary site remains difficult to identify pathologically. Novel and effective methods for determining the primary site in MCCUP are therefore needed. In the present study, RNA sequencing data for four types of SCC and for Pan-Cancer were obtained from The Cancer Genome Atlas (TCGA). After data pre-processing, differentially expressed genes (DEGs) were identified for each type. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses indicated that the significantly changed genes of the four types of SCC share many similar molecular functions and histological features. Three machine learning models, Random Forest (RF), support vector machine (SVM), and neural network (NN), built on ten genes to distinguish the four types of SCC, were then developed. In prediction tests, the RF model performed best on the external validation set, with an overall predictive accuracy of 88.2%, sensitivity of 88.71%, and specificity of 95.42%. The NN model ranked second, with an overall accuracy of 82.02%, sensitivity of 81.23%, and specificity of 93.04%. The SVM model ranked last, with an overall accuracy of 76.69%, sensitivity of 74.81%, and specificity of 90.84%. The analysis of similarities and differences among the four types of SCC, together with the new models for distinguishing them using informatics methods, sheds light on precision diagnosis of MCCUP in the future. |
format | Online Article Text |
id | pubmed-7779672 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7779672 2021-01-05 Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary. Lu, Di; Jiang, Jianjun; Liu, Xiguang; Wang, He; Feng, Siyang; Shi, Xiaoshun; Wang, Zhizhi; Chen, Zhiming; Yan, Xuebin; Wu, Hua; Cai, Kaican. Front Genet. Genetics. Frontiers Media S.A. 2020-12-21 /pmc/articles/PMC7779672/ /pubmed/33408743 http://dx.doi.org/10.3389/fgene.2020.614823 Text en Copyright © 2020 Lu, Jiang, Liu, Wang, Feng, Shi, Wang, Chen, Yan, Wu and Cai. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Machine Learning Models to Predict Primary Sites of Metastatic Cervical Carcinoma From Unknown Primary |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7779672/ https://www.ncbi.nlm.nih.gov/pubmed/33408743 http://dx.doi.org/10.3389/fgene.2020.614823 |
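
The description field above reports overall accuracy, sensitivity, and specificity for models that separate four classes. The record does not say how the per-class values were combined, so the sketch below assumes one common convention: one-vs-rest sensitivity and specificity per class, averaged with equal weight (macro averaging). The function name `multiclass_metrics` and the confusion matrix `toy_cm` are hypothetical and made up for illustration; they are not the study's code or data.

```python
from typing import Dict, List


def multiclass_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute overall accuracy plus macro-averaged sensitivity/specificity.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Assumes every class has at least one sample.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    sensitivities = []
    specificities = []
    for k in range(n):
        tp = cm[k][k]                              # class k predicted as k
        fn = sum(cm[k]) - tp                       # class k predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp  # other predicted as k
        tn = total - tp - fn - fp                  # everything else
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))

    return {
        "accuracy": correct / total,
        "sensitivity": sum(sensitivities) / n,  # macro average (assumption)
        "specificity": sum(specificities) / n,  # macro average (assumption)
    }


# Made-up 4x4 confusion matrix: rows are true classes, columns are predictions.
toy_cm = [
    [8, 1, 1, 0],
    [1, 7, 1, 1],
    [0, 1, 9, 0],
    [1, 0, 1, 8],
]

metrics = multiclass_metrics(toy_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With four classes, each one-vs-rest split has far more negatives than positives, which is one reason macro specificity tends to come out higher than macro sensitivity, as it does in the figures reported in the description.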