Feature Genes in Neuroblastoma Distinguishing High-Risk and Non-high-Risk Neuroblastoma Patients: Development and Validation Combining Random Forest With Artificial Neural Network
Prognosis differs significantly among neuroblastoma risk groups, so correctly identifying a child's risk group is of great importance. Drawing on genomic data from neuroblastoma samples in public databases, we used GSE49710 as the training set to identify the feat...
Main Authors: | Yang, Sha; Zeng, Lingfeng; Jin, Xin; Lin, Huapeng; Song, Jianning |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Medicine |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9336509/ https://www.ncbi.nlm.nih.gov/pubmed/35911385 http://dx.doi.org/10.3389/fmed.2022.882348 |
_version_ | 1784759554795372544 |
---|---|
author | Yang, Sha; Zeng, Lingfeng; Jin, Xin; Lin, Huapeng; Song, Jianning |
collection | PubMed |
description | Prognosis differs significantly among neuroblastoma risk groups, so correctly identifying a child's risk group is of great importance. Drawing on genomic data from neuroblastoma samples in public databases, we used GSE49710 as the training set to identify feature genes that separate high-risk from non-high-risk samples, using the random forest (RF) and artificial neural network (ANN) algorithms. RF screening identified EPS8L1, PLCD4, CHD5, NTRK1, and SLC22A4 as the feature differentially expressed genes (DEGs) of high-risk neuroblastoma. The prediction model built on these gene expression data showed high overall accuracy and precision in both the training set and the test set (AUC = 0.998 in GSE49710 and AUC = 0.858 in GSE73517). Kaplan–Meier plotter showed that overall survival and progression-free survival were significantly better in the low-risk subgroup than in the high-risk subgroup [HR: 3.86 (95% CI: 2.44–6.10) and HR: 3.03 (95% CI: 2.03–4.52), respectively]. Our ANN-based model showed better classification performance than the SVM-based and XGBoost-based models. Nevertheless, more convincing data sets and machine learning algorithms will be needed to build diagnostic models for individual organization types in the future. |
format | Online Article Text |
id | pubmed-9336509 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9336509 2022-07-30 Front Med (Lausanne) Medicine Frontiers Media S.A. 2022-07-15 /pmc/articles/PMC9336509/ /pubmed/35911385 http://dx.doi.org/10.3389/fmed.2022.882348 Text en Copyright © 2022 Yang, Zeng, Jin, Lin and Song. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Feature Genes in Neuroblastoma Distinguishing High-Risk and Non-high-Risk Neuroblastoma Patients: Development and Validation Combining Random Forest With Artificial Neural Network |
topic | Medicine |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9336509/ https://www.ncbi.nlm.nih.gov/pubmed/35911385 http://dx.doi.org/10.3389/fmed.2022.882348 |
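
The description in this record reports classifier performance as AUC (0.998 on GSE49710 and 0.858 on GSE73517). As a minimal sketch of what that metric measures, the snippet below computes the area under the ROC curve with the rank-comparison (Mann-Whitney) identity: the fraction of (high-risk, non-high-risk) sample pairs in which the high-risk sample receives the higher predicted score, with ties counted as half. The function name `roc_auc`, the labels, and the scores are all hypothetical illustration values, not data or code from the article, and this is not the authors' RF or ANN pipeline.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise rank comparison.

    labels: iterable of 1 (positive, e.g. high-risk) or 0 (negative).
    scores: predicted scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): 1 = high-risk, 0 = non-high-risk.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.92, 0.81, 0.40, 0.55, 0.30, 0.22, 0.10]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

With these toy values, 11 of the 12 positive/negative pairs are ordered correctly, so the sketch yields 11/12. The pairwise loop is quadratic in sample count; it is chosen here for readability rather than speed.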