Feature Genes in Neuroblastoma Distinguishing High-Risk and Non-high-Risk Neuroblastoma Patients: Development and Validation Combining Random Forest With Artificial Neural Network

There is a significant difference in prognosis among different risk groups. Therefore, it is of great significance to correctly identify the risk grouping of children. Using the genomic data of neuroblastoma samples in public databases, we used GSE49710 as the training set data to calculate the feat...

Bibliographic Details
Main Authors: Yang, Sha; Zeng, Lingfeng; Jin, Xin; Lin, Huapeng; Song, Jianning
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Medicine
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9336509/
https://www.ncbi.nlm.nih.gov/pubmed/35911385
http://dx.doi.org/10.3389/fmed.2022.882348
author Yang, Sha
Zeng, Lingfeng
Jin, Xin
Lin, Huapeng
Song, Jianning
collection PubMed
description There is a significant difference in prognosis among different risk groups. Therefore, it is of great significance to correctly identify the risk grouping of children. Using the genomic data of neuroblastoma samples in public databases, we used GSE49710 as the training set data to calculate the feature genes of the high-risk group and non-high-risk group samples based on the random forest (RF) algorithm and artificial neural network (ANN) algorithm. The screening results of RF showed that EPS8L1, PLCD4, CHD5, NTRK1, and SLC22A4 were the feature differentially expressed genes (DEGs) of high-risk neuroblastoma. The prediction model based on gene expression data in this study showed high overall accuracy and precision in both the training set and the test set (AUC = 0.998 in GSE49710 and AUC = 0.858 in GSE73517). Kaplan–Meier plotter showed that the overall survival and progression-free survival of patients in the low-risk subgroup were significantly better than those in the high-risk subgroup [HR: 3.86 (95% CI: 2.44–6.10) and HR: 3.03 (95% CI: 2.03–4.52), respectively]. Our ANN-based model has better classification performance than the SVM-based model and the XGBoost-based model. Nevertheless, more convincing data sets and machine learning algorithms will be needed to build diagnostic models for individual organization types in the future.
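The AUC values quoted in the description summarize how well a model's risk scores separate the two groups. As a minimal sketch of that metric only (not the authors' pipeline), the AUC can be computed as the fraction of positive/negative pairs in which the positive sample receives the higher score. The function name `roc_auc` and the toy labels and scores below are hypothetical and are not taken from GSE49710 or GSE73517.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = high-risk, 0 = non-high-risk.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.92, 0.81, 0.40, 0.55, 0.30, 0.22, 0.10]
toy_auc = roc_auc(toy_labels, toy_scores)  # 11 of 12 pairs ranked correctly
```

An AUC of 1.0 means every high-risk sample is scored above every non-high-risk sample, while 0.5 corresponds to chance-level ranking. This pairwise form is quadratic in the number of samples, which is fine for illustration; rank-based implementations are the usual choice for larger cohorts.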
format Online
Article
Text
id pubmed-9336509
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9336509 2022-07-30 Front Med (Lausanne) Medicine Frontiers Media S.A. 2022-07-15 /pmc/articles/PMC9336509/ /pubmed/35911385 http://dx.doi.org/10.3389/fmed.2022.882348 Text en Copyright © 2022 Yang, Zeng, Jin, Lin and Song. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Feature Genes in Neuroblastoma Distinguishing High-Risk and Non-high-Risk Neuroblastoma Patients: Development and Validation Combining Random Forest With Artificial Neural Network
topic Medicine
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9336509/
https://www.ncbi.nlm.nih.gov/pubmed/35911385
http://dx.doi.org/10.3389/fmed.2022.882348