A united model for diagnosing pulmonary tuberculosis with random forest and artificial neural network

Bibliographic Details
Main Authors:	Zhu, Qingqing; Liu, Jie
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2023
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10033863/ https://www.ncbi.nlm.nih.gov/pubmed/36968608 http://dx.doi.org/10.3389/fgene.2023.1094099

_version_	1784911080280031232
author	Zhu, Qingqing; Liu, Jie
author_facet	Zhu, Qingqing; Liu, Jie
author_sort	Zhu, Qingqing
collection	PubMed
description	Background: Pulmonary tuberculosis (PTB) is a chronic infectious disease and is the most common type of TB. Although the sputum smear test is a gold standard for diagnosing PTB, the method has numerous limitations, including low sensitivity, low specificity, and insufficient samples. Methods: The present study aimed to identify specific biomarkers of PTB and construct a model for diagnosing PTB by combining random forest (RF) and artificial neural network (ANN) algorithms. Two publicly available cohorts of TB, namely, the GSE83456 (training) and GSE42834 (validation) cohorts, were retrieved from the Gene Expression Omnibus (GEO) database. A total of 45 and 61 differentially expressed genes (DEGs) were identified between the PTB and control samples, respectively, by screening the GSE83456 cohort. An RF classifier was used for identifying specific biomarkers, following which an ANN-based classification model was constructed for identifying PTB samples. The accuracy of the ANN model was validated using the receiver operating characteristic (ROC) curve. The proportion of 22 types of immunocytes in the PTB samples was measured using the CIBERSORT algorithm, and the correlations between the immunocytes were determined. Results: Differential analysis revealed that 11 and 22 DEGs were upregulated and downregulated, respectively, and 11 biomarkers specific to PTB were identified by the RF classifier. The weights of these biomarkers were determined and an ANN-based classification model was subsequently constructed. The model exhibited outstanding performance, as revealed by the area under the curve (AUC), which was 1.000 for the training cohort. The AUC of the validation cohort was 0.946, which further confirmed the accuracy of the model. Conclusion: Altogether, the present study successfully identified specific genetic biomarkers of PTB and constructed a highly accurate model for the diagnosis of PTB based on blood samples. 
The model developed herein can serve as a reliable reference for the early detection of PTB and provide novel perspectives into the pathogenesis of PTB.
format	Online Article Text
id	pubmed-10033863
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-10033863 2023-03-24 A united model for diagnosing pulmonary tuberculosis with random forest and artificial neural network Zhu, Qingqing; Liu, Jie Front Genet Genetics Frontiers Media S.A. 2023-03-09 /pmc/articles/PMC10033863/ /pubmed/36968608 http://dx.doi.org/10.3389/fgene.2023.1094099 Text en Copyright © 2023 Zhu and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Genetics Zhu, Qingqing Liu, Jie A united model for diagnosing pulmonary tuberculosis with random forest and artificial neural network
title	A united model for diagnosing pulmonary tuberculosis with random forest and artificial neural network
title_sort	united model for diagnosing pulmonary tuberculosis with random forest and artificial neural network
topic	Genetics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10033863/ https://www.ncbi.nlm.nih.gov/pubmed/36968608 http://dx.doi.org/10.3389/fgene.2023.1094099
work_keys_str_mv	AT zhuqingqing aunitedmodelfordiagnosingpulmonarytuberculosiswithrandomforestandartificialneuralnetwork AT liujie aunitedmodelfordiagnosingpulmonarytuberculosiswithrandomforestandartificialneuralnetwork AT zhuqingqing unitedmodelfordiagnosingpulmonarytuberculosiswithrandomforestandartificialneuralnetwork AT liujie unitedmodelfordiagnosingpulmonarytuberculosiswithrandomforestandartificialneuralnetwork
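
The description above reports the area under the ROC curve (AUC) as the performance measure for the ANN model. As a reading aid only, the following minimal sketch shows one standard way to compute AUC from class labels and classifier scores, using the rank-sum (Mann-Whitney) identity. It is not the authors' code: the function name roc_auc and the toy labels and scores are hypothetical and are not taken from the GSE83456 or GSE42834 cohorts.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    AUC equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with tied scores counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): 1 = PTB, 0 = control;
# the scores are invented classifier outputs, not study results.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

An AUC of 1.000, as reported for the training cohort, corresponds to every positive sample scoring above every negative sample. The pairwise loop is quadratic in sample count, which is acceptable for a small illustration; a sort-based rank computation would be the usual choice for larger cohorts.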
