Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models

BACKGROUND: Nodular thyroid disease is by far the most common thyroid disease and is closely associated with the development of thyroid cancer. Coal miners with chronic coal dust exposure are at higher risk of developing nodular thyroid disease. There are few studies that use machine learning models...

Bibliographic Details
Main authors: Zhao, Feng; Zhang, Hongzhen; Cheng, Danqing; Wang, Wenping; Li, Yongtian; Wang, Yisong; Lu, Dekun; Dong, Chunhui; Ren, Dingfei; Yang, Lixin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9732087/
https://www.ncbi.nlm.nih.gov/pubmed/36507527
http://dx.doi.org/10.3389/fmed.2022.1037944
author Zhao, Feng
Zhang, Hongzhen
Cheng, Danqing
Wang, Wenping
Li, Yongtian
Wang, Yisong
Lu, Dekun
Dong, Chunhui
Ren, Dingfei
Yang, Lixin
author_sort Zhao, Feng
collection PubMed
description BACKGROUND: Nodular thyroid disease is by far the most common thyroid disease and is closely associated with the development of thyroid cancer. Coal miners with chronic coal dust exposure are at higher risk of developing nodular thyroid disease. Few studies have used machine learning models to predict the occurrence of nodular thyroid disease in coal miners. The aim of this study was to predict a high risk of nodular thyroid disease in coal miners using five different machine learning (ML) models. METHODS: In this retrospective clinical study, 1,708 coal miners examined at the Huaihe Energy Occupational Disease Control Hospital in Anhui Province in April 2021 were selected, and their clinical physical examination data, including general information, laboratory tests, and imaging findings, were collected. The synthetic minority oversampling technique (SMOTE) was used for sample balancing, and the data set was randomly split into training and test sets in a ratio of 8:2. Lasso regression and a correlation heat map were used to screen the predictors. Five ML models, Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), Gaussian Naive Bayes (GNB), a Multilayer Perceptron neural network (MLP), and Complement Naive Bayes (CNB), were compared for predictive efficacy, and the model with the highest AUC was selected as the optimal model for predicting the occurrence of nodular thyroid disease in coal miners. RESULTS: Lasso regression analysis identified Age, H-DLC, HCT, MCH, PLT, and GGT as predictor variables for the ML models, and the heat map showed no significant correlation among these six variables. For the prediction of nodular thyroid disease, the AUCs of the five ML models were XGBoost (0.892), LR (0.577), GNB (0.603), MLP (0.601), and CNB (0.543). The XGBoost model had the largest AUC and can be applied in clinical practice.
CONCLUSION: In this study, all five ML models were able to predict the risk of nodular thyroid disease in coal miners, with the XGBoost model showing the best overall predictive performance. The model can assist clinicians in quickly and accurately predicting the occurrence of nodular thyroid disease in coal miners and in adopting individualized clinical prevention and treatment strategies.
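The balancing and model-selection steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names `roc_auc` and `smote_like` are hypothetical, the toy labels, scores, and points are invented for illustration, and `smote_like` is a simplified stand-in for SMOTE (interpolation between a minority sample and one of its nearest minority neighbours). Only the five AUC values are taken from the abstract.

```python
import random


def roc_auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    case receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def smote_like(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling (an assumption, not the
    study's implementation): each synthetic point lies on the segment
    between a minority point and one of its k nearest minority
    neighbours. Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the remaining minority points by squared distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy data (invented) to exercise the two helpers.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)

toy_minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0), (2.5, 2.5)]
toy_synthetic = smote_like(toy_minority, n_new=5)

# AUC values as reported in the abstract; the selection rule is "highest AUC".
reported_auc = {"XGBoost": 0.892, "LR": 0.577, "GNB": 0.603, "MLP": 0.601, "CNB": 0.543}
best_model = max(reported_auc, key=reported_auc.get)
```

In practice a study like this would more likely rely on library implementations of oversampling, model fitting, and AUC; the sketch only shows what each step computes.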
format Online
Article
Text
id pubmed-9732087
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9732087 2022-12-10 Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models Zhao, Feng; Zhang, Hongzhen; Cheng, Danqing; Wang, Wenping; Li, Yongtian; Wang, Yisong; Lu, Dekun; Dong, Chunhui; Ren, Dingfei; Yang, Lixin Front Med (Lausanne) Medicine Frontiers Media S.A. 2022-11-25 /pmc/articles/PMC9732087/ /pubmed/36507527 http://dx.doi.org/10.3389/fmed.2022.1037944 Text en Copyright © 2022 Zhao, Zhang, Cheng, Wang, Li, Wang, Lu, Dong, Ren and Yang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models
topic Medicine
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9732087/
https://www.ncbi.nlm.nih.gov/pubmed/36507527
http://dx.doi.org/10.3389/fmed.2022.1037944