Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models
BACKGROUND: Nodular thyroid disease is by far the most common thyroid disease and is closely associated with the development of thyroid cancer. Coal miners with chronic coal dust exposure are at higher risk of developing nodular thyroid disease. There are few studies that use machine learning models...
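Main authors: | Zhao, Feng; Zhang, Hongzhen; Cheng, Danqing; Wang, Wenping; Li, Yongtian; Wang, Yisong; Lu, Dekun; Dong, Chunhui; Ren, Dingfei; Yang, Lixin |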
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Medicine |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9732087/ https://www.ncbi.nlm.nih.gov/pubmed/36507527 http://dx.doi.org/10.3389/fmed.2022.1037944 |
_version_ | 1784846051664986112 |
---|---|
author | Zhao, Feng; Zhang, Hongzhen; Cheng, Danqing; Wang, Wenping; Li, Yongtian; Wang, Yisong; Lu, Dekun; Dong, Chunhui; Ren, Dingfei; Yang, Lixin |
author_facet | Zhao, Feng; Zhang, Hongzhen; Cheng, Danqing; Wang, Wenping; Li, Yongtian; Wang, Yisong; Lu, Dekun; Dong, Chunhui; Ren, Dingfei; Yang, Lixin |
author_sort | Zhao, Feng |
collection | PubMed |
description | BACKGROUND: Nodular thyroid disease is by far the most common thyroid disease and is closely associated with the development of thyroid cancer. Coal miners with chronic coal dust exposure are at higher risk of developing nodular thyroid disease. Few studies have used machine learning models to predict the occurrence of nodular thyroid disease in coal miners. The aim of this study was to predict a high risk of nodular thyroid disease in coal miners using five different machine learning (ML) models. METHODS: This retrospective clinical study included 1,708 coal miners who were examined at the Huaihe Energy Occupational Disease Control Hospital in Anhui Province in April 2021. Their clinical physical examination data, including general information, laboratory tests, and imaging findings, were collected. The synthetic minority oversampling technique (SMOTE) was used for sample balancing, and the data set was randomly split into training and test datasets in a ratio of 8:2. Lasso regression and a correlation heat map were used to screen the predictors of the models. Five ML models, Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), Gaussian Naive Bayes (GNB), Multilayer Perceptron neural network (MLP), and Complement Naive Bayes (CNB), were compared for their predictive efficacy, and the model with the highest AUC was selected as the optimal model for predicting the occurrence of nodular thyroid disease in coal miners. RESULTS: Lasso regression analysis identified age, H-DLC, HCT, MCH, PLT, and GGT as predictor variables for the ML models; in addition, the heat map showed no significant correlation among the six variables. In the prediction of nodular thyroid disease, the AUCs of the five ML models were XGBoost (0.892), LR (0.577), GNB (0.603), MLP (0.601), and CNB (0.543). The XGBoost model had the largest AUC and can be applied in clinical practice. CONCLUSION: In this research, all five ML models were found to predict the risk of nodular thyroid disease in coal miners, with the XGBoost model having the best overall predictive performance. The model can assist clinicians in quickly and accurately predicting the occurrence of nodular thyroid disease in coal miners and in adopting individualized clinical prevention and treatment strategies. |
format | Online Article Text |
id | pubmed-9732087 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9732087 2022-12-10 Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models Zhao, Feng Zhang, Hongzhen Cheng, Danqing Wang, Wenping Li, Yongtian Wang, Yisong Lu, Dekun Dong, Chunhui Ren, Dingfei Yang, Lixin Front Med (Lausanne) Medicine Frontiers Media S.A. 2022-11-25 /pmc/articles/PMC9732087/ /pubmed/36507527 http://dx.doi.org/10.3389/fmed.2022.1037944 Text en Copyright © 2022 Zhao, Zhang, Cheng, Wang, Li, Wang, Lu, Dong, Ren and Yang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Medicine Zhao, Feng Zhang, Hongzhen Cheng, Danqing Wang, Wenping Li, Yongtian Wang, Yisong Lu, Dekun Dong, Chunhui Ren, Dingfei Yang, Lixin Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models |
title | Predicting the risk of nodular thyroid disease in coal miners based on different machine learning models |
topic | Medicine |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9732087/ https://www.ncbi.nlm.nih.gov/pubmed/36507527 http://dx.doi.org/10.3389/fmed.2022.1037944 |
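
The workflow summarized in the description field (SMOTE balancing, a random 8:2 split, and choosing the model with the highest AUC) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the helper names `smote`, `split_8_2`, and `roc_auc`, the two-column toy rows, and the choice of `k` are assumptions made for illustration only. The only values taken from the record are the five reported AUCs.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Make n_new synthetic minority rows by interpolating between a random
    minority row and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [row for row in minority if row is not base]
        # Sort the remaining minority rows by squared Euclidean distance to base.
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def split_8_2(rows, labels, seed=0):
    """Shuffle row indices and return an 8:2 training/test split."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = len(idx) * 8 // 10
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy rows (two made-up numeric features), not data from the study.
majority = [[60.0, 1.1], [58.0, 1.0], [61.0, 1.3], [55.0, 1.2], [63.0, 0.9]]
minority = [[42.0, 1.5], [45.0, 1.7], [40.0, 1.6]]

# Balance the classes, then split 8:2, in the order the description gives.
synthetic = smote(minority, n_new=len(majority) - len(minority))
rows = majority + minority + synthetic
labels = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
train_x, train_y, test_x, test_y = split_8_2(rows, labels)

# AUC values as reported in the record; the model with the largest AUC is chosen.
REPORTED_AUC = {"XGBoost": 0.892, "LR": 0.577, "GNB": 0.603, "MLP": 0.601, "CNB": 0.543}
best_model = max(REPORTED_AUC, key=REPORTED_AUC.get)
print(best_model, REPORTED_AUC[best_model])
```

The sketch follows the order given in the description, balancing first and splitting afterwards. A common alternative is to oversample only the training portion so that the test portion contains real records only; the record does not say which variant the study implemented beyond the order stated.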