Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms
Background: Fatty liver disease (FLD) is an important risk factor for liver cancer and cardiovascular disease and can lead to significant social and economic burden. However, there is currently no nationwide epidemiological survey for FLD in China, making early FLD screening crucial for the Chinese...
Main Authors: | Weng, Shuwei; Hu, Die; Chen, Jin; Yang, Yanyi; Peng, Daoquan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047083/ https://www.ncbi.nlm.nih.gov/pubmed/36980476 http://dx.doi.org/10.3390/diagnostics13061168 |
_version_ | 1785013832967520256 |
---|---|
author | Weng, Shuwei; Hu, Die; Chen, Jin; Yang, Yanyi; Peng, Daoquan |
author_facet | Weng, Shuwei; Hu, Die; Chen, Jin; Yang, Yanyi; Peng, Daoquan |
author_sort | Weng, Shuwei |
collection | PubMed |
description | Background: Fatty liver disease (FLD) is an important risk factor for liver cancer and cardiovascular disease and can lead to significant social and economic burden. However, there is currently no nationwide epidemiological survey for FLD in China, making early FLD screening crucial for the Chinese population. Unfortunately, liver biopsy and abdominal ultrasound, the preferred methods for FLD diagnosis, are not practical for primary medical institutions. Therefore, the aim of this study was to develop machine learning (ML) models for screening individuals at high risk of FLD, and to provide a new perspective on early FLD diagnosis. Methods: This study included a total of 30,574 individuals between the ages of 18 and 70 who completed abdominal ultrasound and the related clinical examinations. Among them, 3474 individuals were diagnosed with FLD by abdominal ultrasound. We used 11 indicators to build eight classification models to predict FLD. The model prediction ability was evaluated by the area under the curve, sensitivity, specificity, positive predictive value, negative predictive value, and kappa value. Feature importance analysis was assessed by Shapley value or root mean square error loss after permutations. Results: Among the eight ML models, the prediction accuracy of the extreme gradient boosting (XGBoost) model was highest at 89.77%. By feature importance analysis, we found that the body mass index, triglyceride, and alanine aminotransferase play important roles in FLD prediction. Conclusion: XGBoost improves the efficiency and cost of large-scale FLD screening. |
format | Online Article Text |
id | pubmed-10047083 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10047083 2023-03-29 Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms Weng, Shuwei Hu, Die Chen, Jin Yang, Yanyi Peng, Daoquan Diagnostics (Basel) Article Background: Fatty liver disease (FLD) is an important risk factor for liver cancer and cardiovascular disease and can lead to significant social and economic burden. However, there is currently no nationwide epidemiological survey for FLD in China, making early FLD screening crucial for the Chinese population. Unfortunately, liver biopsy and abdominal ultrasound, the preferred methods for FLD diagnosis, are not practical for primary medical institutions. Therefore, the aim of this study was to develop machine learning (ML) models for screening individuals at high risk of FLD, and to provide a new perspective on early FLD diagnosis. Methods: This study included a total of 30,574 individuals between the ages of 18 and 70 who completed abdominal ultrasound and the related clinical examinations. Among them, 3474 individuals were diagnosed with FLD by abdominal ultrasound. We used 11 indicators to build eight classification models to predict FLD. The model prediction ability was evaluated by the area under the curve, sensitivity, specificity, positive predictive value, negative predictive value, and kappa value. Feature importance analysis was assessed by Shapley value or root mean square error loss after permutations. Results: Among the eight ML models, the prediction accuracy of the extreme gradient boosting (XGBoost) model was highest at 89.77%. By feature importance analysis, we found that the body mass index, triglyceride, and alanine aminotransferase play important roles in FLD prediction. Conclusion: XGBoost improves the efficiency and cost of large-scale FLD screening. MDPI 2023-03-18 /pmc/articles/PMC10047083/ /pubmed/36980476 http://dx.doi.org/10.3390/diagnostics13061168 Text en © 2023 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Weng, Shuwei Hu, Die Chen, Jin Yang, Yanyi Peng, Daoquan Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms |
title | Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms |
title_full | Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms |
title_fullStr | Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms |
title_full_unstemmed | Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms |
title_short | Prediction of Fatty Liver Disease in a Chinese Population Using Machine-Learning Algorithms |
title_sort | prediction of fatty liver disease in a chinese population using machine-learning algorithms |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047083/ https://www.ncbi.nlm.nih.gov/pubmed/36980476 http://dx.doi.org/10.3390/diagnostics13061168 |
work_keys_str_mv | AT wengshuwei predictionoffattyliverdiseaseinachinesepopulationusingmachinelearningalgorithms AT hudie predictionoffattyliverdiseaseinachinesepopulationusingmachinelearningalgorithms AT chenjin predictionoffattyliverdiseaseinachinesepopulationusingmachinelearningalgorithms AT yangyanyi predictionoffattyliverdiseaseinachinesepopulationusingmachinelearningalgorithms AT pengdaoquan predictionoffattyliverdiseaseinachinesepopulationusingmachinelearningalgorithms |
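
The description field above lists the evaluation measures used for the classifiers (sensitivity, specificity, positive predictive value, negative predictive value, and kappa). As a rough illustration only, the sketch below shows how these quantities could be derived from a 2x2 confusion matrix. It is not the authors' code: the `ConfusionMatrix` class, the `screening_metrics` function, and the example counts are hypothetical, and the only figures taken from the record are the cohort size (30,574) and the number of ultrasound-diagnosed cases (3474). The area under the curve is omitted because it requires per-individual scores, which the record does not provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts from comparing predictions against the reference diagnosis (hypothetical helper)."""
    tp: int  # predicted FLD, reference FLD
    fp: int  # predicted FLD, reference non-FLD
    fn: int  # predicted non-FLD, reference FLD
    tn: int  # predicted non-FLD, reference non-FLD


def screening_metrics(cm: ConfusionMatrix) -> dict:
    """Compute the threshold-based measures named in the abstract.

    Assumes every denominator is non-zero, i.e. both classes occur
    in the reference labels and in the predictions.
    """
    n = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / n
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((cm.tp + cm.fp) / n) * ((cm.tp + cm.fn) / n)
    p_no = ((cm.tn + cm.fn) / n) * ((cm.tn + cm.fp) / n)
    p_chance = p_yes + p_no
    return {
        "sensitivity": cm.tp / (cm.tp + cm.fn),
        "specificity": cm.tn / (cm.tn + cm.fp),
        "ppv": cm.tp / (cm.tp + cm.fp),
        "npv": cm.tn / (cm.tn + cm.fn),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


# Figures from the record: 3474 FLD cases among 30,574 individuals.
PREVALENCE = 3474 / 30574

# Hypothetical counts, chosen only to exercise the function.
example = screening_metrics(ConfusionMatrix(tp=40, fp=10, fn=20, tn=130))

if __name__ == "__main__":
    print(f"cohort prevalence: {PREVALENCE:.4f}")
    for name, value in example.items():
        print(f"{name}: {value:.4f}")
```

Because the cohort prevalence is low, accuracy by itself says little about how well cases are detected, which is presumably why the abstract reports sensitivity, predictive values, and kappa alongside it.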