A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population
Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks effective medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This paper included 304,145 adul...
Main authors: | Ji, Weidong; Xue, Mingyue; Zhang, Yushan; Yao, Hua; Wang, Yushan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Public Health |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9013842/ https://www.ncbi.nlm.nih.gov/pubmed/35444985 http://dx.doi.org/10.3389/fpubh.2022.846118 |
_version_ | 1784688084235845632 |
---|---|
author | Ji, Weidong Xue, Mingyue Zhang, Yushan Yao, Hua Wang, Yushan |
author_facet | Ji, Weidong Xue, Mingyue Zhang, Yushan Yao, Hua Wang, Yushan |
author_sort | Ji, Weidong |
collection | PubMed |
description | Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks effective medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This study included 304,145 adults who took part in the national physical examination and used their questionnaire responses and physical measurements as candidate covariates for the models. The least absolute shrinkage and selection operator (LASSO) was used to select features from the candidate covariates; four ML algorithms were then used to build screening models for NAFLD, and the best-performing classifier was used to output an importance score for each covariate. Among the four ML algorithms, XGBoost performed best (accuracy = 0.880, precision = 0.801, recall = 0.894, F1 = 0.882, and AUC = 0.951). The covariates, ranked by importance, were BMI, age, waist circumference, gender, type 2 diabetes, gallbladder disease, smoking, hypertension, dietary status, physical activity, preference for oily food, and preference for salty food. ML classifiers could help medical agencies identify and classify NAFLD early, which is particularly useful in economically disadvantaged areas, and the importance ranking of the covariates should help in the prevention and treatment of NAFLD. |
format | Online Article Text |
id | pubmed-9013842 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9013842 2022-04-19 A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population Ji, Weidong Xue, Mingyue Zhang, Yushan Yao, Hua Wang, Yushan Front Public Health Public Health Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks effective medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This study included 304,145 adults who took part in the national physical examination and used their questionnaire responses and physical measurements as candidate covariates for the models. The least absolute shrinkage and selection operator (LASSO) was used to select features from the candidate covariates; four ML algorithms were then used to build screening models for NAFLD, and the best-performing classifier was used to output an importance score for each covariate. Among the four ML algorithms, XGBoost performed best (accuracy = 0.880, precision = 0.801, recall = 0.894, F1 = 0.882, and AUC = 0.951). The covariates, ranked by importance, were BMI, age, waist circumference, gender, type 2 diabetes, gallbladder disease, smoking, hypertension, dietary status, physical activity, preference for oily food, and preference for salty food. ML classifiers could help medical agencies identify and classify NAFLD early, which is particularly useful in economically disadvantaged areas, and the importance ranking of the covariates should help in the prevention and treatment of NAFLD. Frontiers Media S.A. 2022-04-04 /pmc/articles/PMC9013842/ /pubmed/35444985 http://dx.doi.org/10.3389/fpubh.2022.846118 Text en Copyright © 2022 Ji, Xue, Zhang, Yao and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population |
topic | Public Health |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9013842/ https://www.ncbi.nlm.nih.gov/pubmed/35444985 http://dx.doi.org/10.3389/fpubh.2022.846118 |
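
The description field in this record reports accuracy, precision, recall, F1, and AUC for the screening classifiers. The following is a minimal sketch of how these metrics are conventionally defined for a binary screening model. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration, and the record does not state how the published figures were averaged.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = disease present."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from hard (thresholded) predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a toy example but not for a cohort of hundreds of thousands.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = NAFLD, 0 = no NAFLD; scores are model probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = screening_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

In a screening setting, recall (sensitivity) is often weighted more heavily than precision, because a missed case is usually costlier than a follow-up examination. AUC is independent of the chosen threshold, whereas the other four metrics are not.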