A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population

Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks effective medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This paper included 304,145 adul...

Bibliographic Details
Main Authors: Ji, Weidong, Xue, Mingyue, Zhang, Yushan, Yao, Hua, Wang, Yushan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Public Health
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9013842/
https://www.ncbi.nlm.nih.gov/pubmed/35444985
http://dx.doi.org/10.3389/fpubh.2022.846118
_version_ 1784688084235845632
author Ji, Weidong
Xue, Mingyue
Zhang, Yushan
Yao, Hua
Wang, Yushan
collection PubMed
description Non-alcoholic fatty liver disease (NAFLD) is a common and serious health problem worldwide that lacks effective medical treatment. We aimed to develop and validate machine learning (ML) models that could be used for accurate screening of large numbers of people. This study included 304,145 adults who participated in the national physical examination and used their questionnaire and physical measurement parameters as the models' candidate covariates. The least absolute shrinkage and selection operator (LASSO) was used to select features from the candidate covariates; four ML algorithms were then used to build the screening model for NAFLD, and the classifier with the best performance was used to output the importance score of each covariate in NAFLD. Among the four ML algorithms, XGBoost had the best performance (accuracy = 0.880, precision = 0.801, recall = 0.894, F1 = 0.882, and AUC = 0.951), and the covariates, ranked by importance, were BMI, age, waist circumference, gender, type 2 diabetes, gallbladder disease, smoking, hypertension, dietary status, physical activity, a preference for oily food, and a preference for salty food. ML classifiers could help medical agencies achieve early identification and classification of NAFLD, which is particularly useful in economically underdeveloped areas, and the importance of the covariates will be helpful for the prevention and treatment of NAFLD.
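The description above reports accuracy, precision, recall, and an F1 score for the screening classifier. As a minimal sketch of how these four metrics are conventionally computed for a binary screening task, the Python below derives them from confusion-matrix counts. The function name `screening_metrics` and the counts are hypothetical illustrations and are not taken from the study; the record does not say which averaging scheme the authors used, so this shows only the plain binary definitions.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    total = tp + fp + fn + tn
    # Accuracy: share of all cases that were classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives that the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only (not data from the study).
example = screening_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Recall is usually the metric to watch in a screening setting, because a missed case (false negative) goes without follow-up, whereas a false positive is typically caught by a confirmatory examination.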
format Online
Article
Text
id pubmed-9013842
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9013842 2022-04-19 Front Public Health Frontiers Media S.A. 2022-04-04 /pmc/articles/PMC9013842/ /pubmed/35444985 http://dx.doi.org/10.3389/fpubh.2022.846118 Text en Copyright © 2022 Ji, Xue, Zhang, Yao and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population
topic Public Health