Development of machine learning model for diagnostic disease prediction based on laboratory tests

The use of deep learning and machine learning (ML) in medical science is increasing, particularly in the visual, audio, and language data fields. We aimed to build a new optimized ensemble model by blending a DNN (deep neural network) model with two ML models for disease prediction using laboratory...

Bibliographic Details
Main Authors: Park, Dong Jin, Park, Min Woo, Lee, Homin, Kim, Young-Jin, Kim, Yeongsic, Park, Young Hoon
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026627/
https://www.ncbi.nlm.nih.gov/pubmed/33828178
http://dx.doi.org/10.1038/s41598-021-87171-5
_version_ 1783675689232236544
author Park, Dong Jin
Park, Min Woo
Lee, Homin
Kim, Young-Jin
Kim, Yeongsic
Park, Young Hoon
author_facet Park, Dong Jin
Park, Min Woo
Lee, Homin
Kim, Young-Jin
Kim, Yeongsic
Park, Young Hoon
author_sort Park, Dong Jin
collection PubMed
description The use of deep learning and machine learning (ML) in medical science is increasing, particularly in the visual, audio, and language data fields. We aimed to build a new optimized ensemble model by blending a DNN (deep neural network) model with two ML models for disease prediction using laboratory test results. 86 attributes (laboratory tests) were selected from datasets based on value counts, clinical importance-related features, and missing values. We collected sample datasets on 5145 cases, including 326,686 laboratory test results. We investigated a total of 39 specific diseases based on the International Classification of Diseases, 10th revision (ICD-10) codes. These datasets were used to construct light gradient boosting machine (LightGBM) and extreme gradient boosting (XGBoost) ML models and a DNN model using TensorFlow. The optimized ensemble model achieved an F1-score of 81% and prediction accuracy of 92% for the five most common diseases. The deep learning and ML models showed differences in predictive power and disease classification patterns. We used a confusion matrix and analyzed feature importance using the SHAP value method. Our new ML model achieved high efficiency of disease prediction through classification of diseases. This study will be useful in the prediction and diagnosis of diseases.
format Online
Article
Text
id pubmed-8026627
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-80266272021-04-08 Development of machine learning model for diagnostic disease prediction based on laboratory tests Park, Dong Jin Park, Min Woo Lee, Homin Kim, Young-Jin Kim, Yeongsic Park, Young Hoon Sci Rep Article The use of deep learning and machine learning (ML) in medical science is increasing, particularly in the visual, audio, and language data fields. We aimed to build a new optimized ensemble model by blending a DNN (deep neural network) model with two ML models for disease prediction using laboratory test results. 86 attributes (laboratory tests) were selected from datasets based on value counts, clinical importance-related features, and missing values. We collected sample datasets on 5145 cases, including 326,686 laboratory test results. We investigated a total of 39 specific diseases based on the International Classification of Diseases, 10th revision (ICD-10) codes. These datasets were used to construct light gradient boosting machine (LightGBM) and extreme gradient boosting (XGBoost) ML models and a DNN model using TensorFlow. The optimized ensemble model achieved an F1-score of 81% and prediction accuracy of 92% for the five most common diseases. The deep learning and ML models showed differences in predictive power and disease classification patterns. We used a confusion matrix and analyzed feature importance using the SHAP value method. Our new ML model achieved high efficiency of disease prediction through classification of diseases. This study will be useful in the prediction and diagnosis of diseases. 
Nature Publishing Group UK 2021-04-07 /pmc/articles/PMC8026627/ /pubmed/33828178 http://dx.doi.org/10.1038/s41598-021-87171-5 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Park, Dong Jin
Park, Min Woo
Lee, Homin
Kim, Young-Jin
Kim, Yeongsic
Park, Young Hoon
Development of machine learning model for diagnostic disease prediction based on laboratory tests
title Development of machine learning model for diagnostic disease prediction based on laboratory tests
title_full Development of machine learning model for diagnostic disease prediction based on laboratory tests
title_fullStr Development of machine learning model for diagnostic disease prediction based on laboratory tests
title_full_unstemmed Development of machine learning model for diagnostic disease prediction based on laboratory tests
title_short Development of machine learning model for diagnostic disease prediction based on laboratory tests
title_sort development of machine learning model for diagnostic disease prediction based on laboratory tests
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026627/
https://www.ncbi.nlm.nih.gov/pubmed/33828178
http://dx.doi.org/10.1038/s41598-021-87171-5
work_keys_str_mv AT parkdongjin developmentofmachinelearningmodelfordiagnosticdiseasepredictionbasedonlaboratorytests
AT parkminwoo developmentofmachinelearningmodelfordiagnosticdiseasepredictionbasedonlaboratorytests
AT leehomin developmentofmachinelearningmodelfordiagnosticdiseasepredictionbasedonlaboratorytests
AT kimyoungjin developmentofmachinelearningmodelfordiagnosticdiseasepredictionbasedonlaboratorytests
AT kimyeongsic developmentofmachinelearningmodelfordiagnosticdiseasepredictionbasedonlaboratorytests
AT parkyounghoon developmentofmachinelearningmodelfordiagnosticdiseasepredictionbasedonlaboratorytests
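
Illustrative sketch (not part of the catalog record): the description above mentions blending a DNN with LightGBM and XGBoost models and reporting an F1-score and accuracy from a confusion matrix. The record does not say how the blending was done, so the code below assumes a simple weighted average of per-class probabilities (soft voting) and a macro-averaged F1. All function names, weights, and toy numbers are hypothetical and use only the Python standard library; they are not the authors' implementation or data.

```python
from typing import List, Sequence

Probs = List[List[float]]  # one row of class probabilities per sample


def blend_probabilities(model_probs: Sequence[Probs], weights: Sequence[float]) -> Probs:
    """Weighted average of class probabilities across models (assumed blending rule)."""
    total = sum(weights)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    blended = []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][c] for w, probs in zip(weights, model_probs)) / total
            for c in range(n_classes)
        ]
        blended.append(row)
    return blended


def predict_labels(probs: Probs) -> List[int]:
    """Pick the class index with the highest blended probability for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def macro_f1(matrix: List[List[int]]) -> float:
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp
        fn = sum(matrix[c]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Toy, made-up probabilities for 4 samples and 3 classes (not from the study).
dnn = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.4, 0.4, 0.2]]
lgbm = [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6], [0.2, 0.6, 0.2]]
xgb = [[0.5, 0.4, 0.1], [0.1, 0.5, 0.4], [0.1, 0.1, 0.8], [0.3, 0.5, 0.2]]
y_true = [0, 1, 2, 0]

blended = blend_probabilities([dnn, lgbm, xgb], weights=[1.0, 1.0, 1.0])
y_pred = predict_labels(blended)
cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(y_pred, accuracy(cm), macro_f1(cm))
```

In practice the three probability tables would come from the fitted models' probability outputs, and the weights could be tuned on a validation split; equal weights are used here only to keep the sketch small.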