
Leukemia can be Effectively Early Predicted in Routine Physical Examination with the Assistance of Machine Learning Models

OBJECTIVES: The diagnosis of leukemia relies heavily on bone marrow examination, which is generally not performed in a routine physical examination. In many rural areas, and even in community hospitals and primary care clinics, the lack of hematology specialists and facilities does not all...


Bibliographic Details
Main authors: Yu, Cheng; Peng, Yin-yin; Liu, Lin; Wang, Xin; Xiao, Qing
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9715329/
https://www.ncbi.nlm.nih.gov/pubmed/36465253
http://dx.doi.org/10.1155/2022/8641194
_version_ 1784842422724853760
author Yu, Cheng
Peng, Yin-yin
Liu, Lin
Wang, Xin
Xiao, Qing
collection PubMed
description OBJECTIVES: The diagnosis of leukemia relies heavily on bone marrow examination, which is generally not performed in a routine physical examination. In many rural areas, and even in community hospitals and primary care clinics, the lack of hematology specialists and facilities does not allow a definite diagnosis of leukemia. There would therefore be a significant benefit if machine learning (ML) models could help predict leukemia early from preliminary blood test data collected in a routine physical examination at community hospitals, saving time before a definite diagnosis. METHODS: We collected the routine physical examination data of 1230 newly diagnosed leukemia patients and 1300 healthy people. We trained and tested three ML models: linear support vector machine (LSVM), random forest (RF), and XGBoost. We examined the agreement between the model results and a statistical analysis of the input data, and also the consistency of the model accuracy scores and of the relative importance order of model factors across different input data sets and different model arguments, to check the applicability of both the models and the input data. RESULTS: In general, the RF and XGBoost models give a more consistent and robust relative importance order of factors, which also agrees with the statistical analysis, while the LSVM gives very different and implausible orders for different inputs. Results of the RF and XGBoost models show that (1) the models generally achieve accuracy scores above 0.9, indicating effective identification of leukemia, and (2) the three factors that contribute most to the identification of leukemia are red blood cell (RBC), hematocrit (HCT), and white blood cell (WBC), while the other factors contribute relatively less.
CONCLUSIONS: This study presents a feasible case example of early identification of leukemia from routine physical examination data with the assistance of ML models, an approach that can be applied conveniently, cheaply, and widely in community hospitals or primary care clinics to save time before a definite diagnosis; however, more studies are still needed to validate the applicability of more ML models to a larger variety of input data sets.
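The consistency check on the relative importance order of factors, described in the abstract, can be illustrated with a minimal sketch. It assumes each model yields one importance score per factor and compares the resulting orderings with a Kendall rank correlation. The abstract does not say which agreement measure the authors used, so the measure, the function names, the "OTHER" placeholder factor, and all numeric scores below are hypothetical and serve only as illustration.

```python
from itertools import combinations


def importance_order(importances):
    """Return factor names sorted from most to least important."""
    return sorted(importances, key=importances.get, reverse=True)


def kendall_tau(order_a, order_b):
    """Kendall rank correlation between two orderings of the same factors.

    Assumes both orderings contain exactly the same names and no ties.
    Returns 1.0 for identical orders and -1.0 for fully reversed orders.
    """
    pos_a = {name: i for i, name in enumerate(order_a)}
    pos_b = {name: i for i, name in enumerate(order_b)}
    concordant = 0
    discordant = 0
    for x, y in combinations(order_a, 2):
        # A pair is concordant when both orderings rank x and y the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical importance scores, NOT values reported in the study.
rf_scores = {"RBC": 0.40, "HCT": 0.30, "WBC": 0.20, "OTHER": 0.10}
xgb_scores = {"RBC": 0.35, "WBC": 0.30, "HCT": 0.25, "OTHER": 0.10}
lsvm_scores = {"OTHER": 0.50, "WBC": 0.30, "HCT": 0.15, "RBC": 0.05}

rf_order = importance_order(rf_scores)
xgb_order = importance_order(xgb_scores)
lsvm_order = importance_order(lsvm_scores)

rf_vs_xgb = kendall_tau(rf_order, xgb_order)
rf_vs_lsvm = kendall_tau(rf_order, lsvm_order)
print(rf_order, xgb_order, lsvm_order)
print(rf_vs_xgb, rf_vs_lsvm)
```

With these made-up scores, the two tree-based orderings differ in a single pair of factors and so correlate positively, while the third ordering is fully reversed. A value near 1 across input data sets and model arguments would correspond to the "consistent and robust" behaviour the abstract attributes to RF and XGBoost.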
format Online
Article
Text
id pubmed-9715329
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9715329 2022-12-02 Leukemia can be Effectively Early Predicted in Routine Physical Examination with the Assistance of Machine Learning Models Yu, Cheng Peng, Yin-yin Liu, Lin Wang, Xin Xiao, Qing J Healthc Eng Research Article Hindawi 2022-11-24 /pmc/articles/PMC9715329/ /pubmed/36465253 http://dx.doi.org/10.1155/2022/8641194 Text en Copyright © 2022 Cheng Yu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Leukemia can be Effectively Early Predicted in Routine Physical Examination with the Assistance of Machine Learning Models
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9715329/
https://www.ncbi.nlm.nih.gov/pubmed/36465253
http://dx.doi.org/10.1155/2022/8641194