The prediction of asymptomatic carotid atherosclerosis with electronic health records: a comparative study of six machine learning models

Bibliographic Details
Main Authors: Fan, Jiaxin, Chen, Mengying, Luo, Jian, Yang, Shusen, Shi, Jinming, Yao, Qingling, Zhang, Xiaodong, Du, Shuang, Qu, Huiyang, Cheng, Yuxuan, Ma, Shuyin, Zhang, Meijuan, Xu, Xi, Wang, Qian, Zhan, Shuqin
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8020544/
https://www.ncbi.nlm.nih.gov/pubmed/33820531
http://dx.doi.org/10.1186/s12911-021-01480-3
author Fan, Jiaxin
Chen, Mengying
Luo, Jian
Yang, Shusen
Shi, Jinming
Yao, Qingling
Zhang, Xiaodong
Du, Shuang
Qu, Huiyang
Cheng, Yuxuan
Ma, Shuyin
Zhang, Meijuan
Xu, Xi
Wang, Qian
Zhan, Shuqin
collection PubMed
description BACKGROUND: Screening carotid B-mode ultrasonography is frequently used to detect carotid atherosclerosis (CAS). Because CAS progresses without symptoms in most patients, early identification is challenging for clinicians, and undetected CAS may trigger ischemic stroke. Recently, machine learning has shown a strong ability to classify data and potential for prediction in the medical field. Combining machine learning with patients' electronic health records could give clinicians a more convenient and precise method to identify asymptomatic CAS. METHODS: This retrospective cohort study used routine clinical data of medical check-up subjects from April 19, 2010 to November 15, 2019. Six machine learning models (logistic regression [LR], random forest [RF], decision tree [DT], eXtreme Gradient Boosting [XGB], Gaussian Naïve Bayes [GNB], and K-Nearest Neighbour [KNN]) were used to predict asymptomatic CAS, and their predictive performance was compared in terms of the area under the receiver operating characteristic curve (AUCROC), accuracy (ACC), and F1 score (F1). RESULTS: Of the 18,441 subjects, 6,553 were diagnosed with asymptomatic CAS. Compared to DT (AUCROC 0.628, ACC 65.4%, and F1 52.5%), the other five models improved prediction: KNN +7.6% (0.704, 68.8%, and 50.9%, respectively), GNB +12.5% (0.753, 67.0%, and 46.8%, respectively), XGB +16.0% (0.788, 73.4%, and 55.7%, respectively), RF +16.6% (0.794, 74.5%, and 56.8%, respectively) and LR +18.1% (0.809, 74.7%, and 59.9%, respectively). The best-performing model, LR, correctly predicted 1045/1966 cases (sensitivity 53.2%) and 3088/3566 non-cases (specificity 86.6%). A tenfold cross-validation scheme further verified the predictive ability of LR. CONCLUSIONS: Among the six machine learning models compared, LR showed the best performance in predicting asymptomatic CAS. Our findings set the stage for an early automatic alarm system, allowing a more precise allocation of CAS prevention measures to the individuals most likely to benefit. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-021-01480-3.
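The figures reported for LR in the description can be cross-checked with a few lines of arithmetic. The sketch below is not the authors' code. It assumes that 1045/1966 and 3088/3566 are confusion-matrix counts on a held-out test set, and that the "+7.6%" style gains are AUCROC differences from DT expressed in percentage points, a reading that is consistent with the reported values but not stated in the record. The names `ConfusionCounts` and `summarize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper, not from the paper)."""
    tp: int  # cases correctly predicted as cases
    fn: int  # cases missed
    tn: int  # non-cases correctly predicted as non-cases
    fp: int  # non-cases wrongly flagged as cases


def summarize(c: ConfusionCounts) -> dict:
    """Derive sensitivity, specificity, accuracy and F1 from raw counts."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Assumption: the abstract's 1045/1966 cases and 3088/3566 non-cases
# are test-set counts for the LR model.
lr_counts = ConfusionCounts(tp=1045, fn=1966 - 1045, tn=3088, fp=3566 - 3088)
lr_metrics = summarize(lr_counts)

# AUCROC values as listed in the description.
auc = {"DT": 0.628, "KNN": 0.704, "GNB": 0.753,
       "XGB": 0.788, "RF": 0.794, "LR": 0.809}

# Assumption: gains are absolute AUCROC differences from DT, in percentage points.
gain_points = {
    model: round((value - auc["DT"]) * 100, 1)
    for model, value in auc.items()
    if model != "DT"
}

if __name__ == "__main__":
    for name, value in lr_metrics.items():
        print(f"LR {name}: {value * 100:.1f}%")
    for model, gain in gain_points.items():
        print(f"{model}: +{gain} points over DT")
```

Under these assumptions, the derived sensitivity, specificity, accuracy, and F1 round to the 53.2%, 86.6%, 74.7%, and 59.9% given for LR, and the AUCROC gaps reproduce the listed gains of 7.6, 12.5, 16.0, 16.6, and 18.1.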
format Online
Article
Text
id pubmed-8020544
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8020544 2021-04-07 BMC Med Inform Decis Mak Research Article BioMed Central 2021-04-05 /pmc/articles/PMC8020544/ /pubmed/33820531 http://dx.doi.org/10.1186/s12911-021-01480-3 Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title The prediction of asymptomatic carotid atherosclerosis with electronic health records: a comparative study of six machine learning models
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8020544/
https://www.ncbi.nlm.nih.gov/pubmed/33820531
http://dx.doi.org/10.1186/s12911-021-01480-3