Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models
BACKGROUND: Stroke, a cerebrovascular disease, is one of the major causes of death. It causes significant health and financial burdens for both patients and health care systems. One of the important risk factors for stroke is health-related behavior, which is becoming an increasingly important focus...
Main Authors: | Alanazi, Eman M; Abdou, Aalaa; Luo, Jake |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2021 |
Subjects: | Original Paper |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8686476/ https://www.ncbi.nlm.nih.gov/pubmed/34860663 http://dx.doi.org/10.2196/23440 |
_version_ | 1784618022672007168 |
---|---|
author | Alanazi, Eman M Abdou, Aalaa Luo, Jake |
author_facet | Alanazi, Eman M Abdou, Aalaa Luo, Jake |
author_sort | Alanazi, Eman M |
collection | PubMed |
description | BACKGROUND: Stroke, a cerebrovascular disease, is one of the major causes of death. It causes significant health and financial burdens for both patients and health care systems. One of the important risk factors for stroke is health-related behavior, which is becoming an increasingly important focus of prevention. Many machine learning models have been built to predict the risk of stroke or to automatically diagnose stroke, using predictors such as lifestyle factors or radiological imaging. However, there have been no models built using data from lab tests. OBJECTIVE: The aim of this study was to apply computational methods using machine learning techniques to predict stroke from lab test data. METHODS: We used the National Health and Nutrition Examination Survey data sets with three different data selection methods (ie, without data resampling, with data imputation, and with data resampling) to develop predictive models. We used four machine learning classifiers and six performance measures to evaluate the performance of the models. RESULTS: We found that accurate and sensitive machine learning models can be created to predict stroke from lab test data. Our results show that the data resampling approach performed the best compared to the other two data selection techniques. Prediction with the random forest algorithm, which was the best algorithm tested, achieved an accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve of 0.96, 0.97, 0.96, 0.75, 0.99, and 0.97, respectively, when all of the attributes were used. CONCLUSIONS: The predictive model, built using data from lab tests, was easy to use and had high accuracy. In future studies, we aim to use data that reflect different types of stroke and to explore the data to build a prediction model for each type. |
format | Online Article Text |
id | pubmed-8686476 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-8686476 2022-01-10 Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models Alanazi, Eman M Abdou, Aalaa Luo, Jake JMIR Form Res Original Paper BACKGROUND: Stroke, a cerebrovascular disease, is one of the major causes of death. It causes significant health and financial burdens for both patients and health care systems. One of the important risk factors for stroke is health-related behavior, which is becoming an increasingly important focus of prevention. Many machine learning models have been built to predict the risk of stroke or to automatically diagnose stroke, using predictors such as lifestyle factors or radiological imaging. However, there have been no models built using data from lab tests. OBJECTIVE: The aim of this study was to apply computational methods using machine learning techniques to predict stroke from lab test data. METHODS: We used the National Health and Nutrition Examination Survey data sets with three different data selection methods (ie, without data resampling, with data imputation, and with data resampling) to develop predictive models. We used four machine learning classifiers and six performance measures to evaluate the performance of the models. RESULTS: We found that accurate and sensitive machine learning models can be created to predict stroke from lab test data. Our results show that the data resampling approach performed the best compared to the other two data selection techniques. Prediction with the random forest algorithm, which was the best algorithm tested, achieved an accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve of 0.96, 0.97, 0.96, 0.75, 0.99, and 0.97, respectively, when all of the attributes were used. CONCLUSIONS: The predictive model, built using data from lab tests, was easy to use and had high accuracy. In future studies, we aim to use data that reflect different types of stroke and to explore the data to build a prediction model for each type. JMIR Publications 2021-12-02 /pmc/articles/PMC8686476/ /pubmed/34860663 http://dx.doi.org/10.2196/23440 Text en ©Eman M Alanazi, Aalaa Abdou, Jake Luo. Originally published in JMIR Formative Research (https://formative.jmir.org), 02.12.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Alanazi, Eman M Abdou, Aalaa Luo, Jake Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models |
title | Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models |
title_full | Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models |
title_fullStr | Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models |
title_full_unstemmed | Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models |
title_short | Predicting Risk of Stroke From Lab Tests Using Machine Learning Algorithms: Development and Evaluation of Prediction Models |
title_sort | predicting risk of stroke from lab tests using machine learning algorithms: development and evaluation of prediction models |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8686476/ https://www.ncbi.nlm.nih.gov/pubmed/34860663 http://dx.doi.org/10.2196/23440 |
work_keys_str_mv | AT alanaziemanm predictingriskofstrokefromlabtestsusingmachinelearningalgorithmsdevelopmentandevaluationofpredictionmodels AT abdouaalaa predictingriskofstrokefromlabtestsusingmachinelearningalgorithmsdevelopmentandevaluationofpredictionmodels AT luojake predictingriskofstrokefromlabtestsusingmachinelearningalgorithmsdevelopmentandevaluationofpredictionmodels |
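
The abstract in this record reports six performance measures for the classifiers. As a rough illustration only, the sketch below shows how five of them (accuracy, sensitivity, specificity, positive predictive value, and negative predictive value) are conventionally derived from confusion-matrix counts. The counts, the `ConfusionCounts` class, and the `classification_metrics` function are hypothetical and are not taken from the study; area under the curve is omitted because it requires per-sample scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (hypothetical container)."""
    tp: int  # stroke cases predicted as stroke
    fp: int  # non-stroke cases predicted as stroke
    tn: int  # non-stroke cases predicted as non-stroke
    fn: int  # stroke cases predicted as non-stroke


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return the standard count-based measures as fractions in [0, 1].

    Assumes every denominator is non-zero; a production version
    would need to guard against empty classes.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),   # true positive rate
        "specificity": c.tn / (c.tn + c.fp),   # true negative rate
        "ppv": c.tp / (c.tp + c.fp),           # positive predictive value
        "npv": c.tn / (c.tn + c.fn),           # negative predictive value
    }


# Illustrative counts only; these are NOT results from the paper.
example = ConfusionCounts(tp=90, fp=30, tn=870, fn=10)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With imbalanced classes such as these made-up counts, positive predictive value can sit well below sensitivity and specificity, because false positives are drawn from the much larger negative group.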