
A Stroke Risk Detection: Improving Hybrid Feature Selection Method

BACKGROUND: Stroke is one of the most common diseases that cause mortality. Detecting the risk of stroke for individuals is critical yet challenging because of a large number of risk factors for stroke. OBJECTIVE: This study aimed to address the limitation of ineffective feature selection in existin...


Bibliographic Details
Main Authors: Zhang, Yonglai; Zhou, Yaojian; Zhang, Dongsong; Song, Wenai
Format: Online Article Text
Language: English
Published: JMIR Publications 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6466481/
https://www.ncbi.nlm.nih.gov/pubmed/30938684
http://dx.doi.org/10.2196/12437
author Zhang, Yonglai
Zhou, Yaojian
Zhang, Dongsong
Song, Wenai
collection PubMed
description BACKGROUND: Stroke is one of the most common diseases that cause mortality. Detecting the risk of stroke for individuals is critical yet challenging because of a large number of risk factors for stroke. OBJECTIVE: This study aimed to address the limitation of ineffective feature selection in existing research on stroke risk detection. We have proposed a new feature selection method called weighting- and ranking-based hybrid feature selection (WRHFS) to select important risk factors for detecting ischemic stroke. METHODS: WRHFS integrates the strengths of various filter algorithms by following the principle of a wrapper approach. We employed a variety of filter-based feature selection models as the candidate set, including standard deviation, Pearson correlation coefficient, Fisher score, information gain, Relief algorithm, and chi-square test and used sensitivity, specificity, accuracy, and Youden index as performance metrics to evaluate the proposed method. RESULTS: This study chose 792 samples from the electronic records of 13,421 patients in a community hospital. Each sample included 28 features (24 blood test features and 4 demographic features). The results of evaluation showed that the proposed method selected 9 important features out of the original 28 features and significantly outperformed baseline methods. Their cumulative contribution was 0.51. The WRHFS method achieved a sensitivity of 82.7% (329/398), specificity of 80.4% (317/394), classification accuracy of 81.5% (645/792), and Youden index of 0.63 using only the top 9 features. We have also presented a chart for visualizing the risk of having ischemic strokes. CONCLUSIONS: This study has proposed, developed, and evaluated a new feature selection method for identifying the most important features for building effective and parsimonious models for stroke risk detection. The findings of this research provide several novel research contributions and practical implications.
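The four performance metrics named in the abstract follow directly from confusion-matrix counts. The sketch below is illustrative only and is not the authors' code: the function name `screening_metrics` and its argument names are assumptions, and the counts are read off the fractions quoted in the abstract (329/398 for sensitivity, 317/394 for specificity). Recomputed values can differ in the last digit from the rounded figures in the abstract; for example, the two counts sum to 646, whereas the abstract reports 645/792 for accuracy.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, accuracy, and Youden index.

    tp/fn are counts among true stroke cases, tn/fp among non-cases.
    (Hypothetical helper for illustration; not from the paper.)
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    youden = sensitivity + specificity - 1  # Youden's J statistic
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "youden": youden,
    }


# Counts taken from the fractions quoted in the abstract.
metrics = screening_metrics(tp=329, fn=398 - 329, tn=317, fp=394 - 317)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The Youden index rewards a classifier only when sensitivity and specificity are jointly high, which is why it is a common summary for screening tools where both missed cases and false alarms are costly.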
format Online
Article
Text
id pubmed-6466481
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-6466481 2019-05-08 A Stroke Risk Detection: Improving Hybrid Feature Selection Method Zhang, Yonglai Zhou, Yaojian Zhang, Dongsong Song, Wenai J Med Internet Res Original Paper JMIR Publications 2019-04-02 /pmc/articles/PMC6466481/ /pubmed/30938684 http://dx.doi.org/10.2196/12437 Text en ©Yonglai Zhang, Yaojian Zhou, Dongsong Zhang, Wenai Song. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 02.04.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title A Stroke Risk Detection: Improving Hybrid Feature Selection Method
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6466481/
https://www.ncbi.nlm.nih.gov/pubmed/30938684
http://dx.doi.org/10.2196/12437