Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches

Introduction: The cut-point for defining the age of young ischemic stroke (IS) is clinically and epidemiologically important, yet it is arbitrary and differs across studies. In this study, we leveraged electronic health records (EHRs) and data science techniques to estimate an optimal cut-point for...

Bibliographic Details
Main authors: Abedi, Vida; Lambert, Clare; Chaudhary, Durgesh; Rieder, Emily; Avula, Venkatesh; Hwang, Wenke; Li, Jiang; Zand, Ramin
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10095415/
https://www.ncbi.nlm.nih.gov/pubmed/37048683
http://dx.doi.org/10.3390/jcm12072600
_version_ 1785024077982859264
author Abedi, Vida
Lambert, Clare
Chaudhary, Durgesh
Rieder, Emily
Avula, Venkatesh
Hwang, Wenke
Li, Jiang
Zand, Ramin
author_facet Abedi, Vida
Lambert, Clare
Chaudhary, Durgesh
Rieder, Emily
Avula, Venkatesh
Hwang, Wenke
Li, Jiang
Zand, Ramin
author_sort Abedi, Vida
collection PubMed
description Introduction: The cut-point for defining the age of young ischemic stroke (IS) is clinically and epidemiologically important, yet it is arbitrary and differs across studies. In this study, we leveraged electronic health records (EHRs) and data science techniques to estimate an optimal cut-point for defining the age of young IS. Methods: Patient-level EHRs were extracted from 13 hospitals in Pennsylvania and used in two parallel approaches. The first approach used ICD-9/10 codes from IS patients to group comorbidities and computed similarity scores between every pair of patients. We determined the optimal age of young IS by analyzing the trend of patient similarity with respect to clinical profile across different ages of index IS. The second approach used the IS cohort and a control cohort (without IS) and built three sets of machine-learning models, namely generalized linear regression (GLM), random forest (RF), and XGBoost (XGB), to classify patients in seventeen age groups. After extracting feature importance from the models, we determined the optimal age of young IS by analyzing the pattern of comorbidity with respect to the age of index IS. Both approaches were completed separately for male and female patients. Results: The stroke cohort contained 7555 ISs, and the control cohort included 31,067 patients. In the first approach, the optimal age of young stroke was 53.7 and 51.0 years in female and male patients, respectively. In the second approach, we created 102 models, based on three algorithms, 17 age brackets, and two sexes. The optimal age was 53 (GLM), 52 (RF), and 54 (XGB) for female patients, and 52 (GLM and RF) and 53 (XGB) for male patients. Different age and sex groups exhibited different comorbidity patterns. Discussion: Using a data-driven approach, we determined the age of young stroke to be 54 years for women and 52 years for men in our mainly rural population in central Pennsylvania. Future validation studies should include more diverse populations.
format Online
Article
Text
id pubmed-10095415
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10095415 2023-04-13 Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches Abedi, Vida Lambert, Clare Chaudhary, Durgesh Rieder, Emily Avula, Venkatesh Hwang, Wenke Li, Jiang Zand, Ramin J Clin Med Article Introduction: The cut-point for defining the age of young ischemic stroke (IS) is clinically and epidemiologically important, yet it is arbitrary and differs across studies. In this study, we leveraged electronic health records (EHRs) and data science techniques to estimate an optimal cut-point for defining the age of young IS. Methods: Patient-level EHRs were extracted from 13 hospitals in Pennsylvania and used in two parallel approaches. The first approach used ICD-9/10 codes from IS patients to group comorbidities and computed similarity scores between every pair of patients. We determined the optimal age of young IS by analyzing the trend of patient similarity with respect to clinical profile across different ages of index IS. The second approach used the IS cohort and a control cohort (without IS) and built three sets of machine-learning models, namely generalized linear regression (GLM), random forest (RF), and XGBoost (XGB), to classify patients in seventeen age groups. After extracting feature importance from the models, we determined the optimal age of young IS by analyzing the pattern of comorbidity with respect to the age of index IS. Both approaches were completed separately for male and female patients. Results: The stroke cohort contained 7555 ISs, and the control cohort included 31,067 patients. In the first approach, the optimal age of young stroke was 53.7 and 51.0 years in female and male patients, respectively. In the second approach, we created 102 models, based on three algorithms, 17 age brackets, and two sexes. The optimal age was 53 (GLM), 52 (RF), and 54 (XGB) for female patients, and 52 (GLM and RF) and 53 (XGB) for male patients. Different age and sex groups exhibited different comorbidity patterns. Discussion: Using a data-driven approach, we determined the age of young stroke to be 54 years for women and 52 years for men in our mainly rural population in central Pennsylvania. Future validation studies should include more diverse populations. MDPI 2023-03-30 /pmc/articles/PMC10095415/ /pubmed/37048683 http://dx.doi.org/10.3390/jcm12072600 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Abedi, Vida
Lambert, Clare
Chaudhary, Durgesh
Rieder, Emily
Avula, Venkatesh
Hwang, Wenke
Li, Jiang
Zand, Ramin
Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches
title Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches
title_full Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches
title_fullStr Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches
title_full_unstemmed Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches
title_short Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches
title_sort defining the age of young ischemic stroke using data-driven approaches
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10095415/
https://www.ncbi.nlm.nih.gov/pubmed/37048683
http://dx.doi.org/10.3390/jcm12072600
work_keys_str_mv AT abedivida definingtheageofyoungischemicstrokeusingdatadrivenapproaches
AT lambertclare definingtheageofyoungischemicstrokeusingdatadrivenapproaches
AT chaudharydurgesh definingtheageofyoungischemicstrokeusingdatadrivenapproaches
AT riederemily definingtheageofyoungischemicstrokeusingdatadrivenapproaches
AT avulavenkatesh definingtheageofyoungischemicstrokeusingdatadrivenapproaches
AT hwangwenke definingtheageofyoungischemicstrokeusingdatadrivenapproaches
AT lijiang definingtheageofyoungischemicstrokeusingdatadrivenapproaches
AT zandramin definingtheageofyoungischemicstrokeusingdatadrivenapproaches