Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches
Introduction: The cut-point for defining the age of young ischemic stroke (IS) is clinically and epidemiologically important, yet it is arbitrary and differs across studies. In this study, we leveraged electronic health records (EHRs) and data science techniques to estimate an optimal cut-point for...
Main authors: | Abedi, Vida; Lambert, Clare; Chaudhary, Durgesh; Rieder, Emily; Avula, Venkatesh; Hwang, Wenke; Li, Jiang; Zand, Ramin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10095415/ https://www.ncbi.nlm.nih.gov/pubmed/37048683 http://dx.doi.org/10.3390/jcm12072600 |
_version_ | 1785024077982859264 |
---|---|
author | Abedi, Vida Lambert, Clare Chaudhary, Durgesh Rieder, Emily Avula, Venkatesh Hwang, Wenke Li, Jiang Zand, Ramin |
author_facet | Abedi, Vida Lambert, Clare Chaudhary, Durgesh Rieder, Emily Avula, Venkatesh Hwang, Wenke Li, Jiang Zand, Ramin |
author_sort | Abedi, Vida |
collection | PubMed |
description | Introduction: The cut-point for defining the age of young ischemic stroke (IS) is clinically and epidemiologically important, yet it is arbitrary and differs across studies. In this study, we leveraged electronic health records (EHRs) and data science techniques to estimate an optimal cut-point for defining the age of young IS. Methods: Patient-level EHRs were extracted from 13 hospitals in Pennsylvania, and used in two parallel approaches. The first approach included ICD9/10, from IS patients to group comorbidities, and computed similarity scores between every patient pair. We determined the optimal age of young IS by analyzing the trend of patient similarity with respect to their clinical profile for different ages of index IS. The second approach used the IS cohort and control (without IS), and built three sets of machine-learning models—generalized linear regression (GLM), random forest (RF), and XGBoost (XGB)—to classify patients for seventeen age groups. After extracting feature importance from the models, we determined the optimal age of young IS by analyzing the pattern of comorbidity with respect to the age of index IS. Both approaches were completed separately for male and female patients. Results: The stroke cohort contained 7555 ISs, and the control included 31,067 patients. In the first approach, the optimal age of young stroke was 53.7 and 51.0 years in female and male patients, respectively. In the second approach, we created 102 models, based on three algorithms, 17 age brackets, and two sexes. The optimal age was 53 (GLM), 52 (RF), and 54 (XGB) for female, and 52 (GLM and RF) and 53 (RF) for male patients. Different age and sex groups exhibited different comorbidity patterns. Discussion: Using a data-driven approach, we determined the age of young stroke to be 54 years for women and 52 years for men in our mainly rural population, in central Pennsylvania. Future validation studies should include more diverse populations. |
format | Online Article Text |
id | pubmed-10095415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10095415 2023-04-13 Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches Abedi, Vida Lambert, Clare Chaudhary, Durgesh Rieder, Emily Avula, Venkatesh Hwang, Wenke Li, Jiang Zand, Ramin J Clin Med Article Introduction: The cut-point for defining the age of young ischemic stroke (IS) is clinically and epidemiologically important, yet it is arbitrary and differs across studies. In this study, we leveraged electronic health records (EHRs) and data science techniques to estimate an optimal cut-point for defining the age of young IS. Methods: Patient-level EHRs were extracted from 13 hospitals in Pennsylvania, and used in two parallel approaches. The first approach included ICD9/10, from IS patients to group comorbidities, and computed similarity scores between every patient pair. We determined the optimal age of young IS by analyzing the trend of patient similarity with respect to their clinical profile for different ages of index IS. The second approach used the IS cohort and control (without IS), and built three sets of machine-learning models—generalized linear regression (GLM), random forest (RF), and XGBoost (XGB)—to classify patients for seventeen age groups. After extracting feature importance from the models, we determined the optimal age of young IS by analyzing the pattern of comorbidity with respect to the age of index IS. Both approaches were completed separately for male and female patients. Results: The stroke cohort contained 7555 ISs, and the control included 31,067 patients. In the first approach, the optimal age of young stroke was 53.7 and 51.0 years in female and male patients, respectively. In the second approach, we created 102 models, based on three algorithms, 17 age brackets, and two sexes. The optimal age was 53 (GLM), 52 (RF), and 54 (XGB) for female, and 52 (GLM and RF) and 53 (RF) for male patients. Different age and sex groups exhibited different comorbidity patterns. Discussion: Using a data-driven approach, we determined the age of young stroke to be 54 years for women and 52 years for men in our mainly rural population, in central Pennsylvania. Future validation studies should include more diverse populations. MDPI 2023-03-30 /pmc/articles/PMC10095415/ /pubmed/37048683 http://dx.doi.org/10.3390/jcm12072600 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Abedi, Vida Lambert, Clare Chaudhary, Durgesh Rieder, Emily Avula, Venkatesh Hwang, Wenke Li, Jiang Zand, Ramin Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches |
title | Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches |
title_full | Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches |
title_fullStr | Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches |
title_full_unstemmed | Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches |
title_short | Defining the Age of Young Ischemic Stroke Using Data-Driven Approaches |
title_sort | defining the age of young ischemic stroke using data-driven approaches |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10095415/ https://www.ncbi.nlm.nih.gov/pubmed/37048683 http://dx.doi.org/10.3390/jcm12072600 |
work_keys_str_mv | AT abedivida definingtheageofyoungischemicstrokeusingdatadrivenapproaches AT lambertclare definingtheageofyoungischemicstrokeusingdatadrivenapproaches AT chaudharydurgesh definingtheageofyoungischemicstrokeusingdatadrivenapproaches AT riederemily definingtheageofyoungischemicstrokeusingdatadrivenapproaches AT avulavenkatesh definingtheageofyoungischemicstrokeusingdatadrivenapproaches AT hwangwenke definingtheageofyoungischemicstrokeusingdatadrivenapproaches AT lijiang definingtheageofyoungischemicstrokeusingdatadrivenapproaches AT zandramin definingtheageofyoungischemicstrokeusingdatadrivenapproaches |
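
Illustrative note on the methods summarized in the description field: the abstract mentions similarity scores computed between every patient pair and a grid of 102 models (three algorithms, 17 age brackets, two sexes). The sketch below is only a rough illustration of those two ideas. The record does not name the similarity metric, so Jaccard similarity over comorbidity sets is an assumption made here, and the patient identifiers, comorbidity labels, and bracket indices are hypothetical placeholders rather than data from the study.

```python
from itertools import combinations, product


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two comorbidity sets (assumed metric, not confirmed by the record)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical patients mapped to hypothetical comorbidity groups.
patients = {
    "p1": {"hypertension", "diabetes"},
    "p2": {"hypertension", "atrial_fibrillation"},
    "p3": {"migraine"},
}

# One similarity score for every unordered patient pair.
pair_scores = {
    (i, j): jaccard(patients[i], patients[j])
    for i, j in combinations(sorted(patients), 2)
}

# Model grid: three algorithms x 17 age brackets x two sexes.
ALGORITHMS = ("GLM", "RF", "XGB")
AGE_BRACKETS = tuple(range(17))  # placeholder indices; the actual brackets are not given in the record
SEXES = ("female", "male")
model_grid = list(product(ALGORITHMS, AGE_BRACKETS, SEXES))

if __name__ == "__main__":
    for pair, score in pair_scores.items():
        print(pair, round(score, 3))
    print("models to fit:", len(model_grid))
```

The grid size follows directly from the counts stated in the abstract (3 × 17 × 2 = 102); fitting the models themselves is outside the scope of this sketch.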