
StrokeClassifier: Ischemic Stroke Etiology Classification by Ensemble Consensus Modeling Using Electronic Health Records

Determining the etiology of an acute ischemic stroke (AIS) is fundamental to secondary stroke prevention efforts but can be diagnostically challenging. We trained and validated an automated classification machine intelligence tool, StrokeClassifier, using electronic health record (EHR) text data fro...


Bibliographic Details
Main authors: Lee, Ho-Joon, Schwamm, Lee H., Sansing, Lauren, Kamel, Hooman, de Havenon, Adam, Turner, Ashby C., Sheth, Kevin N., Krishnaswamy, Smita, Brandt, Cynthia, Zhao, Hongyu, Krumholz, Harlan, Sharma, Richa
Format: Online Article Text
Language: English
Published: American Journal Experts 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635373/
https://www.ncbi.nlm.nih.gov/pubmed/37961532
http://dx.doi.org/10.21203/rs.3.rs-3367169/v1
_version_ 1785146335410782208
author Lee, Ho-Joon
Schwamm, Lee H.
Sansing, Lauren
Kamel, Hooman
de Havenon, Adam
Turner, Ashby C.
Sheth, Kevin N.
Krishnaswamy, Smita
Brandt, Cynthia
Zhao, Hongyu
Krumholz, Harlan
Sharma, Richa
author_sort Lee, Ho-Joon
collection PubMed
description Determining the etiology of an acute ischemic stroke (AIS) is fundamental to secondary stroke prevention efforts but can be diagnostically challenging. We trained and validated an automated classification machine intelligence tool, StrokeClassifier, using electronic health record (EHR) text data from 2,039 non-cryptogenic AIS patients at 2 academic hospitals to predict the 4-level outcome of stroke etiology, as determined by agreement of at least 2 board-certified vascular neurologists reviewing the stroke hospitalization EHR. StrokeClassifier is an ensemble consensus meta-model of 9 machine learning classifiers applied to features extracted from discharge summary texts by natural language processing. StrokeClassifier was externally validated on 406 discharge summaries from the MIMIC-III dataset, which were reviewed by a vascular neurologist to ascertain stroke etiology. Compared with stroke etiologies adjudicated by vascular neurologists, StrokeClassifier achieved a mean cross-validated accuracy of 0.74 (±0.01) and a weighted F1 of 0.74 (±0.01). In the MIMIC-III cohort, the accuracy and weighted F1 of StrokeClassifier were 0.70 and 0.71, respectively. SHapley Additive exPlanation analysis showed that the top 5 features contributing to stroke etiology prediction were atrial fibrillation, age, middle cerebral artery occlusion, internal carotid artery occlusion, and frontal stroke location. We then designed a certainty heuristic that deems a StrokeClassifier diagnosis confidently non-cryptogenic based on the degree of consensus among the 9 classifiers, and applied it to 788 cryptogenic patients. This reduced the percentage of cryptogenic strokes from 25.2% to 7.2% of all ischemic strokes. StrokeClassifier is a validated artificial intelligence tool that rivals the performance of vascular neurologists in classifying ischemic stroke etiology for individual patients. With further training, StrokeClassifier may have downstream applications including its use as a clinical decision support system.
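The consensus idea in the description above (9 base classifiers vote, and a diagnosis counts as confident only when enough of them agree) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `consensus_label`, the threshold of 6 out of 9, and the etiology label strings are all hypothetical, since the record names neither the four etiology classes nor the actual certainty rule.

```python
from collections import Counter


def consensus_label(votes, min_agreement=6):
    """Summarize one patient's base-classifier votes.

    votes: list of predicted etiology labels, one per base classifier.
    min_agreement: hypothetical cutoff; the real heuristic is not given
    in this record.
    Returns (majority_label, number_of_agreeing_classifiers, is_confident).
    """
    if not votes:
        raise ValueError("votes must not be empty")
    # most_common(1) yields the label with the highest vote count;
    # ties resolve to the label that appeared first in the list.
    label, count = Counter(votes).most_common(1)[0]
    return label, count, count >= min_agreement


# Hypothetical votes from 9 classifiers for a single discharge summary.
votes = ["cardioembolism"] * 7 + ["large_artery"] * 2
label, agreeing, confident = consensus_label(votes)
print(label, agreeing, confident)
```

Under this sketch, a patient whose votes split 4/3/2 would keep the majority label but would not be flagged as confident, which mirrors the idea of reclassifying a cryptogenic case only when consensus is strong.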
format Online
Article
Text
id pubmed-10635373
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Journal Experts
record_format MEDLINE/PubMed
spelling pubmed-10635373 2023-11-13 StrokeClassifier: Ischemic Stroke Etiology Classification by Ensemble Consensus Modeling Using Electronic Health Records Lee, Ho-Joon Schwamm, Lee H. Sansing, Lauren Kamel, Hooman de Havenon, Adam Turner, Ashby C. Sheth, Kevin N. Krishnaswamy, Smita Brandt, Cynthia Zhao, Hongyu Krumholz, Harlan Sharma, Richa Res Sq Article Determining the etiology of an acute ischemic stroke (AIS) is fundamental to secondary stroke prevention efforts but can be diagnostically challenging. We trained and validated an automated classification machine intelligence tool, StrokeClassifier, using electronic health record (EHR) text data from 2,039 non-cryptogenic AIS patients at 2 academic hospitals to predict the 4-level outcome of stroke etiology, as determined by agreement of at least 2 board-certified vascular neurologists reviewing the stroke hospitalization EHR. StrokeClassifier is an ensemble consensus meta-model of 9 machine learning classifiers applied to features extracted from discharge summary texts by natural language processing. StrokeClassifier was externally validated on 406 discharge summaries from the MIMIC-III dataset, which were reviewed by a vascular neurologist to ascertain stroke etiology. Compared with stroke etiologies adjudicated by vascular neurologists, StrokeClassifier achieved a mean cross-validated accuracy of 0.74 (±0.01) and a weighted F1 of 0.74 (±0.01). In the MIMIC-III cohort, the accuracy and weighted F1 of StrokeClassifier were 0.70 and 0.71, respectively. SHapley Additive exPlanation analysis showed that the top 5 features contributing to stroke etiology prediction were atrial fibrillation, age, middle cerebral artery occlusion, internal carotid artery occlusion, and frontal stroke location. We then designed a certainty heuristic that deems a StrokeClassifier diagnosis confidently non-cryptogenic based on the degree of consensus among the 9 classifiers, and applied it to 788 cryptogenic patients. This reduced the percentage of cryptogenic strokes from 25.2% to 7.2% of all ischemic strokes. StrokeClassifier is a validated artificial intelligence tool that rivals the performance of vascular neurologists in classifying ischemic stroke etiology for individual patients. With further training, StrokeClassifier may have downstream applications including its use as a clinical decision support system. American Journal Experts 2023-10-31 /pmc/articles/PMC10635373/ /pubmed/37961532 http://dx.doi.org/10.21203/rs.3.rs-3367169/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title StrokeClassifier: Ischemic Stroke Etiology Classification by Ensemble Consensus Modeling Using Electronic Health Records
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635373/
https://www.ncbi.nlm.nih.gov/pubmed/37961532
http://dx.doi.org/10.21203/rs.3.rs-3367169/v1