
Retrospective comparison of traditional and artificial intelligence-based heart failure phenotyping in a US health system to enable real-world evidence

OBJECTIVE: Quantitatively evaluate the quality of data underlying real-world evidence (RWE) in heart failure (HF). DESIGN: Retrospective comparison of accuracy in identifying patients with HF and phenotypic information was made using traditional (ie, structured query language applied to structured e...


Bibliographic Details
Main Authors:	Garan, Arthur Reshad, Monda, Keri L, Dent-Acosta, Ricardo E, Riskin, Daniel J, Gluckman, Ty J
Format:	Online Article Text
Language:	English
Published:	BMJ Publishing Group 2023
Subjects:	Health Informatics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10414071/ https://www.ncbi.nlm.nih.gov/pubmed/37558448 http://dx.doi.org/10.1136/bmjopen-2023-073178

author	Garan, Arthur Reshad Monda, Keri L Dent-Acosta, Ricardo E Riskin, Daniel J Gluckman, Ty J
author_sort	Garan, Arthur Reshad
collection	PubMed
description	OBJECTIVE: Quantitatively evaluate the quality of data underlying real-world evidence (RWE) in heart failure (HF). DESIGN: Retrospective comparison of accuracy in identifying patients with HF and phenotypic information was made using traditional (ie, structured query language applied to structured electronic health record (EHR) data) and advanced (ie, artificial intelligence (AI) applied to unstructured EHR data) RWE approaches. The performance of each approach was measured by the harmonic mean of precision and recall (F1 score) using manual annotation of medical records as a reference standard. SETTING: EHR data from a large academic healthcare system in North America between 2015 and 2019, with an expected catchment of approximately 500 000 patients. POPULATION: 4288 encounters for 1155 patients aged 18–85 years, with 472 patients identified as having HF. OUTCOME MEASURES: HF and associated concepts, such as comorbidities, left ventricular ejection fraction, and selected medications. RESULTS: The average F1 scores across 19 HF-specific concepts were 49.0% and 94.1% for the traditional and advanced approaches, respectively (p<0.001 for all concepts with available data). The absolute difference in F1 score between approaches was 45.1% (98.1% relative increase in F1 score using the advanced approach). The advanced approach achieved superior F1 scores for HF presence, phenotype and associated comorbidities. Some phenotypes, such as HF with preserved ejection fraction, revealed dramatic differences in extraction accuracy based on technology applied, with a 4.9% F1 score when using natural language processing (NLP) alone and a 91.0% F1 score when using NLP plus AI-based inference. CONCLUSIONS: A traditional RWE generation approach resulted in low data quality in patients with HF. While an advanced approach demonstrated high accuracy, the results varied dramatically based on extraction techniques. For future studies, advanced approaches and accuracy measurement may be required to ensure data are fit-for-purpose.
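The abstract scores each approach by the harmonic mean of precision and recall. As a minimal sketch of that metric (not the authors' code), the Python below computes it from precision and recall or from raw counts. The function names and the example counts are hypothetical and are not taken from the study. Only the two average scores, 49.0% and 94.1%, come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives and false negatives,
    counted against a reference standard such as manual annotation."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return f1_score(precision, recall)


# Hypothetical counts for a single concept (NOT from the study):
# precision = 80/100 and recall = 80/100.
example_f1 = f1_from_counts(tp=80, fp=20, fn=20)

# Average F1 scores reported in the abstract, in percent.
traditional_avg_f1 = 49.0
advanced_avg_f1 = 94.1
absolute_difference = round(advanced_avg_f1 - traditional_avg_f1, 1)

if __name__ == "__main__":
    print(f"example F1: {example_f1:.3f}")
    print(f"absolute difference: {absolute_difference} percentage points")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, an approach with high precision but very low recall (or the reverse) still receives a low score.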
format	Online Article Text
id	pubmed-10414071
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	BMJ Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-10414071 2023-08-11 BMJ Open Health Informatics BMJ Publishing Group 2023-08-09 /pmc/articles/PMC10414071/ /pubmed/37558448 http://dx.doi.org/10.1136/bmjopen-2023-073178 Text en © Author(s) (or their employer(s)) 2023. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
title	Retrospective comparison of traditional and artificial intelligence-based heart failure phenotyping in a US health system to enable real-world evidence
topic	Health Informatics
