Hypertension identification using inpatient clinical notes from electronic medical records: an explainable, data-driven algorithm study

BACKGROUND: Case identification is important for health services research, measuring health system performance and risk adjustment, but existing methods based on manual chart review or diagnosis codes can be expensive, time consuming or of limited validity. We aimed to develop a hypertension case de...

Bibliographic Details
Main Authors: Martin, Elliot A., D’Souza, Adam G., Lee, Seungwon, Doktorchik, Chelsea, Eastwood, Cathy A., Quan, Hude
Format: Online Article Text
Language: English
Published: CMA Impact Inc. 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933992/
https://www.ncbi.nlm.nih.gov/pubmed/36787990
http://dx.doi.org/10.9778/cmajo.20210170
_version_ 1784889786641678336
author Martin, Elliot A.
D’Souza, Adam G.
Lee, Seungwon
Doktorchik, Chelsea
Eastwood, Cathy A.
Quan, Hude
collection PubMed
description
BACKGROUND: Case identification is important for health services research, measuring health system performance and risk adjustment, but existing methods based on manual chart review or diagnosis codes can be expensive, time consuming or of limited validity. We aimed to develop a hypertension case definition in electronic medical records (EMRs) for inpatient clinical notes using machine learning.
METHODS: A cohort of patients 18 years of age or older who were discharged from 1 of 3 Calgary acute care facilities (1 academic hospital and 2 community hospitals) between Jan. 1 and June 30, 2015, were randomly selected, and we compared the performance of EMR phenotype algorithms developed using machine learning with an algorithm based on the Canadian version of the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD), in identifying patients with hypertension. Hypertension status was determined by chart review, the machine-learning algorithms used EMR notes and the ICD algorithm used the Discharge Abstract Database (Canadian Institute for Health Information).
RESULTS: Of our study sample (n = 3040), 1475 (48.5%) patients had hypertension. The group with hypertension was older (median age of 71.0 yr v. 52.5 yr for those patients without hypertension) and had fewer females (710 [48.2%] v. 764 [52.3%]). Our final EMR-based models had higher sensitivity than the ICD algorithm (> 90% v. 47%), while maintaining high positive predictive values (> 90% v. 97%).
INTERPRETATION: We found that hypertension tends to have clear documentation in EMRs and is well classified by concept search on free text. Machine learning can provide insights into how and where conditions are documented in EMRs and suggest nonmachine-learning phenotypes to implement.
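The abstract compares algorithms by sensitivity and positive predictive value (PPV) against chart review as the reference standard. As a minimal sketch of how those two metrics are defined, the following Python computes them from confusion-matrix counts. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of reference-positive patients that the algorithm flags."""
    return tp / (tp + fn)


def ppv(tp: int, fp: int) -> float:
    """Fraction of algorithm-flagged patients that are reference-positive."""
    return tp / (tp + fp)


# Hypothetical counts for illustration only (NOT study data):
# tp = flagged and truly hypertensive, fp = flagged but not hypertensive,
# fn = hypertensive but missed by the algorithm.
tp, fp, fn = 90, 5, 10

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"PPV         = {ppv(tp, fp):.3f}")
```

A low-sensitivity, high-PPV algorithm (the pattern the abstract reports for the ICD-based definition) misses many true cases but is rarely wrong about the cases it does flag.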
format Online
Article
Text
id pubmed-9933992
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher CMA Impact Inc.
record_format MEDLINE/PubMed
spelling pubmed-9933992 2023-02-17 Hypertension identification using inpatient clinical notes from electronic medical records: an explainable, data-driven algorithm study Martin, Elliot A. D’Souza, Adam G. Lee, Seungwon Doktorchik, Chelsea Eastwood, Cathy A. Quan, Hude CMAJ Open Research CMA Impact Inc. 2023-02-14 /pmc/articles/PMC9933992/ /pubmed/36787990 http://dx.doi.org/10.9778/cmajo.20210170 Text en © 2023 CMA Impact Inc. or its licensors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY-NC-ND 4.0) licence, which permits use, distribution and reproduction in any medium, provided that the original publication is properly cited, the use is noncommercial (i.e., research or educational use), and no modifications or adaptations are made. See: https://creativecommons.org/licenses/by-nc-nd/4.0/
title Hypertension identification using inpatient clinical notes from electronic medical records: an explainable, data-driven algorithm study
topic Research