Hypertension identification using inpatient clinical notes from electronic medical records: an explainable, data-driven algorithm study
BACKGROUND: Case identification is important for health services research, measuring health system performance and risk adjustment, but existing methods based on manual chart review or diagnosis codes can be expensive, time consuming or of limited validity. We aimed to develop a hypertension case de...
Main authors: | Martin, Elliot A.; D’Souza, Adam G.; Lee, Seungwon; Doktorchik, Chelsea; Eastwood, Cathy A.; Quan, Hude |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | CMA Impact Inc. 2023 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933992/ https://www.ncbi.nlm.nih.gov/pubmed/36787990 http://dx.doi.org/10.9778/cmajo.20210170 |
_version_ | 1784889786641678336 |
---|---|
author | Martin, Elliot A.; D’Souza, Adam G.; Lee, Seungwon; Doktorchik, Chelsea; Eastwood, Cathy A.; Quan, Hude |
collection | PubMed |
description | BACKGROUND: Case identification is important for health services research, measuring health system performance and risk adjustment, but existing methods based on manual chart review or diagnosis codes can be expensive, time consuming or of limited validity. We aimed to develop a hypertension case definition in electronic medical records (EMRs) for inpatient clinical notes using machine learning. METHODS: A cohort of patients 18 years of age or older who were discharged from 1 of 3 Calgary acute care facilities (1 academic hospital and 2 community hospitals) between Jan. 1 and June 30, 2015, were randomly selected, and we compared the performance of EMR phenotype algorithms developed using machine learning with an algorithm based on the Canadian version of the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD), in identifying patients with hypertension. Hypertension status was determined by chart review, the machine-learning algorithms used EMR notes and the ICD algorithm used the Discharge Abstract Database (Canadian Institute for Health Information). RESULTS: Of our study sample (n = 3040), 1475 (48.5%) patients had hypertension. The group with hypertension was older (median age of 71.0 yr v. 52.5 yr for those patients without hypertension) and had fewer females (710 [48.2%] v. 764 [52.3%]). Our final EMR-based models had higher sensitivity than the ICD algorithm (> 90% v. 47%), while maintaining high positive predictive values (> 90% v. 97%). INTERPRETATION: We found that hypertension tends to have clear documentation in EMRs and is well classified by concept search on free text. Machine learning can provide insights into how and where conditions are documented in EMRs and suggest nonmachine-learning phenotypes to implement. |
format | Online Article Text |
id | pubmed-9933992 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | CMA Impact Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9933992 2023-02-17 Hypertension identification using inpatient clinical notes from electronic medical records: an explainable, data-driven algorithm study. Martin, Elliot A.; D’Souza, Adam G.; Lee, Seungwon; Doktorchik, Chelsea; Eastwood, Cathy A.; Quan, Hude. CMAJ Open. Research. CMA Impact Inc. 2023-02-14 /pmc/articles/PMC9933992/ /pubmed/36787990 http://dx.doi.org/10.9778/cmajo.20210170 Text en © 2023 CMA Impact Inc. or its licensors. This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY-NC-ND 4.0) licence, which permits use, distribution and reproduction in any medium, provided that the original publication is properly cited, the use is noncommercial (i.e., research or educational use), and no modifications or adaptations are made. See: https://creativecommons.org/licenses/by-nc-nd/4.0/ |
title | Hypertension identification using inpatient clinical notes from electronic medical records: an explainable, data-driven algorithm study |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9933992/ https://www.ncbi.nlm.nih.gov/pubmed/36787990 http://dx.doi.org/10.9778/cmajo.20210170 |
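
The abstract compares case definitions by sensitivity and positive predictive value (PPV) against chart review, and notes that hypertension is well classified by concept search on free text. The following is a minimal sketch of that kind of evaluation, not the authors' algorithm: the term list, the `flag_note` and `sensitivity_and_ppv` helpers, and the five example notes with their chart-review labels are all hypothetical and made up for illustration.

```python
import re

# Hypothetical term list; the study's actual concepts are not given in this record.
HYPERTENSION_PATTERN = re.compile(r"\b(hypertension|hypertensive|htn)\b", re.IGNORECASE)


def flag_note(note: str) -> bool:
    """Return True if the note mentions a hypertension term (no negation handling)."""
    return bool(HYPERTENSION_PATTERN.search(note))


def sensitivity_and_ppv(predicted, actual):
    """Compute sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


# Made-up notes and chart-review labels, for illustration only.
notes = [
    "Known HTN, on ramipril.",
    "Admitted with pneumonia.",
    "History of hypertension and type 2 diabetes.",
    "Family history of hypertension; patient normotensive.",
    "Elevated blood pressure treated with amlodipine.",
]
chart_review = [True, False, True, False, True]

predicted = [flag_note(n) for n in notes]
sens, ppv = sensitivity_and_ppv(predicted, chart_review)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

In this toy data the fourth note is a false positive (a family-history mention) and the fifth is a false negative (no listed term appears), which shows how the choice of terms and the lack of context handling move the two metrics in opposite directions.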