
Extraction of Geriatric Syndromes From Electronic Health Record Clinical Notes: Assessment of Statistical Natural Language Processing Methods

BACKGROUND: Geriatric syndromes in older adults are associated with adverse outcomes. However, despite being reported in clinical notes, these syndromes are often poorly captured by diagnostic codes in the structured fields of electronic health records (EHRs) or administrative records. OBJECTIVE: We...


Bibliographic Details
Main authors:	Chen, Tao; Dredze, Mark; Weiner, Jonathan P; Hernandez, Leilani; Kimura, Joe; Kharrazi, Hadi
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2019
Subjects:	Original Paper
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6454337/ https://www.ncbi.nlm.nih.gov/pubmed/30862607 http://dx.doi.org/10.2196/13039

_version_	1783409555466616832
author	Chen, Tao; Dredze, Mark; Weiner, Jonathan P; Hernandez, Leilani; Kimura, Joe; Kharrazi, Hadi
author_sort	Chen, Tao
collection	PubMed
description	BACKGROUND: Geriatric syndromes in older adults are associated with adverse outcomes. However, despite being reported in clinical notes, these syndromes are often poorly captured by diagnostic codes in the structured fields of electronic health records (EHRs) or administrative records. OBJECTIVE: We aim to automatically determine if a patient has any geriatric syndromes by mining the free text of associated EHR clinical notes. We assessed which statistical natural language processing (NLP) techniques are most effective. METHODS: We applied conditional random fields (CRFs), a widely used machine learning algorithm, to identify each of 10 geriatric syndrome constructs in a clinical note. We assessed three sets of features and attributes for CRF operations: a base set, enhanced token, and contextual features. We trained the CRF on 3901 manually annotated notes from 85 patients, tuned the CRF on a validation set of 50 patients, and evaluated it on 50 held-out test patients. These notes were from a group of US Medicare patients over 65 years of age enrolled in a Medicare Advantage Health Maintenance Organization and cared for by a large group practice in Massachusetts. RESULTS: A final feature set was formed through comprehensive feature ablation experiments. The final CRF model performed well at patient-level determination (macroaverage F1=0.834, microaverage F1=0.851); however, performance varied by construct. For example, at phrase-partial evaluation, the CRF model worked well on constructs such as absence of fecal control (F1=0.857) and vision impairment (F1=0.798) but poorly on malnutrition (F1=0.155), weight loss (F1=0.394), and severe urinary control issues (F1=0.532). Errors were primarily due to previously unobserved words (ie, out-of-vocabulary) and a lack of context. CONCLUSIONS: This study shows that statistical NLP can be used to identify geriatric syndromes from EHR-extracted clinical notes. 
This creates new opportunities to identify patients with geriatric syndromes and study their health outcomes.
format	Online Article Text
id	pubmed-6454337
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-6454337 2019-04-26 JMIR Med Inform Original Paper JMIR Publications 2019-03-26 /pmc/articles/PMC6454337/ /pubmed/30862607 http://dx.doi.org/10.2196/13039 Text en ©Tao Chen, Mark Dredze, Jonathan P Weiner, Leilani Hernandez, Joe Kimura, Hadi Kharrazi. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 26.03.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Extraction of Geriatric Syndromes From Electronic Health Record Clinical Notes: Assessment of Statistical Natural Language Processing Methods
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6454337/ https://www.ncbi.nlm.nih.gov/pubmed/30862607 http://dx.doi.org/10.2196/13039


Similar items