
Assessment of Natural Language Processing Methods for Ascertaining the Expanded Disability Status Scale Score From the Electronic Health Records of Patients With Multiple Sclerosis: Algorithm Development and Validation Study

BACKGROUND: The Expanded Disability Status Scale (EDSS) score is a widely used measure to monitor disability progression in people with multiple sclerosis (MS). However, extracting and deriving the EDSS score from unstructured electronic health records can be time-consuming. OBJECTIVE: We aimed to c...

Bibliographic Details
Main Authors:	Yang, Zhen; Pou-Prom, Chloé; Jones, Ashley; Banning, Michaelia; Dai, David; Mamdani, Muhammad; Oh, Jiwon; Antoniou, Tony
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2022
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8792771/ https://www.ncbi.nlm.nih.gov/pubmed/35019849 http://dx.doi.org/10.2196/25157

_version_	1784640451916070912
author	Yang, Zhen Pou-Prom, Chloé Jones, Ashley Banning, Michaelia Dai, David Mamdani, Muhammad Oh, Jiwon Antoniou, Tony
author_sort	Yang, Zhen
collection	PubMed
description	BACKGROUND: The Expanded Disability Status Scale (EDSS) score is a widely used measure to monitor disability progression in people with multiple sclerosis (MS). However, extracting and deriving the EDSS score from unstructured electronic health records can be time-consuming. OBJECTIVE: We aimed to compare rule-based and deep learning natural language processing algorithms for detecting and predicting the total EDSS score and EDSS functional system subscores from the electronic health records of patients with MS. METHODS: We studied 17,452 electronic health records of 4906 MS patients followed at one of Canada’s largest MS clinics between June 2015 and July 2019. We randomly divided the records into training (80%) and test (20%) data sets, and compared the performance characteristics of 3 natural language processing models. First, we applied a rule-based approach, extracting the EDSS score from sentences containing the keyword “EDSS.” Next, we trained a convolutional neural network (CNN) model to predict the 19 half-step increments of the EDSS score. Finally, we used a combined rule-based–CNN model. For each approach, we determined the accuracy, precision, recall, and F-score compared with the reference standard, which was manually labeled EDSS scores in the clinic database. RESULTS: Overall, the combined keyword-CNN model demonstrated the best performance, with accuracy, precision, recall, and an F-score of 0.90, 0.83, 0.83, and 0.83 respectively. Respective figures for the rule-based and CNN models individually were 0.57, 0.91, 0.65, and 0.70, and 0.86, 0.70, 0.70, and 0.70. Because of missing data, the model performance for EDSS subscores was lower than that for the total EDSS score. Performance improved when considering notes with known values of the EDSS subscores. CONCLUSIONS: A combined keyword-CNN natural language processing model can extract and accurately predict EDSS scores from patient records. This approach can be automated for efficient information extraction in clinical and research settings.
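The rule-based step described in the abstract (reading the score from text near the keyword "EDSS") and the combined model could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the regular expression, the 40-character window, the half-point validity check, the function names `extract_edss_rule_based` and `combined_edss`, and the rule-first, classifier-second fallback order are all assumptions, and `classifier` is a hypothetical stand-in for the trained CNN.

```python
import re
from typing import Callable, Optional

# Assumed pattern: the keyword "EDSS", then up to 40 characters containing
# no digit and no sentence-ending punctuation, then a number such as 3 or 3.5.
EDSS_PATTERN = re.compile(
    r"\bEDSS\b[^\d.!?\n]{0,40}(\d+(?:\.\d+)?)", re.IGNORECASE
)


def is_valid_edss(value: float) -> bool:
    """Assumed check: value lies between 0 and 10 on a half-point grid."""
    return 0.0 <= value <= 10.0 and (value * 2).is_integer()


def extract_edss_rule_based(note: str) -> Optional[float]:
    """Return the first plausible score found after the keyword, else None."""
    for match in EDSS_PATTERN.finditer(note):
        value = float(match.group(1))
        if is_valid_edss(value):
            return value
    return None


def combined_edss(note: str, classifier: Callable[[str], float]) -> float:
    """Use the keyword rule when it fires; otherwise defer to the classifier.

    The fallback order is an assumption; `classifier` is a hypothetical
    placeholder for a trained model that maps a note to a score.
    """
    rule_value = extract_edss_rule_based(note)
    return rule_value if rule_value is not None else classifier(note)


if __name__ == "__main__":
    print(extract_edss_rule_based("Today the EDSS score is 3.5."))
    print(combined_edss("Walks unaided.", lambda note: 2.0))
```

A rule of this kind only answers when the keyword and a usable number appear together and returns nothing otherwise, which is one plausible reason to pair it with a classifier that always produces a prediction.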
format	Online Article Text
id	pubmed-8792771
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-8792771 2022-02-03 JMIR Med Inform Original Paper JMIR Publications 2022-01-12 /pmc/articles/PMC8792771/ /pubmed/35019849 http://dx.doi.org/10.2196/25157 Text en ©Zhen Yang, Chloé Pou-Prom, Ashley Jones, Michaelia Banning, David Dai, Muhammad Mamdani, Jiwon Oh, Tony Antoniou. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 12.01.2022. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Assessment of Natural Language Processing Methods for Ascertaining the Expanded Disability Status Scale Score From the Electronic Health Records of Patients With Multiple Sclerosis: Algorithm Development and Validation Study
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8792771/ https://www.ncbi.nlm.nih.gov/pubmed/35019849 http://dx.doi.org/10.2196/25157

Similar Items