Automated extraction of clinical traits of multiple sclerosis in electronic medical records

Bibliographic Details
Main Authors: Davis, Mary F, Sriram, Subramaniam, Bush, William S, Denny, Joshua C, Haines, Jonathan L
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861927/
https://www.ncbi.nlm.nih.gov/pubmed/24148554
http://dx.doi.org/10.1136/amiajnl-2013-001999
_version_ 1782295697132879872
author Davis, Mary F
Sriram, Subramaniam
Bush, William S
Denny, Joshua C
Haines, Jonathan L
author_sort Davis, Mary F
collection PubMed
description OBJECTIVES: The clinical course of multiple sclerosis (MS) is highly variable, and research data collection is costly and time consuming. We evaluated natural language processing techniques applied to electronic medical records (EMR) to identify MS patients and the key clinical traits of their disease course. MATERIALS AND METHODS: We used four algorithms based on ICD-9 codes, text keywords, and medications to identify individuals with MS from a de-identified, research version of the EMR at Vanderbilt University. Using a training dataset of the records of 899 individuals, algorithms were constructed to identify and extract detailed information regarding the clinical course of MS from the text of the medical records, including clinical subtype, presence of oligoclonal bands, year of diagnosis, year and origin of first symptom, Expanded Disability Status Scale (EDSS) scores, timed 25-foot walk scores, and MS medications. Algorithms were evaluated on a test set validated by two independent reviewers. RESULTS: We identified 5789 individuals with MS. For all clinical traits extracted, precision was at least 87% and specificity was greater than 80%. Recall values for clinical subtype, EDSS scores, and timed 25-foot walk scores were greater than 80%. DISCUSSION AND CONCLUSION: This collection of clinical data represents one of the largest databases of detailed, clinical traits available for research on MS. This work demonstrates that detailed clinical information is recorded in the EMR and can be extracted for research purposes with high reliability.
format Online
Article
Text
id pubmed-3861927
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-3861927 2013-12-13 J Am Med Inform Assoc Research and Applications
BMJ Publishing Group 2013-12 2013-10-22 /pmc/articles/PMC3861927/ /pubmed/24148554 http://dx.doi.org/10.1136/amiajnl-2013-001999 Text en
Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/
title Automated extraction of clinical traits of multiple sclerosis in electronic medical records
topic Research and Applications