Automated extraction of clinical traits of multiple sclerosis in electronic medical records
OBJECTIVES: The clinical course of multiple sclerosis (MS) is highly variable, and research data collection is costly and time consuming. We evaluated natural language processing techniques applied to electronic medical records (EMR) to identify MS patients and the key clinical traits of their disea...
Main authors: | Davis, Mary F; Sriram, Subramaniam; Bush, William S; Denny, Joshua C; Haines, Jonathan L |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group, 2013 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861927/ https://www.ncbi.nlm.nih.gov/pubmed/24148554 http://dx.doi.org/10.1136/amiajnl-2013-001999 |
_version_ | 1782295697132879872 |
---|---|
author | Davis, Mary F Sriram, Subramaniam Bush, William S Denny, Joshua C Haines, Jonathan L |
author_sort | Davis, Mary F |
collection | PubMed |
description | OBJECTIVES: The clinical course of multiple sclerosis (MS) is highly variable, and research data collection is costly and time consuming. We evaluated natural language processing techniques applied to electronic medical records (EMR) to identify MS patients and the key clinical traits of their disease course. MATERIALS AND METHODS: We used four algorithms based on ICD-9 codes, text keywords, and medications to identify individuals with MS from a de-identified, research version of the EMR at Vanderbilt University. Using a training dataset of the records of 899 individuals, algorithms were constructed to identify and extract detailed information regarding the clinical course of MS from the text of the medical records, including clinical subtype, presence of oligoclonal bands, year of diagnosis, year and origin of first symptom, Expanded Disability Status Scale (EDSS) scores, timed 25-foot walk scores, and MS medications. Algorithms were evaluated on a test set validated by two independent reviewers. RESULTS: We identified 5789 individuals with MS. For all clinical traits extracted, precision was at least 87% and specificity was greater than 80%. Recall values for clinical subtype, EDSS scores, and timed 25-foot walk scores were greater than 80%. DISCUSSION AND CONCLUSION: This collection of clinical data represents one of the largest databases of detailed, clinical traits available for research on MS. This work demonstrates that detailed clinical information is recorded in the EMR and can be extracted for research purposes with high reliability. |
format | Online Article Text |
id | pubmed-3861927 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-3861927 2013-12-13 Automated extraction of clinical traits of multiple sclerosis in electronic medical records Davis, Mary F Sriram, Subramaniam Bush, William S Denny, Joshua C Haines, Jonathan L J Am Med Inform Assoc Research and Applications OBJECTIVES: The clinical course of multiple sclerosis (MS) is highly variable, and research data collection is costly and time consuming. We evaluated natural language processing techniques applied to electronic medical records (EMR) to identify MS patients and the key clinical traits of their disease course. MATERIALS AND METHODS: We used four algorithms based on ICD-9 codes, text keywords, and medications to identify individuals with MS from a de-identified, research version of the EMR at Vanderbilt University. Using a training dataset of the records of 899 individuals, algorithms were constructed to identify and extract detailed information regarding the clinical course of MS from the text of the medical records, including clinical subtype, presence of oligoclonal bands, year of diagnosis, year and origin of first symptom, Expanded Disability Status Scale (EDSS) scores, timed 25-foot walk scores, and MS medications. Algorithms were evaluated on a test set validated by two independent reviewers. RESULTS: We identified 5789 individuals with MS. For all clinical traits extracted, precision was at least 87% and specificity was greater than 80%. Recall values for clinical subtype, EDSS scores, and timed 25-foot walk scores were greater than 80%. DISCUSSION AND CONCLUSION: This collection of clinical data represents one of the largest databases of detailed, clinical traits available for research on MS. This work demonstrates that detailed clinical information is recorded in the EMR and can be extracted for research purposes with high reliability.
BMJ Publishing Group 2013-12 2013-10-22 /pmc/articles/PMC3861927/ /pubmed/24148554 http://dx.doi.org/10.1136/amiajnl-2013-001999 Text en Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/ |
spellingShingle | Research and Applications Davis, Mary F Sriram, Subramaniam Bush, William S Denny, Joshua C Haines, Jonathan L Automated extraction of clinical traits of multiple sclerosis in electronic medical records |
title | Automated extraction of clinical traits of multiple sclerosis in electronic medical records |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3861927/ https://www.ncbi.nlm.nih.gov/pubmed/24148554 http://dx.doi.org/10.1136/amiajnl-2013-001999 |
work_keys_str_mv | AT davismaryf automatedextractionofclinicaltraitsofmultiplesclerosisinelectronicmedicalrecords AT sriramsubramaniam automatedextractionofclinicaltraitsofmultiplesclerosisinelectronicmedicalrecords AT bushwilliams automatedextractionofclinicaltraitsofmultiplesclerosisinelectronicmedicalrecords AT dennyjoshuac automatedextractionofclinicaltraitsofmultiplesclerosisinelectronicmedicalrecords AT hainesjonathanl automatedextractionofclinicaltraitsofmultiplesclerosisinelectronicmedicalrecords |
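
The abstract in this record reports precision, recall, and specificity for each extracted clinical trait, measured against a reviewer-validated test set. As a rough illustration of how those three figures relate to one another, the sketch below computes them from paired predicted and reviewed labels. It is not the authors' evaluation code: the function names, the `Confusion` type, and the sample labels are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Confusion:
    """Counts of agreement between an algorithm and a reviewer (hypothetical type)."""
    tp: int  # algorithm says trait present, reviewer agrees
    fp: int  # algorithm says present, reviewer says absent
    tn: int  # algorithm says absent, reviewer agrees
    fn: int  # algorithm says absent, reviewer says present


def tally(predicted: Sequence[bool], reviewed: Sequence[bool]) -> Confusion:
    """Count true/false positives and negatives over paired labels."""
    if len(predicted) != len(reviewed):
        raise ValueError("label sequences must have the same length")
    tp = sum(p and r for p, r in zip(predicted, reviewed))
    fp = sum(p and not r for p, r in zip(predicted, reviewed))
    tn = sum(not p and not r for p, r in zip(predicted, reviewed))
    fn = sum(not p and r for p, r in zip(predicted, reviewed))
    return Confusion(tp, fp, tn, fn)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def precision(c: Confusion) -> float:
    """Share of records flagged by the algorithm that the reviewer confirmed."""
    return _ratio(c.tp, c.tp + c.fp)


def recall(c: Confusion) -> float:
    """Share of reviewer-confirmed records that the algorithm found."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Confusion) -> float:
    """Share of reviewer-rejected records that the algorithm also rejected."""
    return _ratio(c.tn, c.tn + c.fp)


# Made-up labels for five records; these are not data from the study.
predicted = [True, True, False, False, True]
reviewed = [True, False, False, True, True]
counts = tally(predicted, reviewed)
```

With these made-up labels, `counts` holds two true positives, one false positive, one true negative, and one false negative, so `precision(counts)` and `recall(counts)` are both 2/3 and `specificity(counts)` is 1/2.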