
Determining Multiple Sclerosis Phenotype from Electronic Medical Records

BACKGROUND: Multiple sclerosis (MS), a central nervous system disease in which nerve signals are disrupted by scarring and demyelination, is classified into phenotypes depending on the patterns of cognitive or physical impairment progression: relapsing-remitting MS (RRMS), primary-progressive MS (PP...


Bibliographic Details
Main authors: Nelson, Richard E., Butler, Jorie, LaFleur, Joanne, Knippenberg, Kristin, C. Kamauu, Aaron W., DuVall, Scott L.
Format: Online Article Text
Language: English
Published: Academy of Managed Care Pharmacy 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10398245/
https://www.ncbi.nlm.nih.gov/pubmed/27882837
http://dx.doi.org/10.18553/jmcp.2016.22.12.1377
author Nelson, Richard E.
Butler, Jorie
LaFleur, Joanne
Knippenberg, Kristin
C. Kamauu, Aaron W.
DuVall, Scott L.
collection PubMed
description BACKGROUND: Multiple sclerosis (MS), a central nervous system disease in which nerve signals are disrupted by scarring and demyelination, is classified into phenotypes depending on the patterns of cognitive or physical impairment progression: relapsing-remitting MS (RRMS), primary-progressive MS (PPMS), secondary-progressive MS (SPMS), or progressive-relapsing MS (PRMS). The phenotype is important in managing the disease and determining appropriate treatment. The ICD-9-CM code 340.0 is uninformative about MS phenotype, which increases the difficulty of studying the effects of phenotype on disease. OBJECTIVE: To identify MS phenotype using natural language processing (NLP) techniques on progress notes and other clinical text in the electronic medical record (EMR). METHODS: Patients with at least 2 ICD-9-CM codes for MS (340.0) from 1999 through 2010 were identified from nationwide EMR data in the Department of Veterans Affairs. Clinical experts were interviewed for possible keywords and phrases denoting MS phenotype in order to develop a data dictionary for NLP. For each patient, NLP was used to search EMR clinical notes dated on or after the first MS diagnosis for these keywords and phrases. Mentions of phenotype-related keywords and phrases were analyzed in context to remove those that were negated (e.g., “not relapsing-remitting”) or unrelated to MS (e.g., “RR” meaning “respiratory rate”). One thousand mentions of MS phenotype were validated, and all records of 150 patients were reviewed for missed mentions. RESULTS: There were 7,756 MS patients identified by ICD-9-CM code 340.0. MS phenotype was identified for 2,854 (36.8%) patients, with 1,836 (64.3%) of those having just 1 phenotype mentioned in their EMR clinical notes: 1,118 (39.2%) RRMS, 325 (11.4%) PPMS, 374 (13.1%) SPMS, and 19 (0.7%) PRMS. A total of 747 patients (26.2%) had 2 phenotypes, the most common being 459 patients (16.1%) with RRMS and SPMS. A total of 213 patients (7.5%) had 3 phenotypes, and 58 patients (2.0%) had 4 phenotypes mentioned in their EMR clinical notes. Positive predictive value of phenotype identification was 93.8% with sensitivity of 94.0%. CONCLUSIONS: Phenotype was documented for slightly more than one third of MS patients, an important but disappointing finding that sets a limit on studying the effects of phenotype on MS in general. However, for cases where the phenotype was documented, NLP accurately identified the phenotypes. Having multiple phenotypes documented is consistent with disease progression. The most common cause of misidentification was ambiguity in notes written while clinicians were still trying to determine the phenotype. This study brings attention to the need for care providers to document MS phenotype more consistently and provides a solution for capturing phenotype from clinical text.
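The dictionary search and negation filtering described under METHODS can be illustrated with a minimal Python sketch. It is not the study's pipeline: the term lists, the negation cues, the 40-character context window, and the name `find_phenotypes` are all assumptions made for illustration, since the abstract does not publish the actual data dictionary or context rules. The bare abbreviation "RR" is deliberately left out of the dictionary here as one simple way to avoid the "respiratory rate" ambiguity the abstract mentions.

```python
import re

# Hypothetical data dictionary; the study's real keyword list is not given in the abstract.
PHENOTYPE_TERMS = {
    "RRMS": [r"relapsing[- ]remitting", r"\bRRMS\b"],
    "PPMS": [r"primary[- ]progressive", r"\bPPMS\b"],
    "SPMS": [r"secondary[- ]progressive", r"\bSPMS\b"],
    "PRMS": [r"progressive[- ]relapsing", r"\bPRMS\b"],
}

# Assumed negation cues: a cue counts only if it appears shortly before the
# mention with no sentence break (period or semicolon) in between.
NEGATION = re.compile(
    r"\b(?:not|no|denies|without|ruled out)\b[^.;]{0,30}$", re.IGNORECASE
)


def find_phenotypes(note: str) -> set[str]:
    """Return the phenotype labels mentioned, and not negated, in one note."""
    found = set()
    for label, patterns in PHENOTYPE_TERMS.items():
        for pattern in patterns:
            for match in re.finditer(pattern, note, re.IGNORECASE):
                # Look at up to 40 characters of text preceding the mention.
                preceding = note[max(0, match.start() - 40):match.start()]
                if NEGATION.search(preceding):
                    continue  # skip negated mentions such as "not relapsing-remitting"
                found.add(label)
    return found


if __name__ == "__main__":
    print(find_phenotypes("History of RRMS, now secondary progressive MS."))
    print(find_phenotypes("This is not relapsing-remitting disease; RR 16."))
```

Collecting the labels per patient across all notes would then give the one-, two-, three-, and four-phenotype groups reported under RESULTS; a production system would need a far richer context model than this window-based check.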
format Online
Article
Text
id pubmed-10398245
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Academy of Managed Care Pharmacy
record_format MEDLINE/PubMed
spelling pubmed-10398245
2023-08-04
J Manag Care Spec Pharm
Research
Academy of Managed Care Pharmacy 2016-12
Text en
© 2016, Academy of Managed Care Pharmacy. All rights reserved.
https://creativecommons.org/licenses/by/4.0/
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use and redistribution provided that the original author and source are credited.
title Determining Multiple Sclerosis Phenotype from Electronic Medical Records
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10398245/
https://www.ncbi.nlm.nih.gov/pubmed/27882837
http://dx.doi.org/10.18553/jmcp.2016.22.12.1377