Identification of a potential fibromyalgia diagnosis using random forest modeling applied to electronic medical records

BACKGROUND: Diagnosis of fibromyalgia (FM), a chronic musculoskeletal condition characterized by widespread pain and a constellation of symptoms, remains challenging and is often delayed. METHODS: Random forest modeling of electronic medical records was used to identify variables that may facilitate...

Bibliographic Details
Main authors: Emir, Birol; Masters, Elizabeth T; Mardekian, Jack; Clair, Andrew; Kuhn, Max; Silverman, Stuart L
Format: Online Article Text
Language: English
Published: Dove Medical Press 2015
Subjects: Original Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4467741/
https://www.ncbi.nlm.nih.gov/pubmed/26089700
http://dx.doi.org/10.2147/JPR.S8256
author Emir, Birol
Masters, Elizabeth T
Mardekian, Jack
Clair, Andrew
Kuhn, Max
Silverman, Stuart L
collection PubMed
description BACKGROUND: Diagnosis of fibromyalgia (FM), a chronic musculoskeletal condition characterized by widespread pain and a constellation of symptoms, remains challenging and is often delayed. METHODS: Random forest modeling of electronic medical records was used to identify variables that may facilitate earlier FM identification and diagnosis. Cases were subjects aged ≥18 years with two or more listings of the International Classification of Diseases, Ninth Revision (ICD-9) code for FM (ICD-9 729.1) ≥30 days apart during the 2012 calendar year. They were drawn from subjects who were associated with an integrated delivery network and who had one or more health care provider encounters in the Humedica database in calendar years 2011 and 2012. Controls were subjects without the FM ICD-9 codes. Seventy-two demographic, clinical, and health care resource utilization variables were entered into a random forest model with downsampling to account for cohort imbalances (<1% of subjects had FM). Importance of the top ten variables was ranked by normalizing to 100% the variable whose omission from the model caused the largest loss in predictive performance. Since random forest is a complex prediction method, a set of simple rules was derived to help understand what factors drive individual predictions. RESULTS: The ten variables identified by the model were: number of visits where laboratory/non-imaging diagnostic tests were ordered; number of outpatient visits excluding office visits; age; number of office visits; number of opioid prescriptions; number of medications prescribed; number of pain medications excluding opioids; number of medications administered/ordered; number of emergency room visits; and number of musculoskeletal conditions. A receiver operating characteristic curve confirmed the model’s predictive accuracy using an independent test set (area under the curve, 0.810). To enhance interpretability, nine rules were developed that could be used, with good predictive probability, to identify an FM diagnosis and to identify no-FM subjects. CONCLUSION: Random forest modeling may help to quantify the predictive probability of an FM diagnosis. Rules can be developed to simplify interpretability. Further validation of these models may facilitate earlier diagnosis and enhance management.
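The steps named in the METHODS section can be illustrated with a small sketch. This is not the authors' code: it assumes a hypothetical input layout (a list of `(icd9_code, date)` listings per subject), and every function name here is invented for illustration. It covers the case definition, downsampling of the majority class, scaling of importance scores so the top variable equals 100, and the area under the ROC curve. The random forest itself is not fitted; the raw importance scores are taken as given inputs.

```python
from datetime import date
import random

FM_CODE = "729.1"  # ICD-9 code for FM, as given in the abstract


def is_fm_case(listings, year=2012, min_gap_days=30):
    """True if the subject has two or more FM listings in `year` that are
    at least `min_gap_days` apart. `listings` is an assumed layout:
    an iterable of (icd9_code, date) pairs."""
    dates = sorted(d for code, d in listings if code == FM_CODE and d.year == year)
    # If the earliest and latest listings are far enough apart, a qualifying pair exists.
    return len(dates) >= 2 and (dates[-1] - dates[0]).days >= min_gap_days


def downsample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def normalize_importance(raw):
    """Scale raw importance scores so the most important variable equals 100."""
    top = max(raw.values())
    return {name: 100.0 * value / top for name, value in raw.items()}


def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen case
    scores higher than a randomly chosen control (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC above is quadratic in the number of subjects and is meant only to show the definition; a rank-based computation would be the usual choice for a cohort of realistic size.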
format Online
Article
Text
id pubmed-4467741
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Dove Medical Press
record_format MEDLINE/PubMed
spelling pubmed-4467741 2015-06-18 Identification of a potential fibromyalgia diagnosis using random forest modeling applied to electronic medical records Emir, Birol; Masters, Elizabeth T; Mardekian, Jack; Clair, Andrew; Kuhn, Max; Silverman, Stuart L J Pain Res Original Research Dove Medical Press 2015-06-10 /pmc/articles/PMC4467741/ /pubmed/26089700 http://dx.doi.org/10.2147/JPR.S8256 Text en © 2015 Emir et al. This work is published by Dove Medical Press Limited, and licensed under Creative Commons Attribution – Non Commercial (unported, v3.0) License. The full terms of the License are available at http://creativecommons.org/licenses/by-nc/3.0/. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed.
title Identification of a potential fibromyalgia diagnosis using random forest modeling applied to electronic medical records
topic Original Research