Identification of a potential fibromyalgia diagnosis using random forest modeling applied to electronic medical records
BACKGROUND: Diagnosis of fibromyalgia (FM), a chronic musculoskeletal condition characterized by widespread pain and a constellation of symptoms, remains challenging and is often delayed. METHODS: Random forest modeling of electronic medical records was used to identify variables that may facilitate...
Main Authors: | Emir, Birol, Masters, Elizabeth T, Mardekian, Jack, Clair, Andrew, Kuhn, Max, Silverman, Stuart L |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Dove Medical Press 2015 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4467741/ https://www.ncbi.nlm.nih.gov/pubmed/26089700 http://dx.doi.org/10.2147/JPR.S8256 |
_version_ | 1782376405491777536 |
---|---|
author | Emir, Birol Masters, Elizabeth T Mardekian, Jack Clair, Andrew Kuhn, Max Silverman, Stuart L |
author_sort | Emir, Birol |
collection | PubMed |
description | BACKGROUND: Diagnosis of fibromyalgia (FM), a chronic musculoskeletal condition characterized by widespread pain and a constellation of symptoms, remains challenging and is often delayed. METHODS: Random forest modeling of electronic medical records was used to identify variables that may facilitate earlier FM identification and diagnosis. Subjects aged ≥18 years with two or more listings of the International Classification of Diseases, Ninth Revision, (ICD-9) code for FM (ICD-9 729.1) ≥30 days apart during the 2012 calendar year were defined as cases among subjects associated with an integrated delivery network and who had one or more health care provider encounter in the Humedica database in calendar years 2011 and 2012. Controls were without the FM ICD-9 codes. Seventy-two demographic, clinical, and health care resource utilization variables were entered into a random forest model with downsampling to account for cohort imbalances (<1% subjects had FM). Importance of the top ten variables was ranked based on normalization to 100% for the variable with the largest loss in predicting performance by its omission from the model. Since random forest is a complex prediction method, a set of simple rules was derived to help understand what factors drive individual predictions. RESULTS: The ten variables identified by the model were: number of visits where laboratory/non-imaging diagnostic tests were ordered; number of outpatient visits excluding office visits; age; number of office visits; number of opioid prescriptions; number of medications prescribed; number of pain medications excluding opioids; number of medications administered/ordered; number of emergency room visits; and number of musculoskeletal conditions. A receiver operating characteristic curve confirmed the model’s predictive accuracy using an independent test set (area under the curve, 0.810). To enhance interpretability, nine rules were developed that could be used with good predictive probability of an FM diagnosis and to identify no-FM subjects. CONCLUSION: Random forest modeling may help to quantify the predictive probability of an FM diagnosis. Rules can be developed to simplify interpretability. Further validation of these models may facilitate earlier diagnosis and enhance management. |
format | Online Article Text |
id | pubmed-4467741 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Dove Medical Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4467741 2015-06-18 Identification of a potential fibromyalgia diagnosis using random forest modeling applied to electronic medical records Emir, Birol Masters, Elizabeth T Mardekian, Jack Clair, Andrew Kuhn, Max Silverman, Stuart L J Pain Res Original Research Dove Medical Press 2015-06-10 /pmc/articles/PMC4467741/ /pubmed/26089700 http://dx.doi.org/10.2147/JPR.S8256 Text en © 2015 Emir et al. This work is published by Dove Medical Press Limited, and licensed under Creative Commons Attribution – Non Commercial (unported, v3.0) License. The full terms of the License are available at http://creativecommons.org/licenses/by-nc/3.0/. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. |
title | Identification of a potential fibromyalgia diagnosis using random forest modeling applied to electronic medical records |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4467741/ https://www.ncbi.nlm.nih.gov/pubmed/26089700 http://dx.doi.org/10.2147/JPR.S8256 |
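
The abstract mentions two preprocessing and reporting steps that are easy to misread: downsampling to offset the cohort imbalance, and ranking variable importance by normalizing to 100% for the top variable. The following is a minimal Python sketch of both ideas, not the authors' implementation. The function names, the `fm` label key, and the example scores are hypothetical, and the raw importance values are assumed to come from whatever performance-loss measure the fitted model reports.

```python
import random


def downsample(records, label_key="fm", seed=0):
    """Randomly drop majority-class records until both classes are equal in size.

    `records` is a list of dicts; `label_key` (hypothetical name) holds a
    truthy value for cases and a falsy value for controls.
    """
    cases = [r for r in records if r[label_key]]
    controls = [r for r in records if not r[label_key]]
    # Order the two groups by size so the smaller one is kept whole.
    minority, majority = sorted((cases, controls), key=len)
    rng = random.Random(seed)
    return minority + rng.sample(majority, len(minority))


def normalize_importance(raw_scores):
    """Rescale importance scores so the largest one equals 100."""
    top = max(raw_scores.values())
    if top <= 0:
        raise ValueError("the largest importance score must be positive")
    return {name: 100.0 * score / top for name, score in raw_scores.items()}


if __name__ == "__main__":
    # Synthetic cohort: 3 cases among 100 subjects (illustrative only).
    cohort = [{"id": i, "fm": i < 3} for i in range(100)]
    balanced = downsample(cohort)
    print(len(balanced))  # 6: three cases plus three sampled controls

    # Placeholder raw scores, not values from the paper.
    print(normalize_importance({"lab_visits": 0.5, "age": 0.25, "er_visits": 0.125}))
```

In this sketch the balanced set keeps every case and samples an equal number of controls, which is one common way to downsample; the abstract does not specify the exact sampling scheme used.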