Validation of a machine learning approach to estimate Clinical Disease Activity Index Scores for rheumatoid arthritis

Bibliographic Details
Main Authors: Spencer, Alison K., Bandaria, Jigar, Leavy, Michelle B., Gliklich, Benjamin, Su, Zhaohui, Curhan, Gary, Boussios, Costas
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2021
Subjects: Rheumatoid Arthritis
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8614150/
https://www.ncbi.nlm.nih.gov/pubmed/34819386
http://dx.doi.org/10.1136/rmdopen-2021-001781
author Spencer, Alison K.
Bandaria, Jigar
Leavy, Michelle B.
Gliklich, Benjamin
Su, Zhaohui
Curhan, Gary
Boussios, Costas
collection PubMed
description OBJECTIVE: Disease activity measures, such as the Clinical Disease Activity Index (CDAI), are important tools for informing treatment decisions and monitoring patient outcomes in rheumatoid arthritis (RA). Yet, documentation of CDAI scores in electronic medical records and other real-world data sources is inconsistent, making it challenging to use these data for research. The purpose of this study was to validate a machine learning model to estimate CDAI scores for patients with RA using clinical notes.

METHODS: A machine learning model was developed to estimate CDAI score values using clinical notes from a specific rheumatology visit. Data from the OM1 RA Registry were used to create a training cohort of 56 177 encounters and a separate validation cohort of 18 726 encounters, 11 985 of which passed a model-derived confidence filter; all included encounters had both a clinician-recorded CDAI score and a clinical note. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), positive predictive value (PPV) and negative predictive value (NPV), calculated using a binarised version of the outcome. The Spearman’s R and Pearson’s R values were also calculated.

RESULTS: The model had a PPV of 0.80, NPV of 0.84 and AUC of 0.88 when evaluating performance using the binarised version of the outcome. The model had a Spearman’s R value of 0.72 and a Pearson’s R value of 0.69 when evaluating performance using the continuous CDAI numeric scores.

CONCLUSION: A machine learning model estimates CDAI scores from clinical notes with good performance. Application of the model to real-world data sets may allow estimated CDAI scores to be used for research purposes.
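The evaluation described under METHODS (PPV, NPV and AUC on a binarised outcome, plus Spearman and Pearson correlations on the continuous scores) can be illustrated with a minimal sketch. This is not the authors' code. The six score pairs are invented toy values, and the cutoff of 10 used to binarise CDAI is an assumption made here for illustration, since the abstract does not state the threshold that was used.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def auc(labels, scores):
    """Rank-based (Mann-Whitney) AUC: chance a positive outranks a negative."""
    r = ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(ri for ri, lab in zip(r, labels) if lab)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def ppv_npv(labels, predictions):
    """Positive and negative predictive value from binary labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for lab, pred in pairs if lab and pred)
    fp = sum(1 for lab, pred in pairs if not lab and pred)
    tn = sum(1 for lab, pred in pairs if not lab and not pred)
    fn = sum(1 for lab, pred in pairs if lab and not pred)
    return tp / (tp + fp), tn / (tn + fn)

# Hypothetical cutoff for binarising CDAI (assumption, not from the abstract).
THRESHOLD = 10.0

# Toy data: clinician-recorded vs. model-estimated CDAI for six encounters.
recorded = [2.0, 8.0, 12.0, 25.0, 5.0, 30.0]
estimated = [3.0, 11.0, 9.0, 22.0, 4.0, 28.0]

labels = [int(score > THRESHOLD) for score in recorded]
predictions = [int(score > THRESHOLD) for score in estimated]

ppv, npv = ppv_npv(labels, predictions)
auc_value = auc(labels, estimated)
rho = spearman(recorded, estimated)
r = pearson(recorded, estimated)

print(f"PPV={ppv:.2f} NPV={npv:.2f} AUC={auc_value:.2f}")
print(f"Spearman={rho:.2f} Pearson={r:.2f}")
```

PPV and NPV depend on the chosen cutoff and on how common high disease activity is in the cohort, whereas AUC and the two correlations are computed from the continuous estimates and do not require a decision threshold on the estimated score.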
format Online
Article
Text
id pubmed-8614150
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-8614150 2021-12-10 RMD Open Rheumatoid Arthritis BMJ Publishing Group 2021-11-24 /pmc/articles/PMC8614150/ /pubmed/34819386 http://dx.doi.org/10.1136/rmdopen-2021-001781 Text en
© Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/
title Validation of a machine learning approach to estimate Clinical Disease Activity Index Scores for rheumatoid arthritis
topic Rheumatoid Arthritis
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8614150/
https://www.ncbi.nlm.nih.gov/pubmed/34819386
http://dx.doi.org/10.1136/rmdopen-2021-001781