Validation of a machine learning approach to estimate Systemic Lupus Erythematosus Disease Activity Index score categories and application in a real-world dataset
OBJECTIVE: Use of the Systemic Lupus Erythematosus Disease Activity Index (SLEDAI) in routine clinical practice is inconsistent, and availability of clinician-recorded SLEDAI scores in real-world datasets is limited. This study aimed to validate a machine learning model to estimate SLEDAI score cate...
Main authors: | Alves, Pedro; Bandaria, Jigar; Leavy, Michelle B; Gliklich, Benjamin; Boussios, Costas; Su, Zhaohui; Curhan, Gary |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2021 |
Subjects: | Lupus |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8141448/ https://www.ncbi.nlm.nih.gov/pubmed/34016712 http://dx.doi.org/10.1136/rmdopen-2021-001586 |
_version_ | 1783696365788856320 |
---|---|
author | Alves, Pedro; Bandaria, Jigar; Leavy, Michelle B; Gliklich, Benjamin; Boussios, Costas; Su, Zhaohui; Curhan, Gary |
author_facet | Alves, Pedro; Bandaria, Jigar; Leavy, Michelle B; Gliklich, Benjamin; Boussios, Costas; Su, Zhaohui; Curhan, Gary |
author_sort | Alves, Pedro |
collection | PubMed |
description | OBJECTIVE: Use of the Systemic Lupus Erythematosus Disease Activity Index (SLEDAI) in routine clinical practice is inconsistent, and availability of clinician-recorded SLEDAI scores in real-world datasets is limited. This study aimed to validate a machine learning model to estimate SLEDAI score categories using clinical notes and to apply the model to a large, real-world dataset to generate estimated score categories for use in future research studies. METHODS: A machine learning model was developed to estimate an individual patient’s SLEDAI score category (no activity, mild activity, moderate activity or high/very high activity) for a specific encounter date using clinical notes. A training cohort of 3504 encounters and a separate validation cohort of 1576 encounters were created from the OM1 SLE Registry. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calculated using a binarised version of the outcome that sets the positive class to be those records with clinician-recorded SLEDAI scores >5 and the negative class to be records with scores ≤5. Model performance was evaluated by categorising the scores into the four disease activity categories and by calculating the Spearman’s R value and Pearson’s R value. RESULTS: The AUC for the two categories was 0.93 for the development cohort and 0.91 for the validation cohort. The model had a Spearman’s R value of 0.7 and a Pearson’s R value of 0.7 when calculated using the four disease activity categories. CONCLUSION: The model performs well when estimating SLEDAI score categories using unstructured clinical notes. |
format | Online Article Text |
id | pubmed-8141448 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-8141448 2021-06-07 RMD Open Lupus BMJ Publishing Group 2021-05-20 /pmc/articles/PMC8141448/ /pubmed/34016712 http://dx.doi.org/10.1136/rmdopen-2021-001586 Text en © Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/ |
title | Validation of a machine learning approach to estimate Systemic Lupus Erythematosus Disease Activity Index score categories and application in a real-world dataset |
topic | Lupus |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8141448/ https://www.ncbi.nlm.nih.gov/pubmed/34016712 http://dx.doi.org/10.1136/rmdopen-2021-001586 |
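
The evaluation described in the abstract (an AUC on an outcome binarised at SLEDAI >5, plus Spearman and Pearson correlations on the four activity categories) can be illustrated with a minimal Python sketch. It is not the authors' code. The category cut-offs (0, 1 to 5, 6 to 10, 11 and above) are commonly cited SLEDAI bands and are an assumption, since this record does not state them. All function names and data values below are hypothetical and invented for illustration only.

```python
from math import sqrt
from statistics import mean


def sledai_category(score):
    """Map a SLEDAI score to an ordinal category (cut-offs are an assumption)."""
    if score == 0:
        return 0  # no activity
    if score <= 5:
        return 1  # mild activity
    if score <= 10:
        return 2  # moderate activity
    return 3  # high/very high activity


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def binary_auc(labels, scores):
    """AUC via the Mann-Whitney rank-sum identity (labels are 0 or 1)."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, lab in zip(ranks, labels) if lab)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Invented example data, NOT taken from the study.
recorded_scores = [0, 2, 4, 6, 8, 12, 0, 5, 10, 20]   # clinician-recorded SLEDAI
model_scores = [0.05, 0.20, 0.40, 0.70, 0.65, 0.95, 0.10, 0.68, 0.80, 0.90]
estimated_categories = [0, 1, 1, 2, 2, 3, 0, 2, 2, 3]  # hypothetical model output

# Positive class: recorded score > 5; negative class: recorded score <= 5.
labels = [1 if s > 5 else 0 for s in recorded_scores]
recorded_categories = [sledai_category(s) for s in recorded_scores]

auc = binary_auc(labels, model_scores)
rho = spearman_r(recorded_categories, estimated_categories)
r = pearson_r(recorded_categories, estimated_categories)
print(f"AUC={auc:.2f} Spearman={rho:.2f} Pearson={r:.2f}")
```

In practice the same quantities are usually computed with library routines such as `sklearn.metrics.roc_auc_score`, `scipy.stats.spearmanr` and `scipy.stats.pearsonr`; the standard-library version is used here only to keep the sketch self-contained.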