
Detection of primary Sjögren’s syndrome in primary care: developing a classification model with the use of routine healthcare data and machine learning

BACKGROUND: Primary Sjögren’s Syndrome (pSS) is a rare autoimmune disease that is difficult to diagnose due to a variety of clinical presentations, resulting in misdiagnosis and late referral to specialists. To improve early-stage disease recognition, this study aimed to develop an algorithm to iden...


Bibliographic Details
Main authors: Dros, Jesper T., Bos, Isabelle, Bennis, Frank C., Wiegersma, Sytske, Paget, John, Seghieri, Chiara, Cortés, Jaime Barrio, Verheij, Robert A.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9361661/
https://www.ncbi.nlm.nih.gov/pubmed/35945489
http://dx.doi.org/10.1186/s12875-022-01804-w
author Dros, Jesper T.
Bos, Isabelle
Bennis, Frank C.
Wiegersma, Sytske
Paget, John
Seghieri, Chiara
Cortés, Jaime Barrio
Verheij, Robert A.
collection PubMed
description BACKGROUND: Primary Sjögren’s Syndrome (pSS) is a rare autoimmune disease that is difficult to diagnose due to a variety of clinical presentations, resulting in misdiagnosis and late referral to specialists. To improve early-stage disease recognition, this study aimed to develop an algorithm to identify possible pSS patients in primary care. We built a machine learning algorithm which was based on combined healthcare data as a first step towards a clinical decision support system.
METHOD: Routine healthcare data, consisting of primary care electronic health records (EHRs) data and hospital claims data (HCD), were linked on patient level and consisted of 1411 pSS and 929,179 non-pSS patients. Logistic regression (LR) and random forest (RF) models were used to classify patients using age, gender, diseases and symptoms, prescriptions and GP visits.
RESULTS: The LR and RF models had an AUC of 0.82 and 0.84, respectively. Many actual pSS patients were found (sensitivity LR = 72.3%, RF = 70.1%), specificity was 74.0% (LR) and 77.9% (RF) and the negative predictive value was 99.9% for both models. However, most patients classified as pSS patients did not have a diagnosis of pSS in secondary care (positive predictive value LR = 0.4%, RF = 0.5%).
CONCLUSION: This is the first study to use machine learning to classify patients with pSS in primary care using GP EHR data. Our algorithm has the potential to support the early recognition of pSS in primary care and should be validated and optimized in clinical practice. To further enhance the algorithm in detecting pSS in primary care, we suggest it is improved by working with experienced clinicians.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12875-022-01804-w.
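The very low positive predictive values in the abstract follow largely from how rare pSS is in the cohort. The sketch below is not the authors' code: it rebuilds an expected confusion matrix from the reported sensitivity and specificity, assuming the evaluation data had the same case mix as the full linked cohort (1411 pSS and 929,179 non-pSS patients). The function and variable names are illustrative only.

```python
def predictive_values(sensitivity, specificity, n_cases, n_controls):
    """Return (PPV, NPV) from expected confusion-matrix counts.

    Assumes the given sensitivity and specificity apply to a population
    with n_cases true cases and n_controls non-cases.
    """
    true_pos = sensitivity * n_cases
    false_neg = (1 - sensitivity) * n_cases
    true_neg = specificity * n_controls
    false_pos = (1 - specificity) * n_controls
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Cohort sizes and model metrics as reported in the abstract.
N_PSS = 1411
N_NON_PSS = 929_179
REPORTED = {
    "LR": (0.723, 0.740),  # (sensitivity, specificity)
    "RF": (0.701, 0.779),
}

# Assumption: evaluation prevalence equals the full-cohort prevalence.
RESULTS = {
    name: predictive_values(sens, spec, N_PSS, N_NON_PSS)
    for name, (sens, spec) in REPORTED.items()
}

for name, (ppv, npv) in RESULTS.items():
    print(f"{name}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
# LR: PPV = 0.4%, NPV = 99.9%
# RF: PPV = 0.5%, NPV = 99.9%
```

Under that assumption the computed values agree with the reported PPV and NPV: with roughly 1.5 cases per 1000 patients, even a specificity near 75% yields far more false positives than true positives.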
format Online
Article
Text
id pubmed-9361661
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9361661 2022-08-10 BMC Prim Care Research BioMed Central 2022-08-09 /pmc/articles/PMC9361661/ /pubmed/35945489 http://dx.doi.org/10.1186/s12875-022-01804-w Text en
© The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Detection of primary Sjögren’s syndrome in primary care: developing a classification model with the use of routine healthcare data and machine learning
topic Research