
Predicting a diagnosis of ankylosing spondylitis using primary care health records–A machine learning approach

Ankylosing spondylitis is the second most common cause of inflammatory arthritis. However, a successful diagnosis can take a decade to confirm from symptom onset (via x-rays). The aim of this study was to use machine learning methods to develop a profile of the characteristics of people who are like...

Bibliographic Details
Main authors: Kennedy, Jonathan; Kennedy, Natasha; Cooksey, Roxanne; Choy, Ernest; Siebert, Stefan; Rahman, Muhammad; Brophy, Sinead
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10065228/
https://www.ncbi.nlm.nih.gov/pubmed/37000839
http://dx.doi.org/10.1371/journal.pone.0279076
author Kennedy, Jonathan
Kennedy, Natasha
Cooksey, Roxanne
Choy, Ernest
Siebert, Stefan
Rahman, Muhammad
Brophy, Sinead
collection PubMed
description Ankylosing spondylitis (AS) is the second most common cause of inflammatory arthritis. However, a successful diagnosis can take a decade to confirm from symptom onset (via x-rays). The aim of this study was to use machine learning methods to develop a profile of the characteristics of people who are likely to be given a diagnosis of AS in future. The Secure Anonymised Information Linkage databank was used. Patients with ankylosing spondylitis were identified using their routine data and matched with controls who had no record of a diagnosis of ankylosing spondylitis or axial spondyloarthritis. Data was analysed separately for men and women. The model was developed using feature/variable selection and principal component analysis to develop decision trees. The decision tree with the highest average F value was selected and validated with a test dataset. The model for men indicated that lower back pain, uveitis, and NSAID use under age 20 are associated with AS development. The model for women showed an older age of symptom presentation than the model for men, together with back pain and use of multiple pain relief medications. The models showed good prediction (positive predictive value 70%-80%) in test data but in the general population where prevalence is very low (0.09% of the population in this dataset) the positive predictive value would be very low (0.33%-0.25%). Machine learning can be used to help profile and understand the characteristics of people who will develop AS, and it performs well in test datasets with artificially high prevalence. However, when applied to a general population with low prevalence rates, such as that in primary care, the positive predictive value for even the best model would be 1.4%. Multiple models may be needed to narrow down the population over time to improve the predictive value and therefore reduce the time to diagnosis of ankylosing spondylitis.
format Online
Article
Text
id pubmed-10065228
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10065228 2023-04-01 Predicting a diagnosis of ankylosing spondylitis using primary care health records–A machine learning approach Kennedy, Jonathan; Kennedy, Natasha; Cooksey, Roxanne; Choy, Ernest; Siebert, Stefan; Rahman, Muhammad; Brophy, Sinead PLoS One Research Article Ankylosing spondylitis (AS) is the second most common cause of inflammatory arthritis. However, a successful diagnosis can take a decade to confirm from symptom onset (via x-rays). The aim of this study was to use machine learning methods to develop a profile of the characteristics of people who are likely to be given a diagnosis of AS in future. The Secure Anonymised Information Linkage databank was used. Patients with ankylosing spondylitis were identified using their routine data and matched with controls who had no record of a diagnosis of ankylosing spondylitis or axial spondyloarthritis. Data was analysed separately for men and women. The model was developed using feature/variable selection and principal component analysis to develop decision trees. The decision tree with the highest average F value was selected and validated with a test dataset. The model for men indicated that lower back pain, uveitis, and NSAID use under age 20 are associated with AS development. The model for women showed an older age of symptom presentation than the model for men, together with back pain and use of multiple pain relief medications. The models showed good prediction (positive predictive value 70%-80%) in test data but in the general population where prevalence is very low (0.09% of the population in this dataset) the positive predictive value would be very low (0.33%-0.25%). Machine learning can be used to help profile and understand the characteristics of people who will develop AS, and it performs well in test datasets with artificially high prevalence. However, when applied to a general population with low prevalence rates, such as that in primary care, the positive predictive value for even the best model would be 1.4%. Multiple models may be needed to narrow down the population over time to improve the predictive value and therefore reduce the time to diagnosis of ankylosing spondylitis. Public Library of Science 2023-03-31 /pmc/articles/PMC10065228/ /pubmed/37000839 http://dx.doi.org/10.1371/journal.pone.0279076 Text en © 2023 Kennedy et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Predicting a diagnosis of ankylosing spondylitis using primary care health records–A machine learning approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10065228/
https://www.ncbi.nlm.nih.gov/pubmed/37000839
http://dx.doi.org/10.1371/journal.pone.0279076
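The abstract's contrast between a high positive predictive value (PPV) in test data and a very low one in the general population follows from Bayes' theorem. The sketch below illustrates that dependence only; it does not reproduce the study's models. The sensitivity and specificity values and the 50% test-set prevalence are hypothetical assumptions (the record does not report them); only the 0.09% population prevalence is taken from the abstract.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV via Bayes' theorem: P(condition | positive prediction)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical classifier performance (assumed, not reported in the record).
ASSUMED_SENSITIVITY = 0.80
ASSUMED_SPECIFICITY = 0.80

# Assumed prevalence in a balanced case-control test set.
ASSUMED_TEST_PREVALENCE = 0.50
# Prevalence in the general population, as stated in the abstract (0.09%).
POPULATION_PREVALENCE = 0.0009

ppv_test = positive_predictive_value(
    ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, ASSUMED_TEST_PREVALENCE
)
ppv_population = positive_predictive_value(
    ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, POPULATION_PREVALENCE
)

if __name__ == "__main__":
    print(f"PPV at assumed test-set prevalence: {ppv_test:.1%}")
    print(f"PPV at population prevalence:       {ppv_population:.2%}")
```

Under these assumed values the same classifier's PPV falls from a high figure in the balanced set to well under 1% at population prevalence, which is the qualitative pattern the abstract describes.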