Algorithmic prediction of HIV status using nation-wide electronic registry data

Bibliographic Details
Main authors: Ahlström, Magnus G., Ronit, Andreas, Omland, Lars Haukali, Vedel, Søren, Obel, Niels
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933258/
https://www.ncbi.nlm.nih.gov/pubmed/31891137
http://dx.doi.org/10.1016/j.eclinm.2019.10.016
author Ahlström, Magnus G.
Ronit, Andreas
Omland, Lars Haukali
Vedel, Søren
Obel, Niels
collection PubMed
description BACKGROUND: Late HIV diagnosis is detrimental both to the individual and to society. Strategies to improve early diagnosis of HIV must be a key health care priority. We examined whether nation-wide electronic registry data could be used to predict HIV status using machine learning algorithms. METHODS: We extracted individual-level data from Danish registries, trained prediction models with various machine learning algorithms, and validated these models. We calibrated the models to mimic different clinical scenarios and created confusion matrices based on the calibrated models. FINDINGS: A total of 4,384,178 individuals, including 4,350 with incident HIV, were included in the analyses. The full model, which included all variables (demographic variables and information on past medical history), had the highest area under the receiver operating characteristic curve, 88·4% (95% CI: 87·5% – 89·4%), in the validation dataset. Performance measures did not differ substantially with regard to which machine learning algorithm was used. When we calibrated the models to a specificity of 99·9% (pre-exposure prophylaxis (PrEP) scenario), we found a positive predictive value (PPV) of 8·3% in the full model. When we calibrated the models to a sensitivity of 90% (screening scenario), 384 individuals would have to be tested to find one undiagnosed person with HIV. INTERPRETATION: Machine learning algorithms can learn from electronic registry data and help to predict HIV status with a fairly high level of accuracy. Integration of prediction models into clinical software systems may complement existing strategies such as indicator condition-guided HIV testing and prove useful for identifying individuals suitable for PrEP.
FUNDING: The study was supported by funds from the Preben and Anne Simonsens Foundation, the Novo Nordisk Foundation, Rigshospitalet, Copenhagen University, the Danish AIDS Foundation, the Augustinus Foundation and the Danish Health Foundation.
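The PPV and "number needed to test" figures in the abstract are linked by standard arithmetic, which the following minimal Python sketch illustrates. It is not the authors' code. The function names `ppv` and `number_needed_to_test` are hypothetical, the prevalence is simply the cohort ratio 4,350 / 4,384,178 quoted in the abstract (the validation dataset may differ), and the 9% sensitivity in the last example is an illustrative assumption, not a reported result.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def number_needed_to_test(ppv_value: float) -> float:
    """Expected number of flagged individuals tested per true case found."""
    return 1.0 / ppv_value


# Cohort counts quoted in the abstract; treating their ratio as the
# prevalence is an assumption made here for illustration only.
n_total = 4_384_178
n_hiv = 4_350
prevalence = n_hiv / n_total  # roughly 0.1%

# Screening scenario: 384 tests per case corresponds to a PPV of 1/384.
ppv_screening = 1.0 / 384

# PrEP scenario: a PPV of 8.3% corresponds to about 12 tests per case.
nnt_prep = number_needed_to_test(0.083)

# Hypothetical sensitivity of 9% at the reported 99.9% specificity,
# to show how a very low prevalence limits PPV even when specificity is high.
ppv_example = ppv(sensitivity=0.09, specificity=0.999, prevalence=prevalence)

print(f"prevalence      = {prevalence:.4%}")
print(f"PPV (screening) = {ppv_screening:.2%}")
print(f"NNT (PrEP)      = {nnt_prep:.1f}")
print(f"PPV (example)   = {ppv_example:.1%}")
```

The sketch only restates the relationship PPV = 1 / (number needed to test); the sensitivity and specificity actually reached in each scenario should be taken from the paper's confusion matrices.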
format Online
Article
Text
id pubmed-6933258
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6933258 2019-12-30 EClinicalMedicine Research Paper Elsevier 2019-11-05 /pmc/articles/PMC6933258/ /pubmed/31891137 http://dx.doi.org/10.1016/j.eclinm.2019.10.016 Text en © 2019 Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Algorithmic prediction of HIV status using nation-wide electronic registry data
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6933258/
https://www.ncbi.nlm.nih.gov/pubmed/31891137
http://dx.doi.org/10.1016/j.eclinm.2019.10.016