
Prediction of HIV status based on socio-behavioural characteristics in East and Southern Africa

INTRODUCTION: High yield HIV testing strategies are critical to reach epidemic control in high prevalence and low-resource settings such as East and Southern Africa. In this study, we aimed to predict the HIV status of individuals living in Angola, Burundi, Ethiopia, Lesotho, Malawi, Mozambique, Nam...


Bibliographic Details
Main authors: Orel, Erol; Esra, Rachel; Estill, Janne; Thiabaud, Amaury; Marchand-Maillet, Stéphane; Merzouki, Aziza; Keiser, Olivia
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8893684/
https://www.ncbi.nlm.nih.gov/pubmed/35239697
http://dx.doi.org/10.1371/journal.pone.0264429
_version_ 1784662463805915136
author Orel, Erol
Esra, Rachel
Estill, Janne
Thiabaud, Amaury
Marchand-Maillet, Stéphane
Merzouki, Aziza
Keiser, Olivia
author_sort Orel, Erol
collection PubMed
description INTRODUCTION: High-yield HIV testing strategies are critical to reach epidemic control in high-prevalence, low-resource settings such as East and Southern Africa. In this study, we aimed to predict the HIV status of individuals living in Angola, Burundi, Ethiopia, Lesotho, Malawi, Mozambique, Namibia, Rwanda, Zambia and Zimbabwe with the highest precision and sensitivity for different policy targets and constraints, based on a minimal set of socio-behavioural characteristics.
METHODS: We analysed the most recent Demographic and Health Survey from these 10 countries to predict individuals’ HIV status using four different algorithms (penalized logistic regression, a generalized additive model, a support vector machine (SVM), and gradient boosting trees). The algorithms were trained and validated on 80% of the data, and tested on the remaining 20%. We compared the predictions based on the F1 score, the harmonic mean of sensitivity and positive predictive value (PPV), and we assessed the generalization of our models by testing them against an independent left-out country. The best performing algorithm was trained on a minimal subset of variables identified as the most predictive, and used to 1) identify 95% of people living with HIV (PLHIV) while maximising precision and 2) identify groups of individuals by adjusting the probability threshold of being HIV positive (90% in our scenario) for achieving specific testing strategies.
RESULTS: Overall, 55,151 males and 69,626 females were included in the analysis. The gradient boosting trees algorithm performed best in predicting HIV status, with a mean F1 score of 76.8% [95% confidence interval (CI) 76.0%-77.6%] for males (vs [CI 67.8%-70.6%] for SVM) and 78.8% [CI 78.2%-79.4%] for females (vs [CI 73.4%-75.8%] for SVM). Among the ten most predictive variables for each sex, nine were identical: longitude, latitude, and altitude of place of residence, current age, age of most recent partner, total lifetime number of sexual partners, years lived in current place of residence, condom use during last intercourse, and wealth index. Only age at first sex for males (ranked 10th) and Rohrer’s index for females (ranked 6th) differed between the sexes. Our large-scale scenario, which consisted of identifying 95% of all PLHIV, would have required testing 49.4% of males and 48.1% of females while achieving a precision of 15.4% for males and 22.7% for females. For the second scenario, only 4.6% of males and 6.0% of females would have had to be tested to find 55.7% of all males and 50.5% of all females living with HIV.
CONCLUSIONS: We trained a gradient boosting trees algorithm to find 95% of PLHIV with a precision twice as high as that of general population testing, using only a limited number of socio-behavioural characteristics. We also successfully identified people at high risk of infection who may be offered pre-exposure prophylaxis or voluntary medical male circumcision. These findings can inform the implementation of new high-yield HIV tests and help develop very precise strategies tailored to the constraints of low-resource settings.
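The evaluation logic summarised in the abstract (F1 as the harmonic mean of sensitivity and PPV, and a probability cut-off chosen to reach a sensitivity target such as 95%) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the synthetic scores and the 10% prevalence used below are assumptions made purely for illustration, and the scores stand in for whatever probabilities a trained classifier would output.

```python
import math
import random


def f1_score(sensitivity, ppv):
    """Harmonic mean of sensitivity (recall) and PPV (precision)."""
    if sensitivity + ppv == 0:
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


def threshold_for_sensitivity(scores, labels, target):
    """Largest score cut-off that still flags at least `target` of the positives."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("labels contain no positive cases")
    needed = max(1, math.ceil(target * len(positives)))  # positives to capture
    return positives[needed - 1]


def screening_summary(scores, labels, threshold):
    """Share of people flagged for testing, plus sensitivity, precision and F1."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    true_pos = sum(flagged)
    sensitivity = true_pos / sum(labels)
    precision = true_pos / len(flagged)
    return {
        "fraction_tested": len(flagged) / len(labels),
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1_score(sensitivity, precision),
    }


def implied_prevalence(precision, fraction_tested, sensitivity):
    """Prevalence implied by the identity precision = sens * prev / fraction_tested."""
    return precision * fraction_tested / sensitivity


# Synthetic data (hypothetical): 10% prevalence, positives score higher on average.
rng = random.Random(0)
labels = [1 if rng.random() < 0.10 else 0 for _ in range(5000)]
scores = [rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]

cutoff = threshold_for_sensitivity(scores, labels, target=0.95)
summary = screening_summary(scores, labels, cutoff)
print(summary)

# Reported male figures from the abstract: precision 15.4%, 49.4% tested, 95% found.
print(round(implied_prevalence(0.154, 0.494, 0.95), 3))
```

Taken at face value, the reported figures imply a prevalence in the analysed sample of roughly 8% for males (0.154 × 0.494 / 0.95) and roughly 11.5% for females (0.227 × 0.481 / 0.95), which is consistent with the stated precision being about twice that of untargeted testing. Lowering the sensitivity target raises the cut-off, so fewer people are flagged and precision generally increases, which is the trade-off behind the two scenarios.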
format Online
Article
Text
id pubmed-8893684
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8893684 2022-03-04 Prediction of HIV status based on socio-behavioural characteristics in East and Southern Africa Orel, Erol; Esra, Rachel; Estill, Janne; Thiabaud, Amaury; Marchand-Maillet, Stéphane; Merzouki, Aziza; Keiser, Olivia PLoS One Research Article
Public Library of Science 2022-03-03 /pmc/articles/PMC8893684/ /pubmed/35239697 http://dx.doi.org/10.1371/journal.pone.0264429 Text en © 2022 Orel et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Prediction of HIV status based on socio-behavioural characteristics in East and Southern Africa
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8893684/
https://www.ncbi.nlm.nih.gov/pubmed/35239697
http://dx.doi.org/10.1371/journal.pone.0264429