
Predicting Risky Sexual Behavior Among College Students Through Machine Learning Approaches: Cross-sectional Analysis of Individual Data From 1264 Universities in 31 Provinces in China


Bibliographic Details
Main Authors: Li, Xuan; Zhang, Hanxiyue; Zhao, Shuangyu; Tang, Kun
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9909517/
https://www.ncbi.nlm.nih.gov/pubmed/36696166
http://dx.doi.org/10.2196/41162
collection PubMed
description BACKGROUND: Risky sexual behavior (RSB), the most direct risk factor for sexually transmitted infections (STIs), is common among college students. Thus, identifying relevant risk factors and predicting RSB are important for intervening in and preventing RSB among college students.
OBJECTIVE: We aim to establish a predictive model for RSB among college students to facilitate timely intervention and the prevention of RSB to help limit STI contraction.
METHODS: We included a total of 8794 heterosexual Chinese students who self-reported engaging in sexual intercourse from November 2019 to February 2020. We identified RSB among those students and classified it along 4 dimensions: whether contraception was used, whether the contraceptive method was safe, whether students engaged in casual sex or sex with multiple partners, and integrated RSB (which combined the first 3 dimensions). Overall, 126 predictors were included in this study, including demographic characteristics, daily habits, physical and mental health, relationship status, sexual knowledge, sexual education, sexual attitude, and previous sexual experience. For each type of RSB, we compared 8 machine learning (ML) models: multiple logistic regression (MLR), naive Bayes (BYS), linear discriminant analysis (LDA), random forest (RF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), deep learning (DL), and the ensemble model. The optimal model for both RSB prediction and risk factor identification was selected based on a set of validation indicators. An MLR model was then applied to investigate the association between RSB and the risk factors identified through ML methods.
RESULTS: In total, 5328 (60.59%) students were found to have previously engaged in RSB. Among all participants, 3682 (41.87%) did not use contraception every time they had sexual intercourse, 3602 (40.96%) had previously used an ineffective or unsafe contraceptive method, and 1157 (13.16%) had engaged in casual sex or sex with multiple partners. XGBoost achieved the optimal predictive performance on all 4 types of RSB, with the area under the receiver operating characteristic curve (AUROC) reaching 0.78, 0.72, 0.94, and 0.80 for contraceptive use, safe contraceptive method use, engagement in casual sex or sex with multiple partners, and integrated RSB, respectively. By ensuring the stability of various validation indicators, the 12 most predictive variables were then selected using XGBoost, including the participants’ relationship status, sexual knowledge, sexual attitude, and previous sexual experience. Through MLR, RSB was found to be significantly associated with less sexual knowledge, more liberal sexual attitudes, single relationship status, and increased sexual experience.
CONCLUSIONS: RSB is prevalent among college students. The XGBoost model is an effective approach to predict RSB and identify corresponding risk factors. This study presented an opportunity to promote sexual and reproductive health through ML models, which can support targeted interventions aimed at different subgroups and the precise surveillance and prevention of RSB among college students through risk probability prediction.
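The abstract ranks candidate models by AUROC. As a rough illustration of that comparison step only, the sketch below computes AUROC from scratch using the pairwise (Mann-Whitney) formulation and picks the higher-scoring model. The labels, scores, and the names `model_a` and `model_b` are hypothetical toy values, not data or models from the study, and the record does not state how the authors computed the metric.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = outcome present, 0 = outcome absent.
labels = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predicted probabilities from two candidate models.
model_scores = {
    "model_a": [0.9, 0.2, 0.8, 0.4, 0.5, 0.1, 0.7, 0.3],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.6, 0.2, 0.3, 0.8],
}

# Score every candidate and keep the one with the highest AUROC.
results = {name: auroc(labels, s) for name, s in model_scores.items()}
best = max(results, key=results.get)
print(results, best)
```

The quadratic pairwise loop is chosen for readability; for a sample of several thousand respondents a sort-based rank computation or a library routine would be the more practical choice.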
format Online
Article
Text
id pubmed-9909517
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-9909517 2023-02-10
JMIR Public Health Surveill
Original Paper
JMIR Publications 2023-01-25
/pmc/articles/PMC9909517/ /pubmed/36696166 http://dx.doi.org/10.2196/41162
Text en
©Xuan Li, Hanxiyue Zhang, Shuangyu Zhao, Kun Tang. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 25.01.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included.
topic Original Paper