Models of Gender Dysphoria Using Social Media Data for Use in Technology-Delivered Interventions: Machine Learning and Natural Language Processing Validation Study

BACKGROUND: The optimal treatment for gender dysphoria is medical intervention, but many transgender and nonbinary people face significant treatment barriers when seeking help for gender dysphoria. When untreated, gender dysphoria is associated with depression, anxiety, suicidality, and substance mi...

Bibliographic Details
Main authors: Cascalheira, Cory J, Flinn, Ryan E, Zhao, Yuxuan, Klooster, Dannie, Laprade, Danica, Hamdi, Shah Muhammad, Scheer, Jillian R, Gonzalez, Alejandra, Lund, Emily M, Gomez, Ivan N, Saha, Koustuv, De Choudhury, Munmun
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337393/
https://www.ncbi.nlm.nih.gov/pubmed/37327053
http://dx.doi.org/10.2196/47256
author Cascalheira, Cory J
Flinn, Ryan E
Zhao, Yuxuan
Klooster, Dannie
Laprade, Danica
Hamdi, Shah Muhammad
Scheer, Jillian R
Gonzalez, Alejandra
Lund, Emily M
Gomez, Ivan N
Saha, Koustuv
De Choudhury, Munmun
collection PubMed
description BACKGROUND: The optimal treatment for gender dysphoria is medical intervention, but many transgender and nonbinary people face significant treatment barriers when seeking help for gender dysphoria. When untreated, gender dysphoria is associated with depression, anxiety, suicidality, and substance misuse. Technology-delivered interventions for transgender and nonbinary people can be used discreetly, safely, and flexibly, thereby reducing treatment barriers and increasing access to psychological interventions to manage distress that accompanies gender dysphoria. Technology-delivered interventions are beginning to incorporate machine learning (ML) and natural language processing (NLP) to automate intervention components and tailor intervention content. A critical step in using ML and NLP in technology-delivered interventions is demonstrating how accurately these methods model clinical constructs.
OBJECTIVE: This study aimed to determine the preliminary effectiveness of modeling gender dysphoria with ML and NLP, using transgender and nonbinary people’s social media data.
METHODS: Overall, 6 ML models and 949 NLP-generated independent variables were used to model gender dysphoria from the text data of 1573 Reddit (Reddit Inc) posts created on transgender- and nonbinary-specific web-based forums. After developing a codebook grounded in clinical science, a research team of clinicians and students experienced in working with transgender and nonbinary clients used qualitative content analysis to determine whether gender dysphoria was present in each Reddit post (ie, the dependent variable). NLP (eg, n-grams, Linguistic Inquiry and Word Count, word embedding, sentiment, and transfer learning) was used to transform the linguistic content of each post into predictors for ML algorithms. A k-fold cross-validation was performed. Hyperparameters were tuned with random search. Feature selection was performed to demonstrate the relative importance of each NLP-generated independent variable in predicting gender dysphoria. Misclassified posts were analyzed to improve future modeling of gender dysphoria.
RESULTS: Results indicated that a supervised ML algorithm (ie, optimized extreme gradient boosting [XGBoost]) modeled gender dysphoria with a high degree of accuracy (0.84), precision (0.83), and speed (1.23 seconds). Of the NLP-generated independent variables, Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) clinical keywords (eg, dysphoria and disorder) were most predictive of gender dysphoria. Misclassifications of gender dysphoria were common in posts that expressed uncertainty, featured a stressful experience unrelated to gender dysphoria, were incorrectly coded, expressed insufficient linguistic markers of gender dysphoria, described past experiences of gender dysphoria, showed evidence of identity exploration, expressed aspects of human sexuality unrelated to gender dysphoria, described socially based gender dysphoria, expressed strong affective or cognitive reactions unrelated to gender dysphoria, or discussed body image.
CONCLUSIONS: Findings suggest that ML- and NLP-based models of gender dysphoria have significant potential to be integrated into technology-delivered interventions. The results contribute to the growing evidence on the importance of incorporating ML and NLP designs in clinical science, especially when studying marginalized populations.
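The abstract names three building blocks that are easy to illustrate in isolation: n-gram features, k-fold cross-validation, and the accuracy and precision metrics. The following is a minimal sketch in plain Python, not the authors' pipeline. The function names (`word_ngrams`, `k_fold_indices`, `accuracy`, `precision`) are hypothetical, the tokenizer is a simple whitespace split, and the toy labels are invented for illustration and are not data from the study. XGBoost and the random hyperparameter search are not shown.

```python
from collections import Counter
from typing import Iterator, List, Sequence, Tuple


def word_ngrams(text: str, n: int) -> Counter:
    """Count word n-grams in a lower-cased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the coded label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positives divided by all predicted positives (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    # Invented labels: 1 = dysphoria coded as present, 0 = absent.
    y_true = [1, 0, 1, 1, 0, 1]
    y_pred = [1, 0, 0, 1, 1, 1]
    print(word_ngrams("Gender dysphoria gender dysphoria", 2))
    print(list(k_fold_indices(5, 2)))
    print(accuracy(y_true, y_pred), precision(y_true, y_pred))
```

In a real workflow each training fold would be used to fit a classifier and each held-out fold to score it, and the per-fold metrics would then be averaged. The value of k is left as a parameter here because the abstract does not state it.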
format Online
Article
Text
id pubmed-10337393
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10337393 2023-07-13 JMIR Form Res Original Paper JMIR Publications 2023-06-16 /pmc/articles/PMC10337393/ /pubmed/37327053 http://dx.doi.org/10.2196/47256 Text en
©Cory J Cascalheira, Ryan E Flinn, Yuxuan Zhao, Dannie Klooster, Danica Laprade, Shah Muhammad Hamdi, Jillian R Scheer, Alejandra Gonzalez, Emily M Lund, Ivan N Gomez, Koustuv Saha, Munmun De Choudhury. Originally published in JMIR Formative Research (https://formative.jmir.org), 16.06.2023.
https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Formative Research, is properly cited. The complete bibliographic information, a link to the original publication on https://formative.jmir.org, as well as this copyright and license information must be included.
title Models of Gender Dysphoria Using Social Media Data for Use in Technology-Delivered Interventions: Machine Learning and Natural Language Processing Validation Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337393/
https://www.ncbi.nlm.nih.gov/pubmed/37327053
http://dx.doi.org/10.2196/47256