Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study

BACKGROUND: The COVID-19 pandemic is impacting mental health, but it is not clear how people with different types of mental health problems were differentially impacted as the initial wave of cases hit. OBJECTIVE: The aim of this study is to leverage natural language processing (NLP) with the goal o...

Bibliographic Details
Main Authors: Low, Daniel M, Rumker, Laurie, Talkar, Tanya, Torous, John, Cecchi, Guillermo, Ghosh, Satrajit S
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7575341/
https://www.ncbi.nlm.nih.gov/pubmed/32936777
http://dx.doi.org/10.2196/22635
_version_ 1783597791057018880
author Low, Daniel M
Rumker, Laurie
Talkar, Tanya
Torous, John
Cecchi, Guillermo
Ghosh, Satrajit S
collection PubMed
description BACKGROUND: The COVID-19 pandemic is impacting mental health, but it is not clear how people with different types of mental health problems were differentially impacted as the initial wave of cases hit.

OBJECTIVE: The aim of this study is to leverage natural language processing (NLP) with the goal of characterizing changes in 15 of the world’s largest mental health support groups (eg, r/schizophrenia, r/SuicideWatch, r/Depression) found on the website Reddit, along with 11 non–mental health groups (eg, r/PersonalFinance, r/conspiracy) during the initial stage of the pandemic.

METHODS: We created and released the Reddit Mental Health Dataset including posts from 826,961 unique users from 2018 to 2020. Using regression, we analyzed trends from 90 text-derived features such as sentiment analysis, personal pronouns, and semantic categories. Using supervised machine learning, we classified posts into their respective support groups and interpreted important features to understand how different problems manifest in language. We applied unsupervised methods such as topic modeling and unsupervised clustering to uncover concerns throughout Reddit before and during the pandemic.

RESULTS: We found that the r/HealthAnxiety forum showed spikes in posts about COVID-19 early on in January, approximately 2 months before other support groups started posting about the pandemic. There were many features that significantly increased during COVID-19 for specific groups including the categories “economic stress,” “isolation,” and “home,” while others such as “motion” significantly decreased. We found that support groups related to attention-deficit/hyperactivity disorder, eating disorders, and anxiety showed the most negative semantic change during the pandemic out of all mental health groups. Health anxiety emerged as a general theme across Reddit through independent supervised and unsupervised machine learning analyses. For instance, we provide evidence that the concerns of a diverse set of individuals are converging in this unique moment of history; we discovered that the more users posted about COVID-19, the more linguistically similar (less distant) the mental health support groups became to r/HealthAnxiety (ρ=–0.96, P<.001). Using unsupervised clustering, we found the suicidality and loneliness clusters more than doubled in the number of posts during the pandemic. Specifically, the support groups for borderline personality disorder and posttraumatic stress disorder became significantly associated with the suicidality cluster. Furthermore, clusters surrounding self-harm and entertainment emerged.

CONCLUSIONS: By using a broad set of NLP techniques and analyzing a baseline of prepandemic posts, we uncovered patterns of how specific mental health problems manifest in language, identified at-risk users, and revealed the distribution of concerns across Reddit, which could help provide better resources to its millions of users. We then demonstrated that textual analysis is sensitive to uncover mental health complaints as they appear in real time, identifying vulnerable groups and alarming themes during COVID-19, and thus may have utility during the ongoing pandemic and other world-changing events such as elections and protests.
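The reported ρ=–0.96 is a rank correlation: as COVID-19 posting rises, the distance to r/HealthAnxiety falls. The following is a minimal sketch of how such a Spearman coefficient can be computed. It is not the authors' code; the abstract does not say how linguistic distance was measured, and the two series below (`covid_post_share`, `distance_to_health_anxiety`) are hypothetical values invented purely for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical weekly values, invented for illustration only (not study data).
covid_post_share = [0.00, 0.01, 0.03, 0.08, 0.15, 0.22]
distance_to_health_anxiety = [0.91, 0.90, 0.86, 0.80, 0.74, 0.70]

rho = spearman_rho(covid_post_share, distance_to_health_anxiety)
print(round(rho, 2))
```

Because the invented series are perfectly monotone in opposite directions, the sketch yields the extreme value of the coefficient; real data with some noise would give a value closer to the reported one. Working on ranks rather than raw values means only the ordering matters, so the coefficient captures any monotone relationship, not just a linear one.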
format Online
Article
Text
id pubmed-7575341
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7575341 2020-10-27 Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study Low, Daniel M Rumker, Laurie Talkar, Tanya Torous, John Cecchi, Guillermo Ghosh, Satrajit S J Med Internet Res Original Paper JMIR Publications 2020-10-12 /pmc/articles/PMC7575341/ /pubmed/32936777 http://dx.doi.org/10.2196/22635 Text en ©Daniel M Low, Laurie Rumker, Tanya Talkar, John Torous, Guillermo Cecchi, Satrajit S Ghosh. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 12.10.2020.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7575341/
https://www.ncbi.nlm.nih.gov/pubmed/32936777
http://dx.doi.org/10.2196/22635