
Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches

Depression is one of the most common mental illnesses but remains underdiagnosed. Suicide, as a core symptom of depression, urgently needs to be monitored at an early stage, i.e., the suicidal ideation (SI) stage. Depression and subsequent suicidal ideation should be supervised on social media. In t...


Bibliographic Details
Main Authors: Pan, Wei, Wang, Xianbin, Zhou, Wenwei, Hang, Bowen, Guo, Liwen
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9915029/
https://www.ncbi.nlm.nih.gov/pubmed/36768053
http://dx.doi.org/10.3390/ijerph20032688
_version_ 1784885806038515712
author Pan, Wei
Wang, Xianbin
Zhou, Wenwei
Hang, Bowen
Guo, Liwen
author_facet Pan, Wei
Wang, Xianbin
Zhou, Wenwei
Hang, Bowen
Guo, Liwen
author_sort Pan, Wei
collection PubMed
description Depression is one of the most common mental illnesses but remains underdiagnosed. Suicide, as a core symptom of depression, urgently needs to be monitored at an early stage, i.e., the suicidal ideation (SI) stage. Depression and subsequent suicidal ideation should be supervised on social media. In this research, we investigated depression and concomitant suicidal ideation by identifying individuals’ linguistic characteristics through machine learning approaches. On Weibo, we sampled 487,251 posts from 3196 users from the depression super topic community (DSTC) as the depression group and 357,939 posts from 5167 active users on Weibo as the control group. The results of the logistic regression model showed that the SCLIWC (simplified Chinese version of LIWC) features such as affection, positive emotion, negative emotion, sadness, health, and death significantly predicted depression (Nagelkerke’s R² = 0.64). For model performance: F-measure = 0.78, area under the curve (AUC) = 0.82. The independent-samples t-test showed that SI was significantly different between the depression (0.28 ± 0.5) and control groups (−0.29 ± 0.72) (t = 24.71, p < 0.001). The results of the linear regression model showed that the SCLIWC features, such as social, family, affection, positive emotion, negative emotion, sadness, health, work, achieve, and death, significantly predicted suicidal ideation. The adjusted R² was 0.42. For model performance, the correlation between the actual SI and predicted SI on the test set was significant (r = 0.65, p < 0.001). The topic modeling results were in accordance with the machine learning results. This study systematically investigated depression and subsequent SI-related linguistic characteristics based on a large-scale Weibo dataset.
The findings suggest that analyzing the linguistic characteristics on online depression communities serves as an efficient approach to identify depression and subsequent suicidal ideation, assisting further prevention and intervention.
format Online
Article
Text
id pubmed-9915029
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9915029 2023-02-11 Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches Pan, Wei Wang, Xianbin Zhou, Wenwei Hang, Bowen Guo, Liwen Int J Environ Res Public Health Article Depression is one of the most common mental illnesses but remains underdiagnosed. Suicide, as a core symptom of depression, urgently needs to be monitored at an early stage, i.e., the suicidal ideation (SI) stage. Depression and subsequent suicidal ideation should be supervised on social media. In this research, we investigated depression and concomitant suicidal ideation by identifying individuals’ linguistic characteristics through machine learning approaches. On Weibo, we sampled 487,251 posts from 3196 users from the depression super topic community (DSTC) as the depression group and 357,939 posts from 5167 active users on Weibo as the control group. The results of the logistic regression model showed that the SCLIWC (simplified Chinese version of LIWC) features such as affection, positive emotion, negative emotion, sadness, health, and death significantly predicted depression (Nagelkerke’s R² = 0.64). For model performance: F-measure = 0.78, area under the curve (AUC) = 0.82. The independent-samples t-test showed that SI was significantly different between the depression (0.28 ± 0.5) and control groups (−0.29 ± 0.72) (t = 24.71, p < 0.001). The results of the linear regression model showed that the SCLIWC features, such as social, family, affection, positive emotion, negative emotion, sadness, health, work, achieve, and death, significantly predicted suicidal ideation. The adjusted R² was 0.42. For model performance, the correlation between the actual SI and predicted SI on the test set was significant (r = 0.65, p < 0.001). The topic modeling results were in accordance with the machine learning results.
This study systematically investigated depression and subsequent SI-related linguistic characteristics based on a large-scale Weibo dataset. The findings suggest that analyzing the linguistic characteristics on online depression communities serves as an efficient approach to identify depression and subsequent suicidal ideation, assisting further prevention and intervention. MDPI 2023-02-02 /pmc/articles/PMC9915029/ /pubmed/36768053 http://dx.doi.org/10.3390/ijerph20032688 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Pan, Wei
Wang, Xianbin
Zhou, Wenwei
Hang, Bowen
Guo, Liwen
Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches
title Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches
title_full Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches
title_fullStr Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches
title_full_unstemmed Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches
title_short Linguistic Analysis for Identifying Depression and Subsequent Suicidal Ideation on Weibo: Machine Learning Approaches
title_sort linguistic analysis for identifying depression and subsequent suicidal ideation on weibo: machine learning approaches
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9915029/
https://www.ncbi.nlm.nih.gov/pubmed/36768053
http://dx.doi.org/10.3390/ijerph20032688
work_keys_str_mv AT panwei linguisticanalysisforidentifyingdepressionandsubsequentsuicidalideationonweibomachinelearningapproaches
AT wangxianbin linguisticanalysisforidentifyingdepressionandsubsequentsuicidalideationonweibomachinelearningapproaches
AT zhouwenwei linguisticanalysisforidentifyingdepressionandsubsequentsuicidalideationonweibomachinelearningapproaches
AT hangbowen linguisticanalysisforidentifyingdepressionandsubsequentsuicidalideationonweibomachinelearningapproaches
AT guoliwen linguisticanalysisforidentifyingdepressionandsubsequentsuicidalideationonweibomachinelearningapproaches