Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis

BACKGROUND: Depression is a serious personal and public mental health problem. Self-reporting is the main method used to diagnose depression and to determine the severity of depression. However, it is not easy to discover patients with depression owing to feelings of shame in disclosing or discussin...

Bibliographic Details
Main Authors: Wang, Xiaofeng; Chen, Shuai; Li, Tao; Li, Wanting; Zhou, Yejie; Zheng, Jie; Chen, Qingcai; Yan, Jun; Tang, Buzhou
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7424493/
https://www.ncbi.nlm.nih.gov/pubmed/32723719
http://dx.doi.org/10.2196/17958
_version_ 1783570350383038464
author Wang, Xiaofeng
Chen, Shuai
Li, Tao
Li, Wanting
Zhou, Yejie
Zheng, Jie
Chen, Qingcai
Yan, Jun
Tang, Buzhou
author_facet Wang, Xiaofeng
Chen, Shuai
Li, Tao
Li, Wanting
Zhou, Yejie
Zheng, Jie
Chen, Qingcai
Yan, Jun
Tang, Buzhou
author_sort Wang, Xiaofeng
collection PubMed
description BACKGROUND: Depression is a serious personal and public mental health problem. Self-reporting is the main method used to diagnose depression and to determine the severity of depression. However, it is not easy to discover patients with depression owing to feelings of shame in disclosing or discussing their mental health conditions with others. Moreover, self-reporting is time-consuming, and usually leads to missing a certain number of cases. Therefore, automatic discovery of patients with depression from other sources such as social media has been attracting increasing attention. Social media, as one of the most important daily communication systems, connects large quantities of people, including individuals with depression, and provides a channel to discover patients with depression. In this study, we investigated deep-learning methods for depression risk prediction using data from Chinese microblogs, which have potential to discover more patients with depression and to trace their mental health conditions.
OBJECTIVE: The aim of this study was to explore the potential of state-of-the-art deep-learning methods on depression risk prediction from Chinese microblogs.
METHODS: Deep-learning methods with pretrained language representation models, including bidirectional encoder representations from transformers (BERT), robustly optimized BERT pretraining approach (RoBERTa), and generalized autoregressive pretraining for language understanding (XLNET), were investigated for depression risk prediction, and were compared with previous methods on a manually annotated benchmark dataset. Depression risk was assessed at four levels from 0 to 3, where 0, 1, 2, and 3 denote no inclination, and mild, moderate, and severe depression risk, respectively. The dataset was collected from the Chinese microblog Weibo. We also compared different deep-learning methods with pretrained language representation models in two settings: (1) publicly released pretrained language representation models, and (2) language representation models further pretrained on a large-scale unlabeled dataset collected from Weibo. Precision, recall, and F1 scores were used as performance evaluation measures.
RESULTS: Among the three deep-learning methods, BERT achieved the best performance with a microaveraged F1 score of 0.856. RoBERTa achieved the best performance with a macroaveraged F1 score of 0.424 on depression risk at levels 1, 2, and 3, which represents a new benchmark result on the dataset. The further pretrained language representation models demonstrated improvement over publicly released prediction models.
CONCLUSIONS: We applied deep-learning methods with pretrained language representation models to automatically predict depression risk using data from Chinese microblogs. The experimental results showed that the deep-learning methods performed better than previous methods, and have greater potential to discover patients with depression and to trace their mental health conditions.
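The abstract reports both a microaveraged and a macroaveraged F1 score. As a rough illustration of how the two can diverge on a four-level task dominated by level 0, here is a minimal Python sketch. The toy labels, the function names, and the choice of which levels enter each average are assumptions made for illustration only; this record does not describe the authors' evaluation script.

```python
from collections import Counter

# Risk levels as defined in the abstract: 0 = no inclination, 1 = mild,
# 2 = moderate, 3 = severe.
RISK_LEVELS = (0, 1, 2, 3)


def per_class_counts(gold, pred, labels):
    """Count true positives, false positives, and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g in labels:
                tp[g] += 1
        else:
            if p in labels:
                fp[p] += 1
            if g in labels:
                fn[g] += 1
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(gold, pred, labels):
    """Pool the counts over all labels, then compute a single F1."""
    tp, fp, fn = per_class_counts(gold, pred, labels)
    return f1(sum(tp[l] for l in labels),
              sum(fp[l] for l in labels),
              sum(fn[l] for l in labels))


def macro_f1(gold, pred, labels):
    """Compute F1 per label, then take the unweighted mean."""
    tp, fp, fn = per_class_counts(gold, pred, labels)
    return sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)


# Hypothetical toy data (not from the study): most posts are level 0.
toy_gold = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
toy_pred = [0, 0, 0, 0, 0, 1, 1, 0, 2, 2]

if __name__ == "__main__":
    print("micro F1, levels 0-3:", micro_f1(toy_gold, toy_pred, RISK_LEVELS))
    print("macro F1, levels 1-3:", macro_f1(toy_gold, toy_pred, (1, 2, 3)))
```

On imbalanced data like this, the pooled (micro) score is driven largely by the frequent level 0, while the macro score over levels 1 to 3 weights each rarer level equally. That is one plausible reason the two figures in the abstract differ so much.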
format Online
Article
Text
id pubmed-7424493
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7424493 2020-08-20 Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis Wang, Xiaofeng Chen, Shuai Li, Tao Li, Wanting Zhou, Yejie Zheng, Jie Chen, Qingcai Yan, Jun Tang, Buzhou JMIR Med Inform Original Paper JMIR Publications 2020-07-29 /pmc/articles/PMC7424493/ /pubmed/32723719 http://dx.doi.org/10.2196/17958 Text en ©Xiaofeng Wang, Shuai Chen, Tao Li, Wanting Li, Yejie Zhou, Jie Zheng, Qingcai Chen, Jun Yan, Buzhou Tang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 29.07.2020.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Wang, Xiaofeng
Chen, Shuai
Li, Tao
Li, Wanting
Zhou, Yejie
Zheng, Jie
Chen, Qingcai
Yan, Jun
Tang, Buzhou
Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis
title Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis
title_full Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis
title_fullStr Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis
title_full_unstemmed Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis
title_short Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis
title_sort depression risk prediction for chinese microblogs via deep-learning methods: content analysis
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7424493/
https://www.ncbi.nlm.nih.gov/pubmed/32723719
http://dx.doi.org/10.2196/17958
work_keys_str_mv AT wangxiaofeng depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT chenshuai depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT litao depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT liwanting depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT zhouyejie depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT zhengjie depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT chenqingcai depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT yanjun depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis
AT tangbuzhou depressionriskpredictionforchinesemicroblogsviadeeplearningmethodscontentanalysis