Detecting Suicidal Ideation on Forums: Proof-of-Concept Study
BACKGROUND: In 2016, 44,965 people in the United States died by suicide. It is common to see people with suicidal ideation seek help or leave suicide notes on social media before attempting suicide. Many prefer to express their feelings with longer passages on forums such as Reddit and blogs. Becaus...
Main Authors: | Aladağ, Ahmet Emre; Muderrisoglu, Serra; Akbas, Naz Berfu; Zahmacioglu, Oguzhan; Bingol, Haluk O |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6035349/ https://www.ncbi.nlm.nih.gov/pubmed/29929945 http://dx.doi.org/10.2196/jmir.9840 |
_version_ | 1783338034729582592 |
---|---|
author | Aladağ, Ahmet Emre Muderrisoglu, Serra Akbas, Naz Berfu Zahmacioglu, Oguzhan Bingol, Haluk O |
author_facet | Aladağ, Ahmet Emre Muderrisoglu, Serra Akbas, Naz Berfu Zahmacioglu, Oguzhan Bingol, Haluk O |
author_sort | Aladağ, Ahmet Emre |
collection | PubMed |
description | BACKGROUND: In 2016, 44,965 people in the United States died by suicide. It is common to see people with suicidal ideation seek help or leave suicide notes on social media before attempting suicide. Many prefer to express their feelings with longer passages on forums such as Reddit and blogs. Because these expressive posts follow regular language patterns, potential suicide attempts can be prevented by detecting suicidal posts as they are written. OBJECTIVE: This study aims to build a classifier that differentiates suicidal and nonsuicidal forum posts via text mining methods applied on post titles and bodies. METHODS: A total of 508,398 Reddit posts longer than 100 characters and posted between 2008 and 2016 on SuicideWatch, Depression, Anxiety, and ShowerThoughts subreddits were downloaded from the publicly available Reddit dataset. Of these, 10,785 posts were randomly selected and 785 were manually annotated as suicidal or nonsuicidal. Features were extracted using term frequency-inverse document frequency, linguistic inquiry and word count, and sentiment analysis on post titles and bodies. Logistic regression, random forest, and support vector machine (SVM) classification algorithms were applied on resulting corpus and prediction performance is evaluated. RESULTS: The logistic regression and SVM classifiers correctly identified suicidality of posts with 80% to 92% accuracy and F1 score, respectively, depending on different data compositions closely followed by random forest, compared to baseline ZeroR algorithm achieving 50% accuracy and 66% F1 score. CONCLUSIONS: This study demonstrated that it is possible to detect people with suicidal ideation on online forums with high accuracy. The logistic regression classifier in this study can potentially be embedded on blogs and forums to make the decision to offer real-time online counseling in case a suicidal post is being written. |
format | Online Article Text |
id | pubmed-6035349 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-60353492018-07-12 Detecting Suicidal Ideation on Forums: Proof-of-Concept Study Aladağ, Ahmet Emre Muderrisoglu, Serra Akbas, Naz Berfu Zahmacioglu, Oguzhan Bingol, Haluk O J Med Internet Res Original Paper BACKGROUND: In 2016, 44,965 people in the United States died by suicide. It is common to see people with suicidal ideation seek help or leave suicide notes on social media before attempting suicide. Many prefer to express their feelings with longer passages on forums such as Reddit and blogs. Because these expressive posts follow regular language patterns, potential suicide attempts can be prevented by detecting suicidal posts as they are written. OBJECTIVE: This study aims to build a classifier that differentiates suicidal and nonsuicidal forum posts via text mining methods applied on post titles and bodies. METHODS: A total of 508,398 Reddit posts longer than 100 characters and posted between 2008 and 2016 on SuicideWatch, Depression, Anxiety, and ShowerThoughts subreddits were downloaded from the publicly available Reddit dataset. Of these, 10,785 posts were randomly selected and 785 were manually annotated as suicidal or nonsuicidal. Features were extracted using term frequency-inverse document frequency, linguistic inquiry and word count, and sentiment analysis on post titles and bodies. Logistic regression, random forest, and support vector machine (SVM) classification algorithms were applied on resulting corpus and prediction performance is evaluated. RESULTS: The logistic regression and SVM classifiers correctly identified suicidality of posts with 80% to 92% accuracy and F1 score, respectively, depending on different data compositions closely followed by random forest, compared to baseline ZeroR algorithm achieving 50% accuracy and 66% F1 score. CONCLUSIONS: This study demonstrated that it is possible to detect people with suicidal ideation on online forums with high accuracy. 
The logistic regression classifier in this study can potentially be embedded on blogs and forums to make the decision to offer real-time online counseling in case a suicidal post is being written. JMIR Publications 2018-06-21 /pmc/articles/PMC6035349/ /pubmed/29929945 http://dx.doi.org/10.2196/jmir.9840 Text en ©Ahmet Emre Aladağ, Serra Muderrisoglu, Naz Berfu Akbas, Oguzhan Zahmacioglu, Haluk O. Bingol. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 21.06.2018. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Aladağ, Ahmet Emre Muderrisoglu, Serra Akbas, Naz Berfu Zahmacioglu, Oguzhan Bingol, Haluk O Detecting Suicidal Ideation on Forums: Proof-of-Concept Study |
title | Detecting Suicidal Ideation on Forums: Proof-of-Concept Study |
title_full | Detecting Suicidal Ideation on Forums: Proof-of-Concept Study |
title_fullStr | Detecting Suicidal Ideation on Forums: Proof-of-Concept Study |
title_full_unstemmed | Detecting Suicidal Ideation on Forums: Proof-of-Concept Study |
title_short | Detecting Suicidal Ideation on Forums: Proof-of-Concept Study |
title_sort | detecting suicidal ideation on forums: proof-of-concept study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6035349/ https://www.ncbi.nlm.nih.gov/pubmed/29929945 http://dx.doi.org/10.2196/jmir.9840 |
work_keys_str_mv | AT aladagahmetemre detectingsuicidalideationonforumsproofofconceptstudy AT muderrisogluserra detectingsuicidalideationonforumsproofofconceptstudy AT akbasnazberfu detectingsuicidalideationonforumsproofofconceptstudy AT zahmaciogluoguzhan detectingsuicidalideationonforumsproofofconceptstudy AT bingolhaluko detectingsuicidalideationonforumsproofofconceptstudy |
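
The baseline figures quoted in the description (50% accuracy, 66% F1 score for ZeroR) can be checked with a short calculation. The sketch below is not the authors' code. It assumes a balanced evaluation set and a baseline that labels every post as suicidal, neither of which is stated in this record; the function name and the toy labels are hypothetical.

```python
def constant_baseline_scores(labels, predicted=1):
    """Accuracy and F1 (for class `predicted`) when every post gets the same label.

    Assumes `labels` is non-empty. This is an illustrative helper, not part of
    the study described in this record.
    """
    tp = sum(1 for y in labels if y == predicted)  # posts that truly have the predicted label
    fp = len(labels) - tp                          # all other posts are mislabeled
    fn = 0                                         # nothing is ever predicted as the other class
    accuracy = tp / len(labels)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn) if tp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return accuracy, f1


# Assumed toy data: 50 suicidal (1) and 50 nonsuicidal (0) posts.
labels = [1] * 50 + [0] * 50
accuracy, f1 = constant_baseline_scores(labels)
print(f"accuracy={accuracy:.2f} f1={f1:.3f}")  # accuracy=0.50 f1=0.667
```

Under these assumptions precision is 0.5 and recall is 1.0, so F1 is 2/3, which is consistent with the 66% reported for the baseline.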