Bootstrapping semi-supervised annotation method for potential suicidal messages

The suicide of a person is a tragedy that deeply affects families, communities, and countries. According to the standardized rate of suicides per number of inhabitants worldwide, in 2022 there will be approximately 903,450 suicides and 18,069,000 suicide attempts, affecting people of all...

Bibliographic Details
Main authors: Acuña Caicedo, Roberto Wellington, Gómez Soriano, José Manuel, Melgar Sasieta, Héctor Andrés
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8913319/
https://www.ncbi.nlm.nih.gov/pubmed/35281704
http://dx.doi.org/10.1016/j.invent.2022.100519
author Acuña Caicedo, Roberto Wellington
Gómez Soriano, José Manuel
Melgar Sasieta, Héctor Andrés
collection PubMed
description The suicide of a person is a tragedy that deeply affects families, communities, and countries. According to the standardized rate of suicides per number of inhabitants worldwide, in 2022 there will be approximately 903,450 suicides and 18,069,000 suicide attempts, affecting people of every age, country, race, belief, social and economic status, and sex. Because users of social networks post about their suicidal intentions, researchers have begun working to detect such messages and to encourage their authors not to take their own lives. This study set out to define a semi-supervised method, based on a bootstrapping technique, for populating the Life Corpus: starting from an initial set of supervised samples, it automatically detects and classifies texts related to suicide and depression extracted from social networks and forums. The experiments used two classifiers: a Support Vector Machine (SVM) with Bag of Words (BoW) features, with and without Term Frequency/Inverse Document Frequency (Tf/Idf) term weighting and with and without stopwords, and Rasa with its default feature extraction system. The experiments covered five data collections: Life, Reddit, Life+Reddit, Life_en, and Life_en + Reddit. With the semi-supervised method, the Life Corpus grew from 102 to 273 samples using texts from the social network Reddit; the combination Life+Reddit+BoW_Embeddings with the SVM classifier achieved a macro F1 of 0.80. Human annotators then evaluated these texts manually, reaching a Cohen's Kappa agreement of 0.86.
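The bootstrapping idea summarized in the description can be illustrated with a minimal self-training sketch. It is not the authors' implementation. The `CentroidClassifier` below is a hypothetical bag-of-words stand-in for the SVM and Rasa classifiers, and the `threshold` and `max_rounds` parameters, the label names, and the example texts are all assumptions made for illustration. The sketch also omits the manual review by annotators that the study reports.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not the paper's preprocessing)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class CentroidClassifier:
    """Hypothetical stand-in classifier: one summed bag-of-words vector per label."""

    def fit(self, texts, labels):
        self.centroids = {}
        for text, label in zip(texts, labels):
            self.centroids.setdefault(label, Counter()).update(tokenize(text))
        return self

    def predict_with_confidence(self, text):
        bow = Counter(tokenize(text))
        scores = {label: cosine(bow, c) for label, c in self.centroids.items()}
        best = max(scores, key=scores.get)
        total = sum(scores.values())
        # Confidence is the winning label's share of the total similarity.
        return best, (scores[best] / total if total else 0.0)


def bootstrap(seed_texts, seed_labels, unlabeled, threshold=0.8, max_rounds=5):
    """Grow a labeled set by repeatedly adding confident predictions."""
    texts, labels = list(seed_texts), list(seed_labels)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = CentroidClassifier().fit(texts, labels)
        accepted, remaining = [], []
        for text in pool:
            label, confidence = model.predict_with_confidence(text)
            if confidence >= threshold:
                accepted.append((text, label))
            else:
                remaining.append(text)
        if not accepted:
            break  # nothing confident enough: stop early
        for text, label in accepted:
            texts.append(text)
            labels.append(label)
        pool = remaining
    return texts, labels, pool


if __name__ == "__main__":
    grown_texts, grown_labels, leftover = bootstrap(
        ["i feel hopeless and want to die", "great hike with friends today"],
        ["risk", "no_risk"],
        ["i feel hopeless tonight", "great hike today", "the weather report"],
    )
    for text, label in zip(grown_texts, grown_labels):
        print(label, "|", text)
    print("left unlabeled:", leftover)
```

Each round retrains on the enlarged labeled set, so samples accepted early influence later decisions. For that reason a conservative threshold and a human check of the accepted samples are usually advisable.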
format Online
Article
Text
id pubmed-8913319
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8913319 2022-03-12 Internet Interv Review Article
Elsevier 2022-02-28 /pmc/articles/PMC8913319/ /pubmed/35281704 http://dx.doi.org/10.1016/j.invent.2022.100519 Text en © 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Bootstrapping semi-supervised annotation method for potential suicidal messages
topic Review Article