Automatic Detection of Twitter Users Who Express Chronic Stress Experiences via Supervised Machine Learning and Natural Language Processing

Americans bear a high chronic stress burden, particularly during the COVID-19 pandemic. Although social media have many strengths to complement the weaknesses of conventional stress measures, including surveys, they have been rarely utilized to detect individuals self-reporting chronic stress. Thus,...

Bibliographic Details
Main Authors: Yang, Yuan-Chi, Xie, Angel, Kim, Sangmi, Hair, Jessica, Al-Garadi, Mohammed, Sarker, Abeed
Format: Online Article Text
Language: English
Published: Lippincott Williams & Wilkins 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10510804/
https://www.ncbi.nlm.nih.gov/pubmed/36445331
http://dx.doi.org/10.1097/CIN.0000000000000985
author Yang, Yuan-Chi
Xie, Angel
Kim, Sangmi
Hair, Jessica
Al-Garadi, Mohammed
Sarker, Abeed
author_sort Yang, Yuan-Chi
collection PubMed
description Americans bear a high chronic stress burden, particularly during the COVID-19 pandemic. Although social media have many strengths that complement the weaknesses of conventional stress measures, including surveys, they have rarely been used to detect individuals self-reporting chronic stress. Thus, this study aimed to develop and evaluate an automatic system on Twitter to identify users who have self-reported chronic stress experiences. Using the Twitter public streaming application programming interface, we collected tweets containing certain stress-related keywords (eg, “chronic,” “constant,” “stress”) and then filtered the data using predefined text patterns. We manually annotated tweets containing a self-report of chronic stress as positive and tweets without one as negative. We trained multiple classifiers and evaluated them using accuracy and F1 score. We annotated 4195 tweets (1560 positives, 2635 negatives), achieving an inter-annotator agreement of 0.83 (Cohen's kappa). The classifier based on Bidirectional Encoder Representations from Transformers (BERT) performed best (accuracy of 83.6% [81.0-86.1]), outperforming the second best-performing classifier (support vector machines: 76.4% [73.5-79.3]). The past tweets from the authors of positive tweets contained useful information, including sources and health impacts of chronic stress. Our study demonstrates that users' self-reported chronic stress experiences can be automatically identified on Twitter, which has high potential for surveillance and large-scale intervention.
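
The workflow summarized in the abstract (keyword and pattern filtering, agreement between two annotators, then accuracy and F1 for a classifier) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code. The regular expression, the helper names `is_candidate`, `cohen_kappa`, and `accuracy_and_f1`, and the toy labels are all assumptions made for illustration; the record does not give the actual text patterns or data.

```python
import re
from collections import Counter

# Keywords quoted in the abstract; the full keyword list is not given in this record.
KEYWORDS = ("chronic", "constant", "stress")

# Hypothetical self-report pattern (an assumption, not the study's pattern):
# a first-person word followed later by "chronic stress" or "constant stress".
SELF_REPORT = re.compile(r"\b(?:i|my)\b.*\b(?:chronic|constant)\s+stress\b", re.IGNORECASE)


def is_candidate(tweet: str) -> bool:
    """Keep a tweet if it mentions a keyword and matches the self-report pattern."""
    text = tweet.lower()
    return any(k in text for k in KEYWORDS) and SELF_REPORT.search(tweet) is not None


def cohen_kappa(labels_a, labels_b) -> float:
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    n = len(labels_a)
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1:  # both annotators used a single identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


def accuracy_and_f1(gold, predicted, positive=1):
    """Accuracy over all items and F1 score for the positive class."""
    pairs = list(zip(gold, predicted))
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Toy labels for illustration only: 1 = self-report of chronic stress, 0 = not.
    annotator_a = [1, 1, 0, 0]
    annotator_b = [1, 0, 0, 0]
    print(is_candidate("My chronic stress is ruining my sleep"))
    print(cohen_kappa(annotator_a, annotator_b))
    print(accuracy_and_f1(annotator_a, annotator_b))
```

In a setup like the one described, the kappa function would be applied to the two annotators' labels on doubly annotated tweets, and the accuracy and F1 function to a held-out test set comparing gold labels with classifier predictions.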
format Online
Article
Text
id pubmed-10510804
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Lippincott Williams & Wilkins
record_format MEDLINE/PubMed
spelling pubmed-10510804 2023-09-21 Automatic Detection of Twitter Users Who Express Chronic Stress Experiences via Supervised Machine Learning and Natural Language Processing Yang, Yuan-Chi Xie, Angel Kim, Sangmi Hair, Jessica Al-Garadi, Mohammed Sarker, Abeed Comput Inform Nurs Features Lippincott Williams & Wilkins 2022-11-29 /pmc/articles/PMC10510804/ /pubmed/36445331 http://dx.doi.org/10.1097/CIN.0000000000000985 Text en Copyright © 2022 The Authors.
Published by Wolters Kluwer Health, Inc. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.
title Automatic Detection of Twitter Users Who Express Chronic Stress Experiences via Supervised Machine Learning and Natural Language Processing
topic Features
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10510804/
https://www.ncbi.nlm.nih.gov/pubmed/36445331
http://dx.doi.org/10.1097/CIN.0000000000000985