Automatic Detection of Twitter Users Who Express Chronic Stress Experiences via Supervised Machine Learning and Natural Language Processing
Americans bear a high chronic stress burden, particularly during the COVID-19 pandemic. Although social media have many strengths that complement the weaknesses of conventional stress measures, including surveys, they have rarely been used to detect individuals self-reporting chronic stress. Thus,...
Main authors: | Yang, Yuan-Chi; Xie, Angel; Kim, Sangmi; Hair, Jessica; Al-Garadi, Mohammed; Sarker, Abeed |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Lippincott Williams & Wilkins, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10510804/ https://www.ncbi.nlm.nih.gov/pubmed/36445331 http://dx.doi.org/10.1097/CIN.0000000000000985 |
_version_ | 1785108021451423744 |
---|---|
author | Yang, Yuan-Chi Xie, Angel Kim, Sangmi Hair, Jessica Al-Garadi, Mohammed Sarker, Abeed |
author_facet | Yang, Yuan-Chi Xie, Angel Kim, Sangmi Hair, Jessica Al-Garadi, Mohammed Sarker, Abeed |
author_sort | Yang, Yuan-Chi |
collection | PubMed |
description | Americans bear a high chronic stress burden, particularly during the COVID-19 pandemic. Although social media have many strengths that complement the weaknesses of conventional stress measures, including surveys, they have rarely been used to detect individuals self-reporting chronic stress. Thus, this study aimed to develop and evaluate an automatic system on Twitter to identify users who have self-reported chronic stress experiences. Using the Twitter public streaming application programming interface, we collected tweets containing certain stress-related keywords (eg, “chronic,” “constant,” “stress”) and then filtered the data using pre-defined text patterns. We manually annotated each tweet as positive if it contained a self-report of chronic stress and as negative otherwise. We trained multiple classifiers and tested them via accuracy and F1 score. We annotated 4195 tweets (1560 positives, 2635 negatives), achieving an inter-annotator agreement of 0.83 (Cohen's kappa). The classifier based on Bidirectional Encoder Representations from Transformers (BERT) performed the best (accuracy of 83.6% [81.0-86.1]), outperforming the second best-performing classifier (support vector machines: 76.4% [73.5-79.3]). The past tweets from the authors of positive tweets contained useful information, including sources and health impacts of chronic stress. Our study demonstrates that users' self-reported chronic stress experiences can be automatically identified on Twitter, which has a high potential for surveillance and large-scale intervention. |
format | Online Article Text |
id | pubmed-10510804 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Lippincott Williams & Wilkins |
record_format | MEDLINE/PubMed |
spelling | pubmed-105108042023-09-21 Automatic Detection of Twitter Users Who Express Chronic Stress Experiences via Supervised Machine Learning and Natural Language Processing Yang, Yuan-Chi Xie, Angel Kim, Sangmi Hair, Jessica Al-Garadi, Mohammed Sarker, Abeed Comput Inform Nurs Features Americans bear a high chronic stress burden, particularly during the COVID-19 pandemic. Although social media have many strengths to complement the weaknesses of conventional stress measures, including surveys, they have been rarely utilized to detect individuals self-reporting chronic stress. Thus, this study aimed to develop and evaluate an automatic system on Twitter to identify users who have self-reported chronic stress experiences. Using the Twitter public streaming application programming interface, we collected tweets containing certain stress-related keywords (eg, “chronic,” “constant,” “stress”) and then filtered the data using pre-defined text patterns. We manually annotated tweets with (without) self-report of chronic stress as positive (negative). We trained multiple classifiers and tested them via accuracy and F(1) score. We annotated 4195 tweets (1560 positives, 2635 negatives), achieving an inter-annotator agreement of 0.83 (Cohen's kappa). The classifier based on Bidirectional Encoder Representation from Transformers performed the best (accuracy of 83.6% [81.0-86.1]), outperforming the second best-performing classifier (support vector machines: 76.4% [73.5-79.3]). The past tweets from the authors of positive tweets contained useful information, including sources and health impacts of chronic stress. Our study demonstrates that users' self-reported chronic stress experiences can be automatically identified on Twitter, which has a high potential for surveillance and large-scale intervention. Lippincott Williams & Wilkins 2022-11-29 /pmc/articles/PMC10510804/ /pubmed/36445331 http://dx.doi.org/10.1097/CIN.0000000000000985 Text en Copyright © 2022 The Authors. 
Published by Wolters Kluwer Health, Inc. https://creativecommons.org/licenses/by-nc-nd/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/) , where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. |
title | Automatic Detection of Twitter Users Who Express Chronic Stress Experiences via Supervised Machine Learning and Natural Language Processing |
topic | Features |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10510804/ https://www.ncbi.nlm.nih.gov/pubmed/36445331 http://dx.doi.org/10.1097/CIN.0000000000000985 |
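
The evaluation quantities named in the description (Cohen's kappa for inter-annotator agreement, and accuracy and F1 score for classifier testing) can be illustrated with a minimal sketch. The labels below are invented toy data, not the study's annotations, and the function names `cohen_kappa` and `accuracy_and_f1` are hypothetical; this record does not describe the tooling the authors actually used.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def accuracy_and_f1(gold, predicted, positive=1):
    """Accuracy and F1 score of the positive class for binary predictions."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal in length")
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy labels (assumption): 1 = self-report of chronic stress, 0 = no self-report.
annotator_a = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
annotator_b = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
classifier_output = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]

kappa = cohen_kappa(annotator_a, annotator_b)
accuracy, f1 = accuracy_and_f1(annotator_a, classifier_output)
print(f"kappa={kappa:.2f} accuracy={accuracy:.2f} f1={f1:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is reported alongside, rather than replaced by, the simple percentage of matching labels.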