An aspect-level sentiment analysis dataset for therapies on Twitter
The dataset described is an aspect-level sentiment analysis dataset for therapies, including medication, behavioral and other therapies, created by leveraging user-generated text from Twitter. The dataset was constructed by collecting Twitter posts using keywords associated with the therapies (often...
Main authors: | Guo, Yuting; Das, Sudeshna; Lakamana, Sahithi; Sarker, Abeed |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | Data Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10558704/ https://www.ncbi.nlm.nih.gov/pubmed/37808542 http://dx.doi.org/10.1016/j.dib.2023.109618 |
_version_ | 1785117336904138752 |
---|---|
author | Guo, Yuting Das, Sudeshna Lakamana, Sahithi Sarker, Abeed |
author_sort | Guo, Yuting |
collection | PubMed |
description | The dataset described is an aspect-level sentiment analysis dataset for therapies, including medication, behavioral and other therapies, created by leveraging user-generated text from Twitter. The dataset was constructed by collecting Twitter posts using keywords associated with the therapies (often referred to as treatments). Subsets of the collected posts were then manually reviewed, and annotation guidelines were developed to categorize each post as positive, negative, or neutral. The dataset contains 5364 posts mentioning 32 therapies, manually labeled as 998 (18.6%) positive, 619 (11.5%) negative, and 3747 (69.9%) neutral. Inter-annotator agreement, measured with Cohen's kappa, was 0.82. The dataset is intended to support the development of automatic systems that detect users' sentiments toward therapies from their posts. While other sentiment analysis datasets are available, this is the first that encodes sentiments associated with specific therapies. Researchers and developers can use it to train sentiment analysis models, natural language processing algorithms, or machine learning systems that identify and analyze the sentiments consumers express on social media platforms such as Twitter. |
format | Online Article Text |
id | pubmed-10558704 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-10558704 2023-10-08. Data Brief, Data Article. Elsevier, 2023-09-23. © 2023 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | An aspect-level sentiment analysis dataset for therapies on Twitter |
topic | Data Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10558704/ https://www.ncbi.nlm.nih.gov/pubmed/37808542 http://dx.doi.org/10.1016/j.dib.2023.109618 |
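The label percentages and the agreement statistic quoted in the description can be checked with a few lines of Python. The sketch below recomputes the class shares from the reported counts and shows how Cohen's kappa is calculated for two annotators. The function names and the toy annotation lists are illustrative assumptions; they are not part of the dataset or of the authors' tooling, and the toy labels do not reproduce the reported 0.82.

```python
from collections import Counter

# Label counts as reported in the record's description.
REPORTED_COUNTS = {"positive": 998, "negative": 619, "neutral": 3747}


def class_percentages(counts):
    """Return each class's share of the total as a percentage (one decimal)."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations, NOT taken from the dataset.
annotator_1 = ["neutral", "neutral", "positive", "negative", "neutral", "positive"]
annotator_2 = ["neutral", "neutral", "positive", "neutral", "neutral", "positive"]

print(class_percentages(REPORTED_COUNTS))
print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

The heavy skew toward the neutral class (about 70% of posts) is worth keeping in mind when training on this data: a classifier that always predicts neutral would already reach roughly that accuracy, so per-class metrics are likely more informative than overall accuracy.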