Human-annotated dataset for social media sentiment analysis for Albanian language
Social media was heavily used by people in different countries to express their opinions about different crises, especially during the Covid-19 pandemic. This dataset was created by collecting people's comments on news items on the official Facebook site of the National Inst...
Main authors: | Kadriu, Fatbardh; Murtezaj, Doruntina; Gashi, Fatbardh; Ahmedi, Lule; Kurti, Arianit; Kastrati, Zenun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9272335/ https://www.ncbi.nlm.nih.gov/pubmed/35832321 http://dx.doi.org/10.1016/j.dib.2022.108436 |
_version_ | 1784744853662334976 |
---|---|
author | Kadriu, Fatbardh Murtezaj, Doruntina Gashi, Fatbardh Ahmedi, Lule Kurti, Arianit Kastrati, Zenun |
author_sort | Kadriu, Fatbardh |
collection | PubMed |
description | Social media was heavily used by people in different countries to express their opinions about different crises, especially during the Covid-19 pandemic. This dataset was created by collecting people's comments on news items on the official Facebook site of the National Institute of Public Health of Kosovo. The dataset contains a total of 10,132 human-annotated comments in Albanian, a low-resource language. The comments were collected from March 12, 2020, which coincides with the first confirmed Covid-19 case in Kosovo, until August 31, 2020, when the second wave started. Given the scarcity of labeled data for low-resource languages, the dataset can be used by the research community in machine learning, information retrieval, and affective computing, as well as by public agencies and decision makers. |
format | Online Article Text |
id | pubmed-9272335 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-9272335 2022-07-12 Human-annotated dataset for social media sentiment analysis for Albanian language Kadriu, Fatbardh Murtezaj, Doruntina Gashi, Fatbardh Ahmedi, Lule Kurti, Arianit Kastrati, Zenun Data Brief Data Article Social media was heavily used by people in different countries to express their opinions about different crises, especially during the Covid-19 pandemic. This dataset was created by collecting people's comments on news items on the official Facebook site of the National Institute of Public Health of Kosovo. The dataset contains a total of 10,132 human-annotated comments in Albanian, a low-resource language. The comments were collected from March 12, 2020, which coincides with the first confirmed Covid-19 case in Kosovo, until August 31, 2020, when the second wave started. Given the scarcity of labeled data for low-resource languages, the dataset can be used by the research community in machine learning, information retrieval, and affective computing, as well as by public agencies and decision makers. Elsevier 2022-07-02 /pmc/articles/PMC9272335/ /pubmed/35832321 http://dx.doi.org/10.1016/j.dib.2022.108436 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | Human-annotated dataset for social media sentiment analysis for Albanian language |
topic | Data Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9272335/ https://www.ncbi.nlm.nih.gov/pubmed/35832321 http://dx.doi.org/10.1016/j.dib.2022.108436 |