
Human-annotated dataset for social media sentiment analysis for Albanian language

People in many countries used social media heavily to express their opinions about crises, especially during the Covid-19 pandemic. This dataset was created by collecting people's comments on news items on the official Facebook page of the National Inst...


Bibliographic Details
Main authors: Kadriu, Fatbardh; Murtezaj, Doruntina; Gashi, Fatbardh; Ahmedi, Lule; Kurti, Arianit; Kastrati, Zenun
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9272335/
https://www.ncbi.nlm.nih.gov/pubmed/35832321
http://dx.doi.org/10.1016/j.dib.2022.108436
author Kadriu, Fatbardh
Murtezaj, Doruntina
Gashi, Fatbardh
Ahmedi, Lule
Kurti, Arianit
Kastrati, Zenun
author_sort Kadriu, Fatbardh
collection PubMed
description People in many countries used social media heavily to express their opinions about crises, especially during the Covid-19 pandemic. This dataset was created by collecting people's comments on news items posted on the official Facebook page of the National Institute of Public Health of Kosovo. It contains 10,132 human-annotated comments in Albanian, a low-resource language. The comments were collected from March 12, 2020, which coincides with the first confirmed Covid-19 case in Kosovo, until August 31, 2020, when the second wave started. Because labeled data for low-resource languages is scarce, the dataset can be used by the research community in machine learning, information retrieval, and affective computing, as well as by public agencies and decision makers.
format Online
Article
Text
id pubmed-9272335
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9272335 2022-07-12 Human-annotated dataset for social media sentiment analysis for Albanian language Kadriu, Fatbardh Murtezaj, Doruntina Gashi, Fatbardh Ahmedi, Lule Kurti, Arianit Kastrati, Zenun Data Brief Data Article People in many countries used social media heavily to express their opinions about crises, especially during the Covid-19 pandemic. This dataset was created by collecting people's comments on news items posted on the official Facebook page of the National Institute of Public Health of Kosovo. It contains 10,132 human-annotated comments in Albanian, a low-resource language. The comments were collected from March 12, 2020, which coincides with the first confirmed Covid-19 case in Kosovo, until August 31, 2020, when the second wave started. Because labeled data for low-resource languages is scarce, the dataset can be used by the research community in machine learning, information retrieval, and affective computing, as well as by public agencies and decision makers. Elsevier 2022-07-02 /pmc/articles/PMC9272335/ /pubmed/35832321 http://dx.doi.org/10.1016/j.dib.2022.108436 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Human-annotated dataset for social media sentiment analysis for Albanian language
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9272335/
https://www.ncbi.nlm.nih.gov/pubmed/35832321
http://dx.doi.org/10.1016/j.dib.2022.108436