AFND: Arabic fake news dataset for the detection and classification of articles credibility
The news credibility detection task has started to gain more attention recently due to the rapid increase of news on different social media platforms. This article provides a large, labeled, and diverse Arabic Fake News Dataset (AFND) that is collected from public Arabic news websites. This dataset...
Main Authors: | Khalil, Ashwaq; Jarrah, Moath; Aldwairi, Monther; Jaradat, Manar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | Data Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9048144/ https://www.ncbi.nlm.nih.gov/pubmed/35496492 http://dx.doi.org/10.1016/j.dib.2022.108141 |
_version_ | 1784695874632286208 |
---|---|
author | Khalil, Ashwaq Jarrah, Moath Aldwairi, Monther Jaradat, Manar |
author_facet | Khalil, Ashwaq Jarrah, Moath Aldwairi, Monther Jaradat, Manar |
author_sort | Khalil, Ashwaq |
collection | PubMed |
description | The news credibility detection task has started to gain more attention recently due to the rapid increase of news on different social media platforms. This article provides a large, labeled, and diverse Arabic Fake News Dataset (AFND) that is collected from public Arabic news websites. This dataset enables the research community to use supervised and unsupervised machine learning algorithms to classify the credibility of Arabic news articles. AFND consists of 606912 public news articles that were scraped from 134 public news websites of 19 different Arab countries over a 6-month period using Python scripts. The Arabic fact-check platform, Misbar, is used manually to classify each public news source into credible, not credible, or undecided. Weak supervision is applied to label news articles with the same label as the public source. AFND is imbalanced in the number of articles in each class. Hence, it is useful for researchers who focus on finding solutions for imbalanced datasets. The dataset is available in JSON format and can be accessed from Mendeley Data repository. |
format | Online Article Text |
id | pubmed-9048144 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-90481442022-04-29 AFND: Arabic fake news dataset for the detection and classification of articles credibility Khalil, Ashwaq Jarrah, Moath Aldwairi, Monther Jaradat, Manar Data Brief Data Article The news credibility detection task has started to gain more attention recently due to the rapid increase of news on different social media platforms. This article provides a large, labeled, and diverse Arabic Fake News Dataset (AFND) that is collected from public Arabic news websites. This dataset enables the research community to use supervised and unsupervised machine learning algorithms to classify the credibility of Arabic news articles. AFND consists of 606912 public news articles that were scraped from 134 public news websites of 19 different Arab countries over a 6-month period using Python scripts. The Arabic fact-check platform, Misbar, is used manually to classify each public news source into credible, not credible, or undecided. Weak supervision is applied to label news articles with the same label as the public source. AFND is imbalanced in the number of articles in each class. Hence, it is useful for researchers who focus on finding solutions for imbalanced datasets. The dataset is available in JSON format and can be accessed from Mendeley Data repository. Elsevier 2022-04-08 /pmc/articles/PMC9048144/ /pubmed/35496492 http://dx.doi.org/10.1016/j.dib.2022.108141 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Data Article Khalil, Ashwaq Jarrah, Moath Aldwairi, Monther Jaradat, Manar AFND: Arabic fake news dataset for the detection and classification of articles credibility |
title | AFND: Arabic fake news dataset for the detection and classification of articles credibility |
title_full | AFND: Arabic fake news dataset for the detection and classification of articles credibility |
title_fullStr | AFND: Arabic fake news dataset for the detection and classification of articles credibility |
title_full_unstemmed | AFND: Arabic fake news dataset for the detection and classification of articles credibility |
title_short | AFND: Arabic fake news dataset for the detection and classification of articles credibility |
title_sort | afnd: arabic fake news dataset for the detection and classification of articles credibility |
topic | Data Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9048144/ https://www.ncbi.nlm.nih.gov/pubmed/35496492 http://dx.doi.org/10.1016/j.dib.2022.108141 |
work_keys_str_mv | AT khalilashwaq afndarabicfakenewsdatasetforthedetectionandclassificationofarticlescredibility AT jarrahmoath afndarabicfakenewsdatasetforthedetectionandclassificationofarticlescredibility AT aldwairimonther afndarabicfakenewsdatasetforthedetectionandclassificationofarticlescredibility AT jaradatmanar afndarabicfakenewsdatasetforthedetectionandclassificationofarticlescredibility |
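
The description field in this record says that each news source is classified manually as credible, not credible, or undecided, and that weak supervision then gives every article the label of its source. A minimal sketch of that propagation step follows. The record does not describe the dataset's JSON schema, so the field names (`source`, `title`, `label`), the source identifiers, and the helper `label_articles` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical source-level labels. In AFND these come from manual
# checks against the Misbar fact-check platform; the identifiers
# below are placeholders, not real sources from the dataset.
SOURCE_LABELS = {
    "source_a": "credible",
    "source_b": "not credible",
    "source_c": "undecided",
}


def label_articles(articles, source_labels):
    """Return copies of the articles, each tagged with its source's label."""
    labeled = []
    for article in articles:
        # Weak supervision: the article inherits the source-level label.
        label = source_labels[article["source"]]
        labeled.append({**article, "label": label})
    return labeled


# Placeholder articles; the field names are assumptions, not the AFND schema.
articles = [
    {"source": "source_a", "title": "Example headline 1"},
    {"source": "source_b", "title": "Example headline 2"},
    {"source": "source_a", "title": "Example headline 3"},
    {"source": "source_c", "title": "Example headline 4"},
]

labeled = label_articles(articles, SOURCE_LABELS)

# Counting articles per class makes any class imbalance visible.
counts = Counter(article["label"] for article in labeled)
print(counts)
```

Because the labels are assigned per source rather than per article, a source with many articles dominates its class, which is one way the imbalance mentioned in the description can arise.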