Sanadset 650K: Data on Hadith narrators

The chain of narrators (Sanad) plays a vital role in determining the authenticity of Islamic hadiths. However, investigating and validating a Sanad depends entirely on Hadith scholars. They rely on their acquired knowledge, which requires a significant amount of exer...

Bibliographic Details
Main Authors:	Mghari, Mohammed; Bouras, Omar; El Hibaoui, Abdelaaziz
Format:	Online Article Text
Language:	English
Published:	Elsevier 2022
Subjects:	Data Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9440281/ https://www.ncbi.nlm.nih.gov/pubmed/36065202 http://dx.doi.org/10.1016/j.dib.2022.108540

_version_	1784782307336388608
author	Mghari, Mohammed; Bouras, Omar; El Hibaoui, Abdelaaziz
author_sort	Mghari, Mohammed
collection	PubMed
description	The chain of narrators (Sanad) plays a vital role in determining the authenticity of Islamic hadiths. However, investigating and validating a Sanad depends entirely on Hadith scholars, who rely on their acquired knowledge, a process that demands considerable effort and time. Automated Sanad evaluation using machine learning algorithms is a promising way to address this problem, and it requires a representative Sanad dataset. This paper presents a full hadith dataset, named Sanadset, that is openly accessible to researchers. The Sanadset corpus contains over 650,986 records collected from 926 historical Arabic books of hadith. The dataset can be used for further investigation and classification of hadiths (strong/weak) and narrators (trustworthy/not) using AI techniques, and as a linguistic resource for Arabic Natural Language Processing. The dataset was collected from online Hadith sources using data scraping and web crawling. Its main contribution is the extraction of narrator chains that were originally present in textual form within Hadith books. Each observation in the dataset contains complete information about a specific hadith: the original book, number, Hadith text, Matn, list of narrators, and number of narrators.
format	Online Article Text
id	pubmed-9440281
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Elsevier
record_format	MEDLINE/PubMed
spelling	pubmed-9440281 2022-09-04 Sanadset 650K: Data on Hadith narrators Mghari, Mohammed; Bouras, Omar; El Hibaoui, Abdelaaziz Data Brief Data Article Elsevier 2022-08-17 /pmc/articles/PMC9440281/ /pubmed/36065202 http://dx.doi.org/10.1016/j.dib.2022.108540 Text en © 2022 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title	Sanadset 650K: Data on Hadith narrators
topic	Data Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9440281/ https://www.ncbi.nlm.nih.gov/pubmed/36065202 http://dx.doi.org/10.1016/j.dib.2022.108540
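
The record layout named in the description (original book, number, Hadith text, Matn, list of narrators, number of narrators) could be modelled roughly as follows. This is a minimal sketch only: the class name, field names, and types are assumptions made for illustration and are not taken from the published dataset files, and the sample values are placeholders, not real hadith data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HadithRecord:
    """One observation, using hypothetical field names."""
    book: str             # original book the hadith was taken from
    number: int           # hadith number within that book
    hadith_text: str      # full text (Sanad followed by Matn)
    matn: str             # body of the hadith without the narrator chain
    narrators: List[str]  # extracted chain of narrators, in order

    @property
    def narrator_count(self) -> int:
        # Derived from the list so the two values cannot disagree.
        return len(self.narrators)


def count_matches(record: HadithRecord, stated_count: int) -> bool:
    """Check a stored narrator count against the extracted chain."""
    return record.narrator_count == stated_count


# Placeholder values only.
example = HadithRecord(
    book="Book A",
    number=1,
    hadith_text="Narrator A from Narrator B from Narrator C: sample text",
    matn="sample text",
    narrators=["Narrator A", "Narrator B", "Narrator C"],
)
```

Deriving the count from the list, rather than storing it separately, is one possible design choice; a loader for the actual files might instead keep the stored count and use a check like `count_matches` to flag rows where extraction went wrong.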

Similar Items