Residual-based approach for authenticating pattern of multi-style diacritical Arabic texts

Arabic script is highly sensitive to changes in meaning with respect to the accurate arrangement of diacritics and other related symbols. The most sensitive Arabic text available online is the Digital Qur’an, the sacred book of Revelation in Islam that all Muslims including non-Arabs recite as part...

Bibliographic Details
Main Authors: Hakak, Saqib, Kamsin, Amirrudin, Palaiahnakote, Shivakumara, Tayan, Omar, Idna Idris, Mohd. Yamani, Abukhir, Khir Zuhaili
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6010264/
https://www.ncbi.nlm.nih.gov/pubmed/29924810
http://dx.doi.org/10.1371/journal.pone.0198284
author Hakak, Saqib
Kamsin, Amirrudin
Palaiahnakote, Shivakumara
Tayan, Omar
Idna Idris, Mohd. Yamani
Abukhir, Khir Zuhaili
collection PubMed
description Arabic script is highly sensitive to changes in meaning with respect to the accurate arrangement of diacritics and other related symbols. The most sensitive Arabic text available online is the Digital Qur’an, the sacred book of Revelation in Islam that all Muslims including non-Arabs recite as part of their worship. Due to the different characteristics of the Arabic letters, such as diacritics (punctuation symbols), kashida (extended letters) and other symbols, it is written and available in different styles such as Kufi, Naskh, Thuluth and Uthmani. As social media has become part of our daily life, posting downloaded Qur’anic verses from the web is common. This leads to the problem of authenticating the selected Qur’anic passages available in different styles. This paper presents a residual approach for authenticating Uthmani and plain Qur’an verses using one common database. The residual (difference) is obtained by analyzing the differences between the Uthmani and plain Qur’anic styles using an XOR operation. Based on predefined data, the proposed approach converts Uthmani text into plain text. Furthermore, we propose to use the Tuned Boyer-Moore (BMT) exact pattern-matching algorithm to verify the substituted Uthmani verse against a given database of plain Qur’anic style. Experimental results show that the proposed approach is useful and effective in authenticating multi-style texts of the Qur’an with 87.1% accuracy.
format Online
Article
Text
id pubmed-6010264
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6010264 2018-07-06 Residual-based approach for authenticating pattern of multi-style diacritical Arabic texts Hakak, Saqib Kamsin, Amirrudin Palaiahnakote, Shivakumara Tayan, Omar Idna Idris, Mohd. Yamani Abukhir, Khir Zuhaili PLoS One Research Article
Public Library of Science 2018-06-20 /pmc/articles/PMC6010264/ /pubmed/29924810 http://dx.doi.org/10.1371/journal.pone.0198284 Text en © 2018 Hakak et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Residual-based approach for authenticating pattern of multi-style diacritical Arabic texts
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6010264/
https://www.ncbi.nlm.nih.gov/pubmed/29924810
http://dx.doi.org/10.1371/journal.pone.0198284
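
The pipeline summarized in the abstract (convert Uthmani text to plain text, compare by XOR, then verify with Tuned Boyer-Moore exact matching) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the substitution table, the sample verse fragment, the one-entry database, and all function names are assumptions chosen for illustration, and a real Uthmani-to-plain conversion needs many more rules than stripping combining marks.

```python
import unicodedata
from itertools import zip_longest

# Hypothetical substitution table (an assumption, not the paper's data).
SUBSTITUTIONS = {"\u0671": "\u0627"}  # alef wasla -> plain alef
TATWEEL = "\u0640"  # kashida (elongation character)

def to_plain(text):
    """Drop combining marks and kashida, then apply the substitution table."""
    kept = (c for c in text if not unicodedata.combining(c) and c != TATWEEL)
    return "".join(SUBSTITUTIONS.get(c, c) for c in kept)

def xor_residual(a, b):
    """Code-point XOR per position; a nonzero entry marks a difference."""
    return [ord(x) ^ ord(y) for x, y in zip_longest(a, b, fillvalue="\0")]

def tuned_bm(pattern, text):
    """Tuned Boyer-Moore: return all start indices of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character shifts computed from every pattern character but the last.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    last = pattern[-1]
    step = shift.get(last, m)   # shift applied after each checked alignment
    shift[last] = 0             # a zero shift stops the skip loop
    padded = text + last * m    # sentinel so the skip loop always terminates
    hits, j = [], 0
    while j <= n - m:
        k = shift.get(padded[j + m - 1], m)
        while k != 0:           # skip loop, unrolled three times
            j += k
            k = shift.get(padded[j + m - 1], m)
            j += k
            k = shift.get(padded[j + m - 1], m)
            j += k
            k = shift.get(padded[j + m - 1], m)
        # The last character already matches; compare the remaining prefix.
        if j <= n - m and padded[j:j + m - 1] == pattern[:-1]:
            hits.append(j)
        j += step
    return hits

def authenticate(verse, database):
    """True if the plain form of verse occurs in any database entry."""
    plain = to_plain(verse)
    return any(tuned_bm(plain, entry) for entry in database)

# Illustrative fragment with diacritics and alef wasla, and its plain form.
UTHMANI = ("\u0628\u0650\u0633\u0652\u0645\u0650 "
           "\u0671\u0644\u0644\u0651\u064e\u0647\u0650")
PLAIN = "\u0628\u0633\u0645 \u0627\u0644\u0644\u0647"
DATABASE = [PLAIN + " \u0627\u0644\u0631\u062d\u0645\u0646"]  # hypothetical

if __name__ == "__main__":
    print(any(xor_residual(UTHMANI, PLAIN)))            # styles differ
    print(any(xor_residual(to_plain(UTHMANI), PLAIN)))  # zero after conversion
    print(authenticate(UTHMANI, DATABASE))
```

Because the final step is exact matching, a single altered letter in the input causes the lookup to fail, which is the property an authentication check relies on.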