A Deep Learning Approach for Arabic Manuscripts Classification

For centuries, libraries worldwide have preserved ancient manuscripts due to their immense historical and cultural value. However, over time, both natural and human-made factors have led to the degradation of many ancient Arabic manuscripts, causing the loss of significant information, such as autho...

Bibliographic Details
Main Authors: Al-homed, Lutfieh S.; Jambi, Kamal M.; Al-Barhamtoshy, Hassanin M.
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10575097/
https://www.ncbi.nlm.nih.gov/pubmed/37836963
http://dx.doi.org/10.3390/s23198133
author Al-homed, Lutfieh S.
Jambi, Kamal M.
Al-Barhamtoshy, Hassanin M.
collection PubMed
description For centuries, libraries worldwide have preserved ancient manuscripts due to their immense historical and cultural value. However, over time, both natural and human-made factors have led to the degradation of many ancient Arabic manuscripts, causing the loss of significant information, such as authorship, titles, or subjects, and leaving them as unknown manuscripts. Although catalog cards attached to these manuscripts might contain some of the missing details, these cards have degraded significantly in quality over the decades within libraries. This paper presents a framework for identifying these unknown ancient Arabic manuscripts by processing the catalog cards associated with them. Given the challenges posed by the degradation of these cards, simple optical character recognition (OCR) is often insufficient. The proposed framework uses a deep learning architecture to identify unknown manuscripts within a collection of ancient Arabic documents. This involves locating, extracting, and classifying the text from these catalog cards, along with implementing processes for region-of-interest identification, rotation correction, feature extraction, and classification. The results demonstrate the effectiveness of the proposed method, which achieves an accuracy rate of 92.5%, compared to 83.5% with classical image classification and 81.5% with OCR alone.
format Online
Article
Text
id pubmed-10575097
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10575097 2023-10-14 A Deep Learning Approach for Arabic Manuscripts Classification Al-homed, Lutfieh S. Jambi, Kamal M. Al-Barhamtoshy, Hassanin M. Sensors (Basel) Article MDPI 2023-09-28 /pmc/articles/PMC10575097/ /pubmed/37836963 http://dx.doi.org/10.3390/s23198133 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Deep Learning Approach for Arabic Manuscripts Classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10575097/
https://www.ncbi.nlm.nih.gov/pubmed/37836963
http://dx.doi.org/10.3390/s23198133