Optical character recognition system for Baybayin scripts using support vector machine
In 2018, the Philippine Congress signed House Bill 1022 declaring the Baybayin script as the Philippines’ national writing system. In this regard, it is highly probable that the Baybayin and Latin scripts would appear in a single document. In this work, we propose a system that discriminates the cha...
Main authors: | Pino, Rodney; Mendoza, Renier; Sambayan, Rachelle |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Artificial Intelligence |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7959605/ https://www.ncbi.nlm.nih.gov/pubmed/33817010 http://dx.doi.org/10.7717/peerj-cs.360 |
_version_ | 1783664985789956096 |
---|---|
author | Pino, Rodney; Mendoza, Renier; Sambayan, Rachelle |
author_facet | Pino, Rodney; Mendoza, Renier; Sambayan, Rachelle |
author_sort | Pino, Rodney |
collection | PubMed |
description | In 2018, the Philippine Congress signed House Bill 1022 declaring the Baybayin script as the Philippines’ national writing system. As a result, Baybayin and Latin scripts are likely to appear together in a single document. In this work, we propose a system that discriminates between the characters of the two scripts. The proposed system uses the normalized form of each individual character to identify whether it belongs to the Baybayin or the Latin script, and then classifies it according to the unit it represents. This gives four classification problems: (1) Baybayin and Latin script recognition, (2) Baybayin character classification, (3) Latin character classification, and (4) Baybayin diacritical mark classification. To the best of our knowledge, this is the first study to use a Support Vector Machine (SVM) for Baybayin script recognition. This work also provides a new dataset of Baybayin characters, Baybayin diacritics, and Latin characters. Classification problems (1) and (4) use binary SVM, while (2) and (3) use multiclass SVM. On average, our numerical experiments yield satisfactory results: (1) has 98.5% accuracy, 98.5% precision, 98.49% recall, and 98.5% F1 score; (2) has 96.51% accuracy, 95.62% precision, 95.61% recall, and 95.62% F1 score; (3) has 95.8% accuracy, 95.85% precision, 95.8% recall, and 95.83% F1 score; and (4) has 100% accuracy, 100% precision, 100% recall, and 100% F1 score. |
format | Online Article Text |
id | pubmed-7959605 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7959605 2021-04-02 Optical character recognition system for Baybayin scripts using support vector machine Pino, Rodney Mendoza, Renier Sambayan, Rachelle PeerJ Comput Sci Artificial Intelligence PeerJ Inc. 2021-02-15 /pmc/articles/PMC7959605/ /pubmed/33817010 http://dx.doi.org/10.7717/peerj-cs.360 Text en ©2021 Pino et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Artificial Intelligence Pino, Rodney Mendoza, Renier Sambayan, Rachelle Optical character recognition system for Baybayin scripts using support vector machine |
title | Optical character recognition system for Baybayin scripts using support vector machine |
title_sort | optical character recognition system for baybayin scripts using support vector machine |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7959605/ https://www.ncbi.nlm.nih.gov/pubmed/33817010 http://dx.doi.org/10.7717/peerj-cs.360 |
work_keys_str_mv | AT pinorodney opticalcharacterrecognitionsystemforbaybayinscriptsusingsupportvectormachine AT mendozarenier opticalcharacterrecognitionsystemforbaybayinscriptsusingsupportvectormachine AT sambayanrachelle opticalcharacterrecognitionsystemforbaybayinscriptsusingsupportvectormachine |
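
The description above reports accuracy, precision, recall, and F1 score for each of the four classification problems. As a rough illustration of how such figures can be computed from predicted and true labels, here is a minimal sketch in plain Python. It assumes macro averaging over classes, which this record does not state; the function name `classification_metrics` and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging (unweighted mean over classes). The
    catalog record does not say which averaging the authors used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class receives a false positive
            fn[truth] += 1   # true class receives a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for a binary script-recognition task (hypothetical data).
y_true = ["baybayin", "baybayin", "baybayin", "latin", "latin", "latin"]
y_pred = ["baybayin", "baybayin", "latin", "latin", "latin", "latin"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The same function works unchanged for the multiclass problems, since it only counts per-class true positives, false positives, and false negatives before averaging.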