
FAN-MCCD: Fast and Accurate Network for Multi-Scale Chinese Character Detection

Inaccurate localization caused by scale variation during character detection is a widespread problem in the document analysis community and leads to overconfidence in results, particularly for historical and handwritten documents. In this work, we explored the performance of a state-of-the-art network with a...


Bibliographic Details
Main authors: Alnaasan, Manar; Kim, Sungho
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8586960/
https://www.ncbi.nlm.nih.gov/pubmed/34770596
http://dx.doi.org/10.3390/s21217289
_version_ 1784597989891768320
author Alnaasan, Manar
Kim, Sungho
collection PubMed
description Inaccurate localization caused by scale variation during character detection is a widespread problem in the document analysis community and leads to overconfidence in results, particularly for historical and handwritten documents. In this work, we explored the performance of a state-of-the-art network with a simple pipeline that quickly and accurately predicts handwritten Chinese characters in old documents. To localize characters of multiple scales more precisely without pre-processing or intermediate steps, we used a network with multi-scale feature maps. Across each feature map, pre-selected boxes of different scales and aspect ratios were then applied. In the last step, the bounding boxes were pruned by non-maximum suppression to yield the final results. With a well-designed neural network architecture and a loss function that presents well-classified examples, our experiments on the Caoshu, Character, and Src-images datasets showed improved detection performance: the detection rate (DT), false positives per character (FPPC), and F-score were 98.84%, 0.71, and 97.64%, respectively. By comparison, SSD (single-shot detector) achieved a DT, FPPC, and F-score of 61.12%, 6.12, and 60.33%, respectively.
format Online
Article
Text
id pubmed-8586960
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8586960 2021-11-13 FAN-MCCD: Fast and Accurate Network for Multi-Scale Chinese Character Detection Alnaasan, Manar; Kim, Sungho Sensors (Basel) Article Inaccurate localization caused by scale variation during character detection is a widespread problem in the document analysis community and leads to overconfidence in results, particularly for historical and handwritten documents. In this work, we explored the performance of a state-of-the-art network with a simple pipeline that quickly and accurately predicts handwritten Chinese characters in old documents. To localize characters of multiple scales more precisely without pre-processing or intermediate steps, we used a network with multi-scale feature maps. Across each feature map, pre-selected boxes of different scales and aspect ratios were then applied. In the last step, the bounding boxes were pruned by non-maximum suppression to yield the final results. With a well-designed neural network architecture and a loss function that presents well-classified examples, our experiments on the Caoshu, Character, and Src-images datasets showed improved detection performance: the detection rate (DT), false positives per character (FPPC), and F-score were 98.84%, 0.71, and 97.64%, respectively. By comparison, SSD (single-shot detector) achieved a DT, FPPC, and F-score of 61.12%, 6.12, and 60.33%, respectively. MDPI 2021-11-02 /pmc/articles/PMC8586960/ /pubmed/34770596 http://dx.doi.org/10.3390/s21217289 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title FAN-MCCD: Fast and Accurate Network for Multi-Scale Chinese Character Detection
topic Article
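
Illustrative sketch. The description above outlines a pipeline in which pre-selected boxes of different scales and aspect ratios are laid over each feature map and the resulting detections are pruned by non-maximum suppression. The Python below is a minimal, generic sketch of those two steps in the style of SSD; it is not the authors' implementation. The function names (default_boxes, iou, nms), the box coordinate conventions, and the 0.5 IoU threshold are assumptions made for illustration only.

```python
from itertools import product
from math import sqrt


def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) boxes in [0, 1] image coordinates.

    One box per aspect ratio is centred on every cell of a square
    feature map with fmap_size x fmap_size cells (assumed layout).
    """
    boxes = []
    for row, col in product(range(fmap_size), repeat=2):
        cx = (col + 0.5) / fmap_size
        cy = (row + 0.5) / fmap_size
        for ratio in aspect_ratios:
            # ratio is width / height; the box area stays scale ** 2.
            boxes.append((cx, cy, scale * sqrt(ratio), scale / sqrt(ratio)))
    return boxes


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    Visits boxes from highest to lowest score and keeps a box only if
    it does not overlap an already kept box by more than iou_threshold.
    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, default_boxes(2, 0.5, [1.0, 2.0]) yields eight boxes (two per cell of a 2 x 2 map), and calling nms on two heavily overlapping boxes plus one distant box keeps the higher-scoring box of the overlapping pair together with the distant one. A multi-scale detector would call default_boxes once per feature map with a different scale for each map.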