Attention-Based Scene Text Detection on Dual Feature Fusion

The segmentation-based scene text detection algorithm has advantages in scene text detection scenarios with arbitrary shape and extreme aspect ratio, depending on its pixel-level description and fine post-processing. However, the insufficient use of semantic and spatial information in the network li...

Bibliographic Details
Main authors: Li, Yuze; Silamu, Wushour; Wang, Zhenchao; Xu, Miaomiao
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9739706/
https://www.ncbi.nlm.nih.gov/pubmed/36501774
http://dx.doi.org/10.3390/s22239072
author Li, Yuze
Silamu, Wushour
Wang, Zhenchao
Xu, Miaomiao
collection PubMed
description Segmentation-based scene text detection algorithms handle text of arbitrary shape and extreme aspect ratio well, thanks to their pixel-level description and fine post-processing. However, insufficient use of semantic and spatial information limits the network's classification and localization capabilities, and existing scene text detection methods lose important feature information while extracting features from each network layer. To address this, the Attention-based Dual Feature Fusion Model (ADFM) is proposed. Its Bi-directional Feature Fusion Pyramid Module (BFM) first adds stronger semantic information to the higher-resolution feature maps through a top-down process, then reduces the aliasing effects introduced by that process through a bottom-up process, enhancing the representation of multi-scale text semantic information. Meanwhile, a position-sensitive Spatial Attention Module (SAM) is introduced between the two fusion stages. It focuses on the single feature map with the highest resolution and strongest semantic features produced by the top-down process and weights each spatial position by its relevance to text features, improving the network's sensitivity to text regions. The effectiveness of each ADFM module was verified by ablation experiments, and the model was compared with recent scene text detection methods on several publicly available datasets.
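The data flow in the description (a top-down pass, spatial attention on the highest-resolution map, then a bottom-up pass) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbour upsampling, 2x2 average pooling, element-wise addition as the fusion operator, and a self-gated sigmoid in place of the learned attention weights. All function names (`upsample2x`, `downsample2x`, `spatial_attention`, `dual_fusion`) are hypothetical.

```python
import math


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fmap):
    """2x2 average pooling; assumes even height and width."""
    return [
        [(fmap[r][c] + fmap[r][c + 1] + fmap[r + 1][c] + fmap[r + 1][c + 1]) / 4.0
         for c in range(0, len(fmap[0]), 2)]
        for r in range(0, len(fmap), 2)
    ]


def add(a, b):
    """Element-wise sum of two maps of equal size (assumed fusion operator)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def spatial_attention(fmap):
    """Scale each position by a sigmoid of its own activation.

    Assumption: a stand-in for learned attention weights. The identity
    sigmoid(v) = 0.5 * (1 + tanh(v / 2)) avoids overflow for large |v|.
    """
    return [[v * 0.5 * (1.0 + math.tanh(v / 2.0)) for v in row] for row in fmap]


def dual_fusion(levels):
    """Fuse backbone maps ordered from highest to lowest resolution.

    Each level is assumed to be half the height and width of the previous one.
    """
    # Top-down pass: push coarse, semantically strong maps into finer ones.
    top_down = [None] * len(levels)
    top_down[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        top_down[i] = add(levels[i], upsample2x(top_down[i + 1]))

    # Spatial attention on the highest-resolution top-down map only.
    fused = [spatial_attention(top_down[0])]

    # Bottom-up pass: carry the attended fine map back to coarser levels.
    for i in range(1, len(levels)):
        fused.append(add(top_down[i], downsample2x(fused[-1])))
    return fused


# Toy pyramid: three constant maps of size 8x8, 4x4 and 2x2.
levels = [[[1.0] * n for _ in range(n)] for n in (8, 4, 2)]
fused = dual_fusion(levels)
print([(len(m), len(m[0])) for m in fused])
```

Each output map keeps the resolution of its input level, so a segmentation head could consume the fused pyramid in place of the raw backbone features. In a real network each addition would typically be followed by a learned convolution.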
format Online
Article
Text
id pubmed-9739706
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9739706 2022-12-11 Attention-Based Scene Text Detection on Dual Feature Fusion Li, Yuze; Silamu, Wushour; Wang, Zhenchao; Xu, Miaomiao Sensors (Basel) Article MDPI 2022-11-23 /pmc/articles/PMC9739706/ /pubmed/36501774 http://dx.doi.org/10.3390/s22239072 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Attention-Based Scene Text Detection on Dual Feature Fusion
topic Article