Attention-Based Scene Text Detection on Dual Feature Fusion
The segmentation-based scene text detection algorithm has advantages in scene text detection scenarios with arbitrary shape and extreme aspect ratio, depending on its pixel-level description and fine post-processing. However, the insufficient use of semantic and spatial information in the network li...
Main authors: | Li, Yuze, Silamu, Wushour, Wang, Zhenchao, Xu, Miaomiao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9739706/ https://www.ncbi.nlm.nih.gov/pubmed/36501774 http://dx.doi.org/10.3390/s22239072 |
_version_ | 1784847874789474304 |
---|---|
author | Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao |
author_facet | Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao |
author_sort | Li, Yuze |
collection | PubMed |
description | The segmentation-based scene text detection algorithm has advantages in scene text detection scenarios with arbitrary shape and extreme aspect ratio, depending on its pixel-level description and fine post-processing. However, the insufficient use of semantic and spatial information in the network limits the classification and positioning capabilities of the network. Existing scene text detection methods have the problem of losing important feature information in the process of extracting features from each network layer. To solve this problem, the Attention-based Dual Feature Fusion Model (ADFM) is proposed. The Bi-directional Feature Fusion Pyramid Module (BFM) first adds stronger semantic information to the higher-resolution feature maps through a top-down process and then reduces the aliasing effects generated by the previous process through a bottom-up process to enhance the representation of multi-scale text semantic information. Meanwhile, a position-sensitive Spatial Attention Module (SAM) is introduced in the intermediate process of two-stage feature fusion. It focuses on the one feature map with the highest resolution and strongest semantic features generated in the top-down process and weighs the spatial position weight by the relevance of text features, thus improving the sensitivity of the text detection network to text regions. The effectiveness of each module of ADFM was verified by ablation experiments and the model was compared with recent scene text detection methods on several publicly available datasets. |
format | Online Article Text |
id | pubmed-9739706 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-97397062022-12-11 Attention-Based Scene Text Detection on Dual Feature Fusion Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao Sensors (Basel) Article The segmentation-based scene text detection algorithm has advantages in scene text detection scenarios with arbitrary shape and extreme aspect ratio, depending on its pixel-level description and fine post-processing. However, the insufficient use of semantic and spatial information in the network limits the classification and positioning capabilities of the network. Existing scene text detection methods have the problem of losing important feature information in the process of extracting features from each network layer. To solve this problem, the Attention-based Dual Feature Fusion Model (ADFM) is proposed. The Bi-directional Feature Fusion Pyramid Module (BFM) first adds stronger semantic information to the higher-resolution feature maps through a top-down process and then reduces the aliasing effects generated by the previous process through a bottom-up process to enhance the representation of multi-scale text semantic information. Meanwhile, a position-sensitive Spatial Attention Module (SAM) is introduced in the intermediate process of two-stage feature fusion. It focuses on the one feature map with the highest resolution and strongest semantic features generated in the top-down process and weighs the spatial position weight by the relevance of text features, thus improving the sensitivity of the text detection network to text regions. The effectiveness of each module of ADFM was verified by ablation experiments and the model was compared with recent scene text detection methods on several publicly available datasets. MDPI 2022-11-23 /pmc/articles/PMC9739706/ /pubmed/36501774 http://dx.doi.org/10.3390/s22239072 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao Attention-Based Scene Text Detection on Dual Feature Fusion |
title | Attention-Based Scene Text Detection on Dual Feature Fusion |
title_full | Attention-Based Scene Text Detection on Dual Feature Fusion |
title_fullStr | Attention-Based Scene Text Detection on Dual Feature Fusion |
title_full_unstemmed | Attention-Based Scene Text Detection on Dual Feature Fusion |
title_short | Attention-Based Scene Text Detection on Dual Feature Fusion |
title_sort | attention-based scene text detection on dual feature fusion |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9739706/ https://www.ncbi.nlm.nih.gov/pubmed/36501774 http://dx.doi.org/10.3390/s22239072 |
work_keys_str_mv | AT liyuze attentionbasedscenetextdetectionondualfeaturefusion AT silamuwushour attentionbasedscenetextdetectionondualfeaturefusion AT wangzhenchao attentionbasedscenetextdetectionondualfeaturefusion AT xumiaomiao attentionbasedscenetextdetectionondualfeaturefusion |
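
The description field above outlines a two-stage fusion: a top-down pass, spatial attention on the highest-resolution map, then a bottom-up pass. The following is only a rough NumPy sketch of that data flow, not the authors' implementation. The record does not specify the layers, so everything here is an assumption made for illustration: the function names (`upsample2x`, `downsample2x`, `spatial_attention`, `dual_fusion`), the use of element-wise addition for fusion, nearest-neighbour upsampling, average pooling, and an attention weight computed from channel mean and max without any learned convolution.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def downsample2x(x):
    """2x2 average pooling of a (C, H, W) array; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))


def spatial_attention(x):
    """Scale every spatial position by a weight in (0, 1).

    Assumption: the weight is a sigmoid of channel-wise mean plus max.
    A trained model would normally learn this mapping instead.
    """
    score = x.mean(axis=0) + x.max(axis=0)   # shape (H, W)
    weight = 1.0 / (1.0 + np.exp(-score))    # sigmoid
    return x * weight                        # broadcast over channels


def dual_fusion(features):
    """Fuse a pyramid ordered from highest to lowest resolution.

    Each level is a (C, H, W) array whose H and W halve at every step.
    """
    n = len(features)

    # Stage 1 (top-down): push coarse, semantically strong maps
    # into the finer levels.
    top_down = [None] * n
    top_down[-1] = features[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = features[i] + upsample2x(top_down[i + 1])

    # Intermediate step: attention on the highest-resolution map only.
    top_down[0] = spatial_attention(top_down[0])

    # Stage 2 (bottom-up): propagate the refined fine map back
    # towards the coarser levels.
    fused = [top_down[0]]
    for i in range(1, n):
        fused.append(top_down[i] + downsample2x(fused[i - 1]))
    return fused


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pyramid = [rng.standard_normal((4, 16 // 2**k, 16 // 2**k)) for k in range(3)]
    for level in dual_fusion(pyramid):
        print(level.shape)
```

Each output level keeps the shape of the corresponding input level, so a segmentation head could consume the fused pyramid in place of the original one.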