Attention-Based Scene Text Detection on Dual Feature Fusion

The segmentation-based scene text detection algorithm has advantages in scene text detection scenarios with arbitrary shape and extreme aspect ratio, depending on its pixel-level description and fine post-processing. However, the insufficient use of semantic and spatial information in the network li...

Bibliographic Details
Main Authors:	Li, Yuze; Silamu, Wushour; Wang, Zhenchao; Xu, Miaomiao
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9739706/ https://www.ncbi.nlm.nih.gov/pubmed/36501774 http://dx.doi.org/10.3390/s22239072

_version_	1784847874789474304
author	Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao
author_facet	Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao
author_sort	Li, Yuze
collection	PubMed
description	The segmentation-based scene text detection algorithm has advantages in scene text detection scenarios with arbitrary shape and extreme aspect ratio, depending on its pixel-level description and fine post-processing. However, the insufficient use of semantic and spatial information in the network limits the classification and positioning capabilities of the network. Existing scene text detection methods have the problem of losing important feature information in the process of extracting features from each network layer. To solve this problem, the Attention-based Dual Feature Fusion Model (ADFM) is proposed. The Bi-directional Feature Fusion Pyramid Module (BFM) first adds stronger semantic information to the higher-resolution feature maps through a top-down process and then reduces the aliasing effects generated by the previous process through a bottom-up process to enhance the representation of multi-scale text semantic information. Meanwhile, a position-sensitive Spatial Attention Module (SAM) is introduced in the intermediate process of two-stage feature fusion. It focuses on the one feature map with the highest resolution and strongest semantic features generated in the top-down process and weighs the spatial position weight by the relevance of text features, thus improving the sensitivity of the text detection network to text regions. The effectiveness of each module of ADFM was verified by ablation experiments and the model was compared with recent scene text detection methods on several publicly available datasets.
format	Online Article Text
id	pubmed-9739706
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9739706 2022-12-11 Attention-Based Scene Text Detection on Dual Feature Fusion Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao Sensors (Basel) Article MDPI 2022-11-23 /pmc/articles/PMC9739706/ /pubmed/36501774 http://dx.doi.org/10.3390/s22239072 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Li, Yuze Silamu, Wushour Wang, Zhenchao Xu, Miaomiao Attention-Based Scene Text Detection on Dual Feature Fusion
title	Attention-Based Scene Text Detection on Dual Feature Fusion
title_full	Attention-Based Scene Text Detection on Dual Feature Fusion
title_fullStr	Attention-Based Scene Text Detection on Dual Feature Fusion
title_full_unstemmed	Attention-Based Scene Text Detection on Dual Feature Fusion
title_short	Attention-Based Scene Text Detection on Dual Feature Fusion
title_sort	attention-based scene text detection on dual feature fusion
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9739706/ https://www.ncbi.nlm.nih.gov/pubmed/36501774 http://dx.doi.org/10.3390/s22239072
work_keys_str_mv	AT liyuze attentionbasedscenetextdetectionondualfeaturefusion AT silamuwushour attentionbasedscenetextdetectionondualfeaturefusion AT wangzhenchao attentionbasedscenetextdetectionondualfeaturefusion AT xumiaomiao attentionbasedscenetextdetectionondualfeaturefusion
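
Illustrative sketch (not part of the catalogued record): the abstract describes a top-down fusion pass, a spatial attention step applied to the highest-resolution map, and a bottom-up pass. The pure-Python toy below mirrors only that ordering. Everything in it is an assumption made for illustration: the function names (`upsample2x`, `downsample2x`, `spatial_attention`, `dual_fusion`), element-wise addition as the fusion operator, 2x2 average pooling, and a parameter-free sigmoid of the channel mean plus channel max as the attention weight. The published ADFM presumably uses learned convolutional layers, which the record does not specify.

```python
import math

# A feature map is a nested list indexed [channel][row][col].
# Assumption: every pyramid level has half the height and width of the one before it.

def upsample2x(fmap):
    """Nearest-neighbour upsampling: each value becomes a 2x2 block."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out

def downsample2x(fmap):
    """2x2 average pooling with stride 2 (height and width must be even)."""
    out = []
    for ch in fmap:
        h, w = len(ch), len(ch[0])
        out.append([
            [(ch[r][c] + ch[r][c + 1] + ch[r + 1][c] + ch[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)
        ])
    return out

def add(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def spatial_attention(fmap):
    """Scale each position by sigmoid(channel mean + channel max).

    Hypothetical, parameter-free stand-in for a learned attention module.
    """
    n, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    weights = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            vals = [fmap[k][r][c] for k in range(n)]
            score = sum(vals) / n + max(vals)
            weights[r][c] = 1.0 / (1.0 + math.exp(-score))
    weighted = [[[fmap[k][r][c] * weights[r][c] for c in range(w)]
                 for r in range(h)] for k in range(n)]
    return weighted, weights

def dual_fusion(pyramid):
    """pyramid: feature maps ordered from finest to coarsest resolution."""
    # Top-down pass: push coarse (more semantic) features into finer maps.
    top_down = [None] * len(pyramid)
    top_down[-1] = pyramid[-1]
    for i in range(len(pyramid) - 2, -1, -1):
        top_down[i] = add(pyramid[i], upsample2x(top_down[i + 1]))
    # Attention on the finest top-down map, between the two passes.
    top_down[0], weights = spatial_attention(top_down[0])
    # Bottom-up pass: carry the reweighted fine features back to coarser maps.
    fused = [top_down[0]]
    for i in range(1, len(pyramid)):
        fused.append(add(top_down[i], downsample2x(fused[i - 1])))
    return fused, weights

# Toy input: a blank 4x4 fine map and a 2x2 coarse map with one active cell.
fine = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
coarse = [[[1.0, 0.0], [0.0, 0.0]], [[2.0, 0.0], [0.0, 0.0]]]
fused, weights = dual_fusion([fine, coarse])
```

In this toy input, the only active coarse cell is upsampled into the top-left 2x2 block of the fine map, so the attention weights there exceed the neutral value of 0.5 that every zero-valued position receives.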
