AAF-Net: Scene text detection based on attention aggregation features

With the advent of the era of artificial intelligence, text detection is widely used in the real world. In text detection, due to the limitation of the receptive field of the neural network, most existing scene text detection methods cannot accurately detect small target text instances in any direct...


Bibliographic Details
Main Authors: Chen, Mengmeng; Ibrayim, Mayire; Hamdulla, Askar
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9355182/
https://www.ncbi.nlm.nih.gov/pubmed/35930564
http://dx.doi.org/10.1371/journal.pone.0272322
_version_ 1784763235379970048
author Chen, Mengmeng
Ibrayim, Mayire
Hamdulla, Askar
author_facet Chen, Mengmeng
Ibrayim, Mayire
Hamdulla, Askar
author_sort Chen, Mengmeng
collection PubMed
description With the advent of the era of artificial intelligence, text detection is widely used in the real world. In text detection, due to the limitation of the receptive field of the neural network, most existing scene text detection methods cannot accurately detect small target text instances in any direction, and the detection rate of mutually adhering text instances is low, which is prone to false detection. To tackle such difficulties, in this paper, we propose a new feature pyramid network for scene text detection, Cross-Scale Attention Aggregation Feature Pyramid Network (CSAA-FPN). Specifically, we use an Attention Aggregation Feature Module (AAFM) to enhance features, which not only solves the problem of weak features and small receptive fields extracted by lightweight networks but also better handles multi-scale information and accurately separates adjacent text instances. An attention module CBAM is introduced to focus on effective information so that the output feature layer has richer and more accurate information. Furthermore, we design an Adaptive Fusion Module (AFM), which weights the output features and pays attention to the pixel information to further refine the features. Experiments conducted on CTW1500, Total-Text, ICDAR2015, and MSRA-TD500 have demonstrated the superiority of this model.
format Online
Article
Text
id pubmed-9355182
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9355182 2022-08-06 AAF-Net: Scene text detection based on attention aggregation features Chen, Mengmeng Ibrayim, Mayire Hamdulla, Askar PLoS One Research Article With the advent of the era of artificial intelligence, text detection is widely used in the real world. In text detection, due to the limitation of the receptive field of the neural network, most existing scene text detection methods cannot accurately detect small target text instances in any direction, and the detection rate of mutually adhering text instances is low, which is prone to false detection. To tackle such difficulties, in this paper, we propose a new feature pyramid network for scene text detection, Cross-Scale Attention Aggregation Feature Pyramid Network (CSAA-FPN). Specifically, we use an Attention Aggregation Feature Module (AAFM) to enhance features, which not only solves the problem of weak features and small receptive fields extracted by lightweight networks but also better handles multi-scale information and accurately separates adjacent text instances. An attention module CBAM is introduced to focus on effective information so that the output feature layer has richer and more accurate information. Furthermore, we design an Adaptive Fusion Module (AFM), which weights the output features and pays attention to the pixel information to further refine the features. Experiments conducted on CTW1500, Total-Text, ICDAR2015, and MSRA-TD500 have demonstrated the superiority of this model. Public Library of Science 2022-08-05 /pmc/articles/PMC9355182/ /pubmed/35930564 http://dx.doi.org/10.1371/journal.pone.0272322 Text en © 2022 Chen et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Chen, Mengmeng
Ibrayim, Mayire
Hamdulla, Askar
AAF-Net: Scene text detection based on attention aggregation features
title AAF-Net: Scene text detection based on attention aggregation features
title_full AAF-Net: Scene text detection based on attention aggregation features
title_fullStr AAF-Net: Scene text detection based on attention aggregation features
title_full_unstemmed AAF-Net: Scene text detection based on attention aggregation features
title_short AAF-Net: Scene text detection based on attention aggregation features
title_sort aaf-net: scene text detection based on attention aggregation features
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9355182/
https://www.ncbi.nlm.nih.gov/pubmed/35930564
http://dx.doi.org/10.1371/journal.pone.0272322
work_keys_str_mv AT chenmengmeng aafnetscenetextdetectionbasedonattentionaggregationfeatures
AT ibrayimmayire aafnetscenetextdetectionbasedonattentionaggregationfeatures
AT hamdullaaskar aafnetscenetextdetectionbasedonattentionaggregationfeatures