AAF-Net: Scene text detection based on attention aggregation features
With the advent of the era of artificial intelligence, text detection is widely used in the real world. In text detection, due to the limitation of the receptive field of the neural network, most existing scene text detection methods cannot accurately detect small target text instances in any direct...
Main authors: | Chen, Mengmeng; Ibrayim, Mayire; Hamdulla, Askar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9355182/ https://www.ncbi.nlm.nih.gov/pubmed/35930564 http://dx.doi.org/10.1371/journal.pone.0272322 |
_version_ | 1784763235379970048 |
---|---|
author | Chen, Mengmeng Ibrayim, Mayire Hamdulla, Askar |
author_sort | Chen, Mengmeng |
collection | PubMed |
description | With the advent of the era of artificial intelligence, text detection is widely used in the real world. In text detection, due to the limitation of the receptive field of the neural network, most existing scene text detection methods cannot accurately detect small target text instances in any direction, and the detection rate of mutually adhering text instances is low, which makes these methods prone to false detection. To tackle such difficulties, in this paper, we propose a new feature pyramid network for scene text detection, Cross-Scale Attention Aggregation Feature Pyramid Network (CSAA-FPN). Specifically, we use an Attention Aggregation Feature Module (AAFM) to enhance features, which not only solves the problem of weak features and small receptive fields extracted by lightweight networks but also better handles multi-scale information and accurately separates adjacent text instances. An attention module CBAM is introduced to focus on effective information so that the output feature layer has richer and more accurate information. Furthermore, we design an Adaptive Fusion Module (AFM), which weights the output features and pays attention to the pixel information to further refine the features. Experiments conducted on CTW1500, Total-Text, ICDAR2015, and MSRA-TD500 have demonstrated the superiority of this model. |
format | Online Article Text |
id | pubmed-9355182 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9355182 2022-08-06 AAF-Net: Scene text detection based on attention aggregation features Chen, Mengmeng Ibrayim, Mayire Hamdulla, Askar PLoS One Research Article With the advent of the era of artificial intelligence, text detection is widely used in the real world. In text detection, due to the limitation of the receptive field of the neural network, most existing scene text detection methods cannot accurately detect small target text instances in any direction, and the detection rate of mutually adhering text instances is low, which makes these methods prone to false detection. To tackle such difficulties, in this paper, we propose a new feature pyramid network for scene text detection, Cross-Scale Attention Aggregation Feature Pyramid Network (CSAA-FPN). Specifically, we use an Attention Aggregation Feature Module (AAFM) to enhance features, which not only solves the problem of weak features and small receptive fields extracted by lightweight networks but also better handles multi-scale information and accurately separates adjacent text instances. An attention module CBAM is introduced to focus on effective information so that the output feature layer has richer and more accurate information. Furthermore, we design an Adaptive Fusion Module (AFM), which weights the output features and pays attention to the pixel information to further refine the features. Experiments conducted on CTW1500, Total-Text, ICDAR2015, and MSRA-TD500 have demonstrated the superiority of this model. Public Library of Science 2022-08-05 /pmc/articles/PMC9355182/ /pubmed/35930564 http://dx.doi.org/10.1371/journal.pone.0272322 Text en © 2022 Chen et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | AAF-Net: Scene text detection based on attention aggregation features |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9355182/ https://www.ncbi.nlm.nih.gov/pubmed/35930564 http://dx.doi.org/10.1371/journal.pone.0272322 |