
Object detection based on an adaptive attention mechanism

Object detection is an important component of computer vision. Most of the recent successful object detection methods are based on convolutional neural networks (CNNs). To improve the performance of these networks, researchers have designed many different architectures. They found that the CNN perfo...


Bibliographic Details
Main authors:	Li, Wei; Liu, Kai; Zhang, Lizhe; Cheng, Fei
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7347846/ https://www.ncbi.nlm.nih.gov/pubmed/32647299 http://dx.doi.org/10.1038/s41598-020-67529-x

_version_	1783556668912566272
author	Li, Wei Liu, Kai Zhang, Lizhe Cheng, Fei
author_facet	Li, Wei Liu, Kai Zhang, Lizhe Cheng, Fei
author_sort	Li, Wei
collection	PubMed
description	Object detection is an important component of computer vision. Most of the recent successful object detection methods are based on convolutional neural networks (CNNs). To improve the performance of these networks, researchers have designed many different architectures. They found that the CNN performance benefits from carefully increasing the depth and width of their structures with respect to the spatial dimension. Some researchers have exploited the cardinality dimension. Others have found that skip and dense connections also benefit performance. Recently, attention mechanisms on the channel dimension have gained popularity with researchers. Global average pooling is used in SENet to generate the input feature vector of the channel-wise attention unit. In this work, we argue that channel-wise attention can benefit from both global average pooling and global max pooling. We designed three novel attention units, namely, an adaptive channel-wise attention unit, an adaptive spatial-wise attention unit and an adaptive domain attention unit, to improve the performance of a CNN. Instead of concatenating the two attention vectors generated by the two channel-wise attention sub-units, we weight the two attention vectors based on the output data of the two sub-units. We integrated the proposed mechanism with the YOLOv3 and MobileNetv2 frameworks and tested the proposed network on the KITTI and Pascal VOC datasets. The experimental results show that YOLOv3 with the proposed attention mechanism outperforms the original YOLOv3 by 2.9% and 1.2% mAP on the KITTI and Pascal VOC datasets, respectively. MobileNetv2 with the proposed attention mechanism outperforms the original MobileNetv2 by 1.7% mAP on the Pascal VOC dataset.
format	Online Article Text
id	pubmed-7347846
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-7347846 2020-07-10 Object detection based on an adaptive attention mechanism Li, Wei Liu, Kai Zhang, Lizhe Cheng, Fei Sci Rep Article
Nature Publishing Group UK 2020-07-09 /pmc/articles/PMC7347846/ /pubmed/32647299 http://dx.doi.org/10.1038/s41598-020-67529-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle	Article Li, Wei Liu, Kai Zhang, Lizhe Cheng, Fei Object detection based on an adaptive attention mechanism
title	Object detection based on an adaptive attention mechanism
title_full	Object detection based on an adaptive attention mechanism
title_fullStr	Object detection based on an adaptive attention mechanism
title_full_unstemmed	Object detection based on an adaptive attention mechanism
title_short	Object detection based on an adaptive attention mechanism
title_sort	object detection based on an adaptive attention mechanism
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7347846/ https://www.ncbi.nlm.nih.gov/pubmed/32647299 http://dx.doi.org/10.1038/s41598-020-67529-x
work_keys_str_mv	AT liwei objectdetectionbasedonanadaptiveattentionmechanism AT liukai objectdetectionbasedonanadaptiveattentionmechanism AT zhanglizhe objectdetectionbasedonanadaptiveattentionmechanism AT chengfei objectdetectionbasedonanadaptiveattentionmechanism
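
The description field above says that the channel-wise attention unit uses both global average pooling and global max pooling, and that the two resulting attention vectors are weighted rather than concatenated. The record does not give the weighting formula, so the following is only a minimal sketch under stated assumptions: each sub-unit is an SE-style block (FC, ReLU, FC, sigmoid), and the blend weight `alpha` is a softmax over the mean activation of each sub-unit's output. The function names, the `params` layout, and the softmax rule are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def global_pool(fmap):
    """Return per-channel (average, max) descriptors of a C x H x W nested list."""
    avg, mx = [], []
    for channel in fmap:
        values = [v for row in channel for v in row]
        avg.append(sum(values) / len(values))
        mx.append(max(values))
    return avg, mx


def dense(vec, weights):
    """Multiply a vector by a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def se_unit(desc, w1, w2):
    """SE-style sub-unit: FC -> ReLU -> FC -> sigmoid."""
    hidden = [max(0.0, h) for h in dense(desc, w1)]
    return [sigmoid(o) for o in dense(hidden, w2)]


def adaptive_channel_attention(fmap, params):
    """Blend avg-pool and max-pool attention vectors, then rescale channels."""
    avg, mx = global_pool(fmap)
    a_avg = se_unit(avg, params["w1_avg"], params["w2_avg"])
    a_max = se_unit(mx, params["w1_max"], params["w2_max"])

    # ASSUMPTION (not from the source): the blend weight is a softmax over
    # the mean activation of each sub-unit's output vector.
    s_avg = sum(a_avg) / len(a_avg)
    s_max = sum(a_max) / len(a_max)
    e_avg, e_max = math.exp(s_avg), math.exp(s_max)
    alpha = e_avg / (e_avg + e_max)

    attn = [alpha * p + (1.0 - alpha) * q for p, q in zip(a_avg, a_max)]
    scaled = [
        [[attn[c] * v for v in row] for row in channel]
        for c, channel in enumerate(fmap)
    ]
    return scaled, attn, alpha


def random_params(channels, reduced, seed=0):
    """Hypothetical helper: random weights standing in for learned ones."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1_avg": mat(reduced, channels), "w2_avg": mat(channels, reduced),
        "w1_max": mat(reduced, channels), "w2_max": mat(channels, reduced),
    }


# Usage: a toy 4-channel, 3 x 3 feature map.
fmap = [[[float(c + i + j) for j in range(3)] for i in range(3)] for c in range(4)]
scaled, attn, alpha = adaptive_channel_attention(fmap, random_params(4, 2))
print("blend weight:", alpha)
print("channel attention:", attn)
```

Because the blend is a convex combination, each entry of `attn` stays between the corresponding entries of the two sub-unit outputs. In a trained network the weights would be learned rather than random, and the pooling would run on framework tensors rather than nested lists.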
