A new multi-scale backbone network for object detection based on asymmetric convolutions
Real-time object detection on mobile platforms is a crucial but challenging computer vision task. It is widely recognized that although lightweight object detectors have a high detection speed, their detection accuracy is relatively low. In order to improve detection accuracy, it is benef...
Main authors: | Ma, Xianghua; Yang, Zhenkun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10454949/ https://www.ncbi.nlm.nih.gov/pubmed/33881962 http://dx.doi.org/10.1177/00368504211011343 |
_version_ | 1785096331273961472 |
---|---|
author | Ma, Xianghua Yang, Zhenkun |
author_facet | Ma, Xianghua Yang, Zhenkun |
author_sort | Ma, Xianghua |
collection | PubMed |
description | Real-time object detection on mobile platforms is a crucial but challenging computer vision task. It is widely recognized that although lightweight object detectors have a high detection speed, their detection accuracy is relatively low. To improve detection accuracy, it is beneficial to extract complete multi-scale image features in visual cognitive tasks. Asymmetric convolutions have a useful property: their kernels have different aspect ratios, which can be used to extract image features of objects, especially objects with multi-scale characteristics. In this paper, we exploit three different asymmetric convolutions in parallel and propose a new multi-scale asymmetric convolution unit, the MAC block, to enhance the multi-scale representation ability of CNNs. In addition, the MAC block can adaptively merge features of different scales by allocating learnable weight parameters to the three asymmetric convolution branches. The proposed MAC blocks can be inserted into a state-of-the-art backbone such as ResNet-50 to form a new multi-scale backbone network for object detectors. To evaluate the performance of the MAC block, we conduct experiments on the CIFAR-100, PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO 2014 datasets. Experimental results show that the detection precision can be greatly improved while a fast detection speed is maintained. |
format | Online Article Text |
id | pubmed-10454949 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-10454949 2023-08-26 A new multi-scale backbone network for object detection based on asymmetric convolutions Ma, Xianghua Yang, Zhenkun Sci Prog Article SAGE Publications 2021-04-21 /pmc/articles/PMC10454949/ /pubmed/33881962 http://dx.doi.org/10.1177/00368504211011343 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Article Ma, Xianghua Yang, Zhenkun A new multi-scale backbone network for object detection based on asymmetric convolutions |
title | A new multi-scale backbone network for object detection based on asymmetric convolutions |
title_full | A new multi-scale backbone network for object detection based on asymmetric convolutions |
title_fullStr | A new multi-scale backbone network for object detection based on asymmetric convolutions |
title_full_unstemmed | A new multi-scale backbone network for object detection based on asymmetric convolutions |
title_short | A new multi-scale backbone network for object detection based on asymmetric convolutions |
title_sort | new multi-scale backbone network for object detection based on asymmetric convolutions |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10454949/ https://www.ncbi.nlm.nih.gov/pubmed/33881962 http://dx.doi.org/10.1177/00368504211011343 |
work_keys_str_mv | AT maxianghua anewmultiscalebackbonenetworkforobjectdetectionbasedonasymmetricconvolutions AT yangzhenkun anewmultiscalebackbonenetworkforobjectdetectionbasedonasymmetricconvolutions AT maxianghua newmultiscalebackbonenetworkforobjectdetectionbasedonasymmetricconvolutions AT yangzhenkun newmultiscalebackbonenetworkforobjectdetectionbasedonasymmetricconvolutions |
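
The description field above outlines the core idea of the MAC block: three asymmetric convolutions run in parallel, and their outputs are merged using learnable per-branch weights. The following is a minimal single-channel sketch of that idea in plain Python, not the authors' implementation. The kernel shapes (1×3, 3×1, 1×5), the softmax normalisation of the branch weights, and all function names (`conv2d_same`, `softmax`, `mac_block_sketch`) are assumptions chosen for illustration; the record does not state them.

```python
import math


def conv2d_same(image, kernel):
    """Zero-padded 'same' 2-D cross-correlation on a list-of-lists image.

    Assumes odd kernel height and width so the output keeps the input size.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:  # outside pixels count as zero
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def softmax(logits):
    """Numerically stable softmax; turns branch logits into weights summing to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mac_block_sketch(image, kernels, branch_logits):
    """Run each asymmetric kernel in parallel and fuse the results.

    ASSUMPTION: fusion is a softmax-weighted sum of the branch outputs. In a
    trained network `branch_logits` would be learnable parameters; here they
    are plain numbers supplied by the caller.
    """
    weights = softmax(branch_logits)
    branches = [conv2d_same(image, k) for k in kernels]
    h, w = len(image), len(image[0])
    return [
        [sum(wt * br[i][j] for wt, br in zip(weights, branches)) for j in range(w)]
        for i in range(h)
    ]


# Hypothetical averaging kernels with different aspect ratios.
KERNELS = [
    [[1 / 3, 1 / 3, 1 / 3]],        # 1x3: horizontal context
    [[1 / 3], [1 / 3], [1 / 3]],    # 3x1: vertical context
    [[0.2, 0.2, 0.2, 0.2, 0.2]],    # 1x5: wider horizontal context
]

image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
fused = mac_block_sketch(image, KERNELS, [0.0, 0.0, 0.0])  # equal branch weights
```

With equal logits every branch contributes one third of the output; raising one logit shifts the fused feature map toward that branch's aspect ratio, which is the adaptive merging the description refers to.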