
A Robust Visual Tracking Method Based on Reconstruction Patch Transformer Tracking


Bibliographic Details
Main authors:	Chen, Hui; Wang, Zhenhai; Tian, Hongyu; Yuan, Lutao; Wang, Xing; Leng, Peng
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460596/ https://www.ncbi.nlm.nih.gov/pubmed/36081017 http://dx.doi.org/10.3390/s22176558

_version_	1784786785911439360
author	Chen, Hui; Wang, Zhenhai; Tian, Hongyu; Yuan, Lutao; Wang, Xing; Leng, Peng
author_facet	Chen, Hui; Wang, Zhenhai; Tian, Hongyu; Yuan, Lutao; Wang, Xing; Leng, Peng
author_sort	Chen, Hui
collection	PubMed
description	Recently, transformer models have spread from visual classification to target tracking. Their primary use is to replace the cross-correlation operation in Siamese trackers, while the backbone of the network remains a convolutional neural network (CNN). However, existing transformer-based trackers simply reshape the features extracted by the CNN into patches and feed them into the transformer encoder. Each patch contains a single element of the spatial dimension of the extracted features and is input to the transformer, which uses cross-attention instead of cross-correlation. This paper proposes a reconstruction patch strategy that combines multiple elements of the spatial dimension of the extracted features into a new patch. The reconstruction operation has the following advantages: (1) the correlation between adjacent elements is well combined, and the features extracted by the CNN are usable for classification and regression; (2) using the performer operation reduces the amount of network computation and the dimension of the patch sent to the transformer, thereby sharply reducing the number of network parameters and improving tracking speed.
format	Online Article Text
id	pubmed-9460596
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9460596 2022-09-10 Sensors (Basel) Article MDPI 2022-08-31 /pmc/articles/PMC9460596/ /pubmed/36081017 http://dx.doi.org/10.3390/s22176558 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	A Robust Visual Tracking Method Based on Reconstruction Patch Transformer Tracking
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460596/ https://www.ncbi.nlm.nih.gov/pubmed/36081017 http://dx.doi.org/10.3390/s22176558
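
The difference between single-element patches and reconstructed patches described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the non-overlapping k x k grouping, and the toy feature-map size are all assumptions made for illustration, and the performer step that the abstract credits with reducing patch dimension is not shown.

```python
import numpy as np


def single_element_patches(feat):
    """Baseline: every spatial location becomes one token of length C."""
    c, h, w = feat.shape
    return feat.reshape(c, h * w).T  # shape (H*W, C)


def reconstructed_patches(feat, k=2):
    """Assumed variant: merge each non-overlapping k x k neighbourhood
    into one token of length C*k*k (hypothetical grouping scheme)."""
    c, h, w = feat.shape
    assert h % k == 0 and w % k == 0, "H and W must be divisible by k"
    blocks = feat.reshape(c, h // k, k, w // k, k)
    blocks = blocks.transpose(1, 3, 0, 2, 4)  # (H/k, W/k, C, k, k)
    return blocks.reshape((h // k) * (w // k), c * k * k)


# Toy CNN feature map: 4 channels on a 6 x 6 spatial grid.
feat = np.arange(4 * 6 * 6, dtype=np.float32).reshape(4, 6, 6)
base = single_element_patches(feat)
recon = reconstructed_patches(feat, k=2)
print(base.shape, recon.shape)  # (36, 4) (9, 16)
```

In this sketch the token count drops from H*W to (H/k)*(W/k), so the number of token pairs an attention layer compares shrinks by a factor of k^4, while each token now carries its neighbouring elements together. The token length grows to C*k*k here; any reduction of that dimension would come from a further step that this sketch does not model.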


Similar Items