
Thermal Infrared Tracking Method Based on Efficient Global Information Perception

To solve the insufficient ability of the current Thermal InfraRed (TIR) tracking methods to resist occlusion and interference from similar targets, we propose a TIR tracking method based on efficient global information perception. In order to efficiently obtain the global semantic information of ima...


Bibliographic Details
Main authors: Zhao, Long, Liu, Xiaoye, Ren, Honge, Xue, Lingjixuan
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9570693/
https://www.ncbi.nlm.nih.gov/pubmed/36236505
http://dx.doi.org/10.3390/s22197408
_version_ 1784810173669310464
author Zhao, Long
Liu, Xiaoye
Ren, Honge
Xue, Lingjixuan
author_facet Zhao, Long
Liu, Xiaoye
Ren, Honge
Xue, Lingjixuan
author_sort Zhao, Long
collection PubMed
description To solve the insufficient ability of the current Thermal InfraRed (TIR) tracking methods to resist occlusion and interference from similar targets, we propose a TIR tracking method based on efficient global information perception. In order to efficiently obtain the global semantic information of images, we use the Transformer structure for feature extraction and fusion. In the feature extraction process, the Focal Transformer structure is used to improve the efficiency of remote information modeling, which is highly similar to the human attention mechanism. The feature fusion process supplements the relative position encoding to the standard Transformer structure, which allows the model to continuously consider the influence of positional relationships during the learning process. It can also generalize to capture the different positional information for different input sequences. Thus, it makes the Transformer structure model the semantic information contained in images more efficiently. To further improve the tracking accuracy and robustness, the heterogeneous bi-prediction head is utilized in the object prediction process. The fully connected sub-network is responsible for the classification prediction of the foreground or background. The convolutional sub-network is responsible for the regression prediction of the object bounding box. In order to alleviate the contradiction between the vast demand for training data of the Transformer model and the insufficient scale of the TIR tracking dataset, the LaSOT-TIR dataset is generated with the generative adversarial network for network training. Our method achieves the best performance compared with other state-of-the-art trackers on the VOT2015-TIR, VOT2017-TIR, PTB-TIR and LSOTB-TIR datasets, and performs outstandingly especially when dealing with severe occlusion or interference from similar objects.
format Online
Article
Text
id pubmed-9570693
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9570693 2022-10-17 Thermal Infrared Tracking Method Based on Efficient Global Information Perception Zhao, Long Liu, Xiaoye Ren, Honge Xue, Lingjixuan Sensors (Basel) Article MDPI 2022-09-29 /pmc/articles/PMC9570693/ /pubmed/36236505 http://dx.doi.org/10.3390/s22197408 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhao, Long
Liu, Xiaoye
Ren, Honge
Xue, Lingjixuan
Thermal Infrared Tracking Method Based on Efficient Global Information Perception
title Thermal Infrared Tracking Method Based on Efficient Global Information Perception
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9570693/
https://www.ncbi.nlm.nih.gov/pubmed/36236505
http://dx.doi.org/10.3390/s22197408
work_keys_str_mv AT zhaolong thermalinfraredtrackingmethodbasedonefficientglobalinformationperception
AT liuxiaoye thermalinfraredtrackingmethodbasedonefficientglobalinformationperception
AT renhonge thermalinfraredtrackingmethodbasedonefficientglobalinformationperception
AT xuelingjixuan thermalinfraredtrackingmethodbasedonefficientglobalinformationperception
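
The description field of this record says the feature fusion stage adds a relative position encoding to the standard Transformer structure. The record does not give the authors' formulation, so the following is only a generic sketch of one common variant: a learned scalar bias, indexed by the clipped offset between key and query positions, added to the attention scores before the softmax. The function names, the 1-D token layout, the `bias_table` parameter, and the clipping distance are all illustrative assumptions and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_relative_bias(q, k, v, bias_table, max_distance):
    """Single-head scaled dot-product attention with a relative position bias.

    Hypothetical illustration, not the authors' implementation.
    q, k, v: lists of float vectors, one vector per token.
    bias_table: 2 * max_distance + 1 scalars; entry (offset + max_distance)
        is the bias for key position j and query position i with
        offset = j - i, clipped to [-max_distance, max_distance].
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    value_dim = len(v[0])
    outputs = []
    for i, q_i in enumerate(q):
        scores = []
        for j, k_j in enumerate(k):
            offset = max(-max_distance, min(max_distance, j - i))
            content = sum(a * b for a, b in zip(q_i, k_j)) * scale
            # The bias depends only on the relative offset, not on absolute
            # positions, so the same table serves sequences of any length.
            scores.append(content + bias_table[offset + max_distance])
        weights = softmax(scores)
        outputs.append(
            [sum(w * v_j[d] for w, v_j in zip(weights, v)) for d in range(value_dim)]
        )
    return outputs
```

Because the bias is looked up by offset rather than by absolute index, the same table can be reused for input sequences of different lengths, which is one plausible reading of the description's remark about handling different input sequences.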