Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification


Bibliographic Details
Main Authors: Pan, Wenjie, Huang, Linhan, Liang, Jianbao, Hong, Lan, Zhu, Jianqing
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181439/
https://www.ncbi.nlm.nih.gov/pubmed/37177410
http://dx.doi.org/10.3390/s23094206
author Pan, Wenjie
Huang, Linhan
Liang, Jianbao
Hong, Lan
Zhu, Jianqing
author_sort Pan, Wenjie
collection PubMed
description Multi-modal (i.e., visible, near-infrared, and thermal-infrared) vehicle re-identification has good potential for searching for vehicles of interest under low illumination. However, because different modalities have different imaging characteristics, proper fusion of multi-modal complementary information is crucial to multi-modal vehicle re-identification. To that end, this paper proposes a progressively hybrid transformer (PHT). The PHT method consists of two components: random hybrid augmentation (RHA) and a feature hybrid mechanism (FHM). For RHA, an image random cropper and a local region hybrider are designed. The image random cropper simultaneously crops multi-modal images at random positions, with random numbers, sizes, and aspect ratios, to generate local regions. The local region hybrider fuses the cropped regions so that the regions of each modality carry local structural characteristics of all modalities, mitigating modal differences at the start of feature learning. For the FHM, a modal-specific controller and a modal information embedding are designed to fuse multi-modal information effectively at the feature level. Experimental results show that the proposed method outperforms the state-of-the-art method by 2.7% mAP on RGBNT100 and by 6.6% mAP on RGBN300, demonstrating that it learns multi-modal complementary information effectively.
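The RHA step described above (cropping regions at shared random positions across modalities, then swapping them so each modality carries structure from the others) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the parameter ranges, and the permutation-based region swap are assumptions chosen to match the textual description.

```python
import numpy as np

def random_hybrid_augmentation(images, num_range=(1, 4), size_range=(0.05, 0.2),
                               aspect_range=(0.5, 2.0), rng=None):
    """Sketch of random hybrid augmentation (RHA).

    images: list of H x W x C arrays, one per modality (e.g. RGB, NIR, TIR),
            all spatially aligned and of the same size.
    Crops a random number of regions at positions shared by all modalities,
    with random sizes and aspect ratios, and shuffles each region across
    the modalities.
    """
    if rng is None:
        rng = np.random.default_rng()
    out = [img.copy() for img in images]
    h, w = images[0].shape[:2]
    n = rng.integers(num_range[0], num_range[1] + 1)   # random number of regions
    for _ in range(n):
        area = rng.uniform(*size_range) * h * w        # random region size
        aspect = rng.uniform(*aspect_range)            # random aspect ratio
        rh = int(min(h, max(1, round(np.sqrt(area * aspect)))))
        rw = int(min(w, max(1, round(np.sqrt(area / aspect)))))
        y = rng.integers(0, h - rh + 1)                # random position, shared
        x = rng.integers(0, w - rw + 1)                # by all modalities
        perm = rng.permutation(len(images))            # hybridize: each modality
        for dst, src in enumerate(perm):               # receives this region from
            out[dst][y:y+rh, x:x+rw] = images[src][y:y+rh, x:x+rw]
    return out
```

Because the crop coordinates are shared across modalities, each swapped region stays spatially aligned; only its modality of origin changes, which is what lets every modality "bring local structural characteristics of all modalities."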
format Online
Article
Text
id pubmed-10181439
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10181439 2023-05-13 Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification Pan, Wenjie; Huang, Linhan; Liang, Jianbao; Hong, Lan; Zhu, Jianqing. Sensors (Basel), Article. MDPI 2023-04-23 /pmc/articles/PMC10181439/ /pubmed/37177410 http://dx.doi.org/10.3390/s23094206 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Pan, Wenjie
Huang, Linhan
Liang, Jianbao
Hong, Lan
Zhu, Jianqing
Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification
title Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification
title_full Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification
title_fullStr Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification
title_full_unstemmed Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification
title_short Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification
title_sort progressively hybrid transformer for multi-modal vehicle re-identification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181439/
https://www.ncbi.nlm.nih.gov/pubmed/37177410
http://dx.doi.org/10.3390/s23094206
work_keys_str_mv AT panwenjie progressivelyhybridtransformerformultimodalvehiclereidentification
AT huanglinhan progressivelyhybridtransformerformultimodalvehiclereidentification
AT liangjianbao progressivelyhybridtransformerformultimodalvehiclereidentification
AT honglan progressivelyhybridtransformerformultimodalvehiclereidentification
AT zhujianqing progressivelyhybridtransformerformultimodalvehiclereidentification