Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification
Multi-modal (i.e., visible, near-infrared, and thermal-infrared) vehicle re-identification has good potential for searching for vehicles of interest in low illumination. However, because different modalities have varying imaging characteristics, proper fusion of multi-modal complementary information is crucial to multi-modal vehicle re-identification. To this end, this paper proposes a progressively hybrid transformer (PHT), consisting of random hybrid augmentation (RHA) and a feature hybrid mechanism (FHM).
Main Authors: | Pan, Wenjie; Huang, Linhan; Liang, Jianbao; Hong, Lan; Zhu, Jianqing |
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181439/ https://www.ncbi.nlm.nih.gov/pubmed/37177410 http://dx.doi.org/10.3390/s23094206 |
_version_ | 1785041574842859520 |
author | Pan, Wenjie; Huang, Linhan; Liang, Jianbao; Hong, Lan; Zhu, Jianqing |
author_facet | Pan, Wenjie; Huang, Linhan; Liang, Jianbao; Hong, Lan; Zhu, Jianqing |
author_sort | Pan, Wenjie |
collection | PubMed |
description | Multi-modal (i.e., visible, near-infrared, and thermal-infrared) vehicle re-identification has good potential for searching for vehicles of interest in low illumination. However, because different modalities have varying imaging characteristics, proper fusion of multi-modal complementary information is crucial to multi-modal vehicle re-identification. To this end, this paper proposes a progressively hybrid transformer (PHT). The PHT method consists of two components: random hybrid augmentation (RHA) and a feature hybrid mechanism (FHM). For RHA, an image random cropper and a local region hybrider are designed. The image random cropper simultaneously crops local regions from the multi-modal images at random positions, in random numbers, and with random sizes and aspect ratios. The local region hybrider fuses the cropped regions so that the regions of each modality carry local structural characteristics of all modalities, mitigating modal differences at the beginning of feature learning. For the FHM, a modal-specific controller and a modal information embedding are designed to effectively fuse multi-modal information at the feature level. Experimental results show that the proposed method outperforms the state-of-the-art method by 2.7% mAP on RGBNT100 and by 6.6% mAP on RGBN300, demonstrating that it learns multi-modal complementary information effectively. |
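As the abstract describes it, RHA crops local regions at identical random positions, counts, sizes, and aspect ratios from every modality, then hybridizes the regions across modalities so each modality carries local structure from the others. The following NumPy sketch illustrates that idea only; the function name, region-count bound, size ranges, and the per-region mixing rule are all illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def random_hybrid_augment(images, max_regions=4, rng=None):
    """Mix random local regions across aligned multi-modal images.

    images: list of arrays (one per modality), all shaped (H, W, C)
            and spatially aligned, e.g. [visible, near-IR, thermal-IR].
    For each sampled region, every modality receives that region's
    pixels from a randomly chosen (possibly different) modality.
    """
    rng = rng or np.random.default_rng()
    h, w = images[0].shape[:2]
    out = [img.copy() for img in images]
    n_regions = int(rng.integers(1, max_regions + 1))  # random number of regions
    for _ in range(n_regions):
        # Random size and aspect ratio for this region (bounds are assumptions).
        rh = int(rng.integers(h // 8, h // 2))
        rw = int(rng.integers(w // 8, w // 2))
        # Random position, identical across all modalities.
        y = int(rng.integers(0, h - rh + 1))
        x = int(rng.integers(0, w - rw + 1))
        # Fill this region in each modality from a random source modality.
        for m in range(len(images)):
            src = int(rng.integers(0, len(images)))
            out[m][y:y + rh, x:x + rw] = images[src][y:y + rh, x:x + rw]
    return out
```

With three aligned 256x128 modality images, this returns three augmented images of the same shape in which some rectangular patches have been swapped between modalities, while untouched pixels are left as-is.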
format | Online Article Text |
id | pubmed-10181439 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10181439 2023-05-13 Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification. Pan, Wenjie; Huang, Linhan; Liang, Jianbao; Hong, Lan; Zhu, Jianqing. Sensors (Basel), Article. MDPI 2023-04-23 /pmc/articles/PMC10181439/ /pubmed/37177410 http://dx.doi.org/10.3390/s23094206 Text en. © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Pan, Wenjie Huang, Linhan Liang, Jianbao Hong, Lan Zhu, Jianqing Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification |
title | Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification |
title_full | Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification |
title_fullStr | Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification |
title_full_unstemmed | Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification |
title_short | Progressively Hybrid Transformer for Multi-Modal Vehicle Re-Identification |
title_sort | progressively hybrid transformer for multi-modal vehicle re-identification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10181439/ https://www.ncbi.nlm.nih.gov/pubmed/37177410 http://dx.doi.org/10.3390/s23094206 |
work_keys_str_mv | AT panwenjie progressivelyhybridtransformerformultimodalvehiclereidentification AT huanglinhan progressivelyhybridtransformerformultimodalvehiclereidentification AT liangjianbao progressivelyhybridtransformerformultimodalvehiclereidentification AT honglan progressivelyhybridtransformerformultimodalvehiclereidentification AT zhujianqing progressivelyhybridtransformerformultimodalvehiclereidentification |