
Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification

Person re-identification is essential to intelligent video analytics, whose results affect downstream tasks such as behavior and event analysis. However, most existing models consider only accuracy rather than computational complexity, which is also important in practical deployment. Self-attention is a powerful technique for representation learning, and it can work with convolution to learn more discriminative feature representations for re-identification. We propose an improved multi-scale feature learning structure, DM-OSNet, with better performance than the original OSNet. Our DM-OSNet replaces the [Formula: see text] convolutional stream in OSNet with multi-head self-attention. To maintain model efficiency, we use double-layer multi-head self-attention, which reduces the computational complexity from the original [Formula: see text] to [Formula: see text]. To further improve model performance, we use SpCL to perform unsupervised pre-training on the large-scale unlabeled pedestrian dataset LUPerson. Finally, our DM-OSNet achieves an mAP of 87.36%, 78.26%, 72.96%, and 57.13% on the Market1501, DukeMTMC-reID, CUHK03, and MSMT17 datasets, respectively.


Bibliographic Details
Main Authors: Zhou, Yalei, Liu, Peng, Cui, Yue, Liu, Chunguang, Duan, Wenli
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9414396/
https://www.ncbi.nlm.nih.gov/pubmed/36016054
http://dx.doi.org/10.3390/s22166293
_version_ 1784775976864972800
author Zhou, Yalei
Liu, Peng
Cui, Yue
Liu, Chunguang
Duan, Wenli
author_facet Zhou, Yalei
Liu, Peng
Cui, Yue
Liu, Chunguang
Duan, Wenli
author_sort Zhou, Yalei
collection PubMed
description Person re-identification is essential to intelligent video analytics, whose results affect downstream tasks such as behavior and event analysis. However, most existing models consider only accuracy rather than computational complexity, which is also important in practical deployment. Self-attention is a powerful technique for representation learning, and it can work with convolution to learn more discriminative feature representations for re-identification. We propose an improved multi-scale feature learning structure, DM-OSNet, with better performance than the original OSNet. Our DM-OSNet replaces the [Formula: see text] convolutional stream in OSNet with multi-head self-attention. To maintain model efficiency, we use double-layer multi-head self-attention, which reduces the computational complexity from the original [Formula: see text] to [Formula: see text]. To further improve model performance, we use SpCL to perform unsupervised pre-training on the large-scale unlabeled pedestrian dataset LUPerson. Finally, our DM-OSNet achieves an mAP of 87.36%, 78.26%, 72.96%, and 57.13% on the Market1501, DukeMTMC-reID, CUHK03, and MSMT17 datasets, respectively.
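
The description above explains the core idea: an attention stream working alongside convolutional streams in a multi-stream block. The snippet below is a minimal, illustrative PyTorch sketch of multi-head self-attention applied to a convolutional feature map, in the spirit of replacing one stream of such a block with attention. It uses standard (quadratic) self-attention rather than the paper's double-layer variant, and the class and parameter names (e.g., SpatialMHSAStream, num_heads) are assumptions for illustration, not the authors' code.

```python
# Minimal sketch (not the authors' implementation): a multi-head self-attention
# stream over a convolutional feature map, usable where a convolutional stream
# would otherwise sit in a multi-stream block such as OSNet's.
import torch
import torch.nn as nn


class SpatialMHSAStream(nn.Module):
    """Multi-head self-attention over the flattened H*W positions of a feature map."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=channels,
                                          num_heads=num_heads,
                                          batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Flatten spatial positions into a sequence: (B, H*W, C).
        seq = x.flatten(2).transpose(1, 2)
        # Plain self-attention costs O((H*W)^2 * C) here; the paper's
        # double-layer variant is a cheaper alternative not reproduced in this sketch.
        out, _ = self.attn(seq, seq, seq, need_weights=False)
        out = self.norm(out + seq)
        # Restore the (B, C, H, W) layout so the stream can be fused with conv streams.
        return out.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    feat = torch.randn(2, 64, 16, 8)   # e.g., a mid-level re-ID feature map
    stream = SpatialMHSAStream(channels=64)
    print(stream(feat).shape)          # torch.Size([2, 64, 16, 8])
```
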
format Online
Article
Text
id pubmed-9414396
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9414396 2022-08-27 Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification Zhou, Yalei Liu, Peng Cui, Yue Liu, Chunguang Duan, Wenli Sensors (Basel) Article Person re-identification is essential to intelligent video analytics, whose results affect downstream tasks such as behavior and event analysis. However, most existing models consider only accuracy rather than computational complexity, which is also important in practical deployment. Self-attention is a powerful technique for representation learning, and it can work with convolution to learn more discriminative feature representations for re-identification. We propose an improved multi-scale feature learning structure, DM-OSNet, with better performance than the original OSNet. Our DM-OSNet replaces the [Formula: see text] convolutional stream in OSNet with multi-head self-attention. To maintain model efficiency, we use double-layer multi-head self-attention, which reduces the computational complexity from the original [Formula: see text] to [Formula: see text]. To further improve model performance, we use SpCL to perform unsupervised pre-training on the large-scale unlabeled pedestrian dataset LUPerson. Finally, our DM-OSNet achieves an mAP of 87.36%, 78.26%, 72.96%, and 57.13% on the Market1501, DukeMTMC-reID, CUHK03, and MSMT17 datasets, respectively. MDPI 2022-08-21 /pmc/articles/PMC9414396/ /pubmed/36016054 http://dx.doi.org/10.3390/s22166293 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhou, Yalei
Liu, Peng
Cui, Yue
Liu, Chunguang
Duan, Wenli
Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification
title Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification
title_full Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification
title_fullStr Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification
title_full_unstemmed Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification
title_short Integration of Multi-Head Self-Attention and Convolution for Person Re-Identification
title_sort integration of multi-head self-attention and convolution for person re-identification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9414396/
https://www.ncbi.nlm.nih.gov/pubmed/36016054
http://dx.doi.org/10.3390/s22166293
work_keys_str_mv AT zhouyalei integrationofmultiheadselfattentionandconvolutionforpersonreidentification
AT liupeng integrationofmultiheadselfattentionandconvolutionforpersonreidentification
AT cuiyue integrationofmultiheadselfattentionandconvolutionforpersonreidentification
AT liuchunguang integrationofmultiheadselfattentionandconvolutionforpersonreidentification
AT duanwenli integrationofmultiheadselfattentionandconvolutionforpersonreidentification