CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again †

The current popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which have high inference speeds and accuracy, but their tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty in obtainin...

Bibliographic Details
Main authors: Hou, Haoxiong; Shen, Chao; Zhang, Ximing; Gao, Wei
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10098982/
https://www.ncbi.nlm.nih.gov/pubmed/37050842
http://dx.doi.org/10.3390/s23073782
author Hou, Haoxiong
Shen, Chao
Zhang, Ximing
Gao, Wei
collection PubMed
description Popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which offers high inference speed and accuracy but tracks unstably in crowded scenes. The detection branch has difficulty locating objects accurately, and the ambiguous appearance features extracted by the re-identification (re-ID) branch lead to identity switches. To address these problems, this paper proposes CSMOT, a more robust MOT algorithm based on FairMOT. First, a coordinate attention module is added to the encoder–decoder network to enhance information interaction between channels (horizontal and vertical coordinates), which improves object detection. Then, an angle-center loss that maximizes intra-class similarity is proposed to optimize the re-ID branch, making the extracted re-ID features more discriminative. The re-ID feature dimension is also redesigned to balance the detection and re-ID tasks. Finally, a simple and effective data association mechanism is introduced that associates every detection, rather than only the high-score detections, during tracking. Experimental results show that the algorithm achieves excellent tracking performance on multiple public datasets and can be applied effectively to crowded scenes. In particular, CSMOT reduces the number of ID switches by 11.8% and 33.8% on the MOT16 and MOT17 test sets, respectively, compared with the baseline.
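The data association step in the description (matching every detection, not only the high-score ones) can be illustrated with a minimal sketch. It assumes a two-pass scheme in the spirit of ByteTrack: high-score detections are matched to tracks first, and leftover tracks are then matched against low-score detections. This record does not spell out the paper's actual mechanism, so the function names, the thresholds `high_thresh` and `min_iou`, and the greedy IoU-only matching are all hypothetical simplifications (a real tracker would typically also use re-ID embeddings and an optimal assignment solver).

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box], dets: Dict[int, Box], min_iou: float
) -> List[Tuple[int, int]]:
    """Pair track ids with detection indices greedily, best IoU first."""
    pairs = sorted(
        ((iou(tb, db), t, d) for t, tb in tracks.items() for d, db in dets.items()),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, t, d in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if t in used_tracks or d in used_dets:
            continue
        used_tracks.add(t)
        used_dets.add(d)
        matches.append((t, d))
    return matches


def associate(
    tracks: Dict[int, Box],
    detections: List[Tuple[Box, float]],
    high_thresh: float = 0.6,  # hypothetical score split, not from the paper
    min_iou: float = 0.3,      # hypothetical overlap gate, not from the paper
) -> Dict[int, int]:
    """Return {track_id: detection_index} using every detection in two passes."""
    high = {i: b for i, (b, s) in enumerate(detections) if s >= high_thresh}
    low = {i: b for i, (b, s) in enumerate(detections) if s < high_thresh}

    # Pass 1: confident detections get first pick of the tracks.
    first = greedy_match(tracks, high, min_iou)
    matched = {t for t, _ in first}

    # Pass 2: still-unmatched tracks may claim low-score detections,
    # which are often occluded objects in a crowd rather than noise.
    remaining = {t: b for t, b in tracks.items() if t not in matched}
    second = greedy_match(remaining, low, min_iou)
    return dict(first + second)


tracks = {1: (0, 0, 10, 10), 2: (20, 0, 30, 10)}
detections = [((1, 0, 11, 10), 0.9), ((21, 0, 31, 10), 0.2)]
result = associate(tracks, detections)
```

In this toy input the second detection has a score of only 0.2, so a tracker that discards low-score detections would lose track 2 for the frame; the second pass keeps it matched.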
format Online
Article
Text
id pubmed-10098982
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10098982 2023-04-14 CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again † Hou, Haoxiong Shen, Chao Zhang, Ximing Gao, Wei Sensors (Basel) Article MDPI 2023-04-06 /pmc/articles/PMC10098982/ /pubmed/37050842 http://dx.doi.org/10.3390/s23073782 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again †
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10098982/
https://www.ncbi.nlm.nih.gov/pubmed/37050842
http://dx.doi.org/10.3390/s23073782