CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again †
The current popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which have high inference speeds and accuracy, but their tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty in obtainin...
Main Authors: | Hou, Haoxiong; Shen, Chao; Zhang, Ximing; Gao, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10098982/ https://www.ncbi.nlm.nih.gov/pubmed/37050842 http://dx.doi.org/10.3390/s23073782 |
_version_ | 1785024947042648064 |
---|---|
author | Hou, Haoxiong Shen, Chao Zhang, Ximing Gao, Wei |
author_facet | Hou, Haoxiong Shen, Chao Zhang, Ximing Gao, Wei |
author_sort | Hou, Haoxiong |
collection | PubMed |
description | The current popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which have high inference speeds and accuracy, but their tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty in obtaining the accurate object position, but the ambiguous appearance of features extracted by the re-identification (re-ID) branch also leads to identity switches. Focusing on the above problems, this paper proposes a more robust MOT algorithm, named CSMOT, based on FairMOT. First, on the basis of the encoder–decoder network, a coordinate attention module is designed to enhance the information interaction between channels (horizontal and vertical coordinates), which improves its object-detection abilities. Then, an angle-center loss that effectively maximizes intra-class similarity is proposed to optimize the re-ID branch, and the extracted re-ID features are made more discriminative. We further redesign the re-ID feature dimension to balance the detection and re-ID tasks. Finally, a simple and effective data association mechanism is introduced, which associates each detection instead of just the high-score detections during the tracking process. The experimental results show that our one-shot MOT algorithm achieves excellent tracking performance on multiple public datasets and can be effectively applied to crowded scenes. In particular, CSMOT decreases the number of ID switches by 11.8% and 33.8% on the MOT16 and MOT17 test datasets, respectively, compared to the baseline. |
format | Online Article Text |
id | pubmed-10098982 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10098982 2023-04-14 CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again † Hou, Haoxiong Shen, Chao Zhang, Ximing Gao, Wei Sensors (Basel) Article The current popular one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which have high inference speeds and accuracy, but their tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty in obtaining the accurate object position, but the ambiguous appearance of features extracted by the re-identification (re-ID) branch also leads to identity switches. Focusing on the above problems, this paper proposes a more robust MOT algorithm, named CSMOT, based on FairMOT. First, on the basis of the encoder–decoder network, a coordinate attention module is designed to enhance the information interaction between channels (horizontal and vertical coordinates), which improves its object-detection abilities. Then, an angle-center loss that effectively maximizes intra-class similarity is proposed to optimize the re-ID branch, and the extracted re-ID features are made more discriminative. We further redesign the re-ID feature dimension to balance the detection and re-ID tasks. Finally, a simple and effective data association mechanism is introduced, which associates each detection instead of just the high-score detections during the tracking process. The experimental results show that our one-shot MOT algorithm achieves excellent tracking performance on multiple public datasets and can be effectively applied to crowded scenes. In particular, CSMOT decreases the number of ID switches by 11.8% and 33.8% on the MOT16 and MOT17 test datasets, respectively, compared to the baseline. MDPI 2023-04-06 /pmc/articles/PMC10098982/ /pubmed/37050842 http://dx.doi.org/10.3390/s23073782 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again † |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10098982/ https://www.ncbi.nlm.nih.gov/pubmed/37050842 http://dx.doi.org/10.3390/s23073782 |