Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network

Traffic target tracking is a core task in intelligent transportation systems because it supports scene understanding and autonomous driving. Most state-of-the-art (SOTA) multiple object tracking (MOT) methods adopt a two-step procedure: object detection followed by data association. The...

Bibliographic Details
Main Authors:	Sun, Yamin; Zhao, Yue; Wang, Sirui
Format:	Online Article Text
Language:	English
Published:	Hindawi 2022
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9152393/ https://www.ncbi.nlm.nih.gov/pubmed/35655505 http://dx.doi.org/10.1155/2022/9693767

_version_	1784717637031297024
author	Sun, Yamin Zhao, Yue Wang, Sirui
author_facet	Sun, Yamin Zhao, Yue Wang, Sirui
author_sort	Sun, Yamin
collection	PubMed
description	Traffic target tracking is a core task in intelligent transportation systems because it supports scene understanding and autonomous driving. Most state-of-the-art (SOTA) multiple object tracking (MOT) methods adopt a two-step procedure: object detection followed by data association. Object detection has made great progress with the development of deep learning. However, data association still depends heavily on handcrafted constraints, such as appearance, shape, and motion, which need to be elaborately trained for each specific kind of object. In this study, a spatial-temporal encoder-decoder affinity network is proposed for multiple traffic target tracking. It aims to use deep learning to learn a robust spatial-temporal affinity feature of the detections and tracklets for data association. The proposed network contains a two-stage transformer encoder module that encodes the features of the detections and the tracked targets at the image level and the tracklet level, capturing spatial correlation and temporal history information. A spatial transformer decoder module then computes the association affinity; the results from the two-stage transformer encoder module are fed back to it so that the spatial and temporal information from the detections and the tracklets of the tracked targets is fully captured and encoded. Efficient affinity computation can thus be applied to perform data association in online tracking. To validate the effectiveness of the proposed method, three popular multiple traffic target tracking datasets, KITTI, UA-DETRAC, and VisDrone, are used for evaluation. On the KITTI dataset, the proposed method is compared with 15 SOTA methods and achieves 86.9% multiple object tracking accuracy (MOTA) and 85.71% multiple object tracking precision (MOTP). On the UA-DETRAC dataset, the proposed method is compared with 12 SOTA methods and achieves 20.82% MOTA and 35.65% MOTP. On the VisDrone dataset, the proposed method is compared with 10 SOTA trackers and achieves 40.5% MOTA and 74.1% MOTP. These experimental results show that the proposed method is competitive with state-of-the-art methods in tracking performance.
format	Online Article Text
id	pubmed-9152393
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Hindawi
record_format	MEDLINE/PubMed
spelling	pubmed-9152393 2022-06-01 Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network Sun, Yamin Zhao, Yue Wang, Sirui Comput Intell Neurosci Research Article Hindawi 2022-05-23 /pmc/articles/PMC9152393/ /pubmed/35655505 http://dx.doi.org/10.1155/2022/9693767 Text en Copyright © 2022 Yamin Sun et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Sun, Yamin Zhao, Yue Wang, Sirui Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
title	Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9152393/ https://www.ncbi.nlm.nih.gov/pubmed/35655505 http://dx.doi.org/10.1155/2022/9693767
work_keys_str_mv	AT sunyamin multipletraffictargettrackingwithspatialtemporalaffinitynetwork AT zhaoyue multipletraffictargettrackingwithspatialtemporalaffinitynetwork AT wangsirui multipletraffictargettrackingwithspatialtemporalaffinitynetwork
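
The description field above says that a learned affinity between tracklets and detections drives data association, and it reports results as MOTA. As a rough illustration only, the sketch below shows how an affinity matrix might be turned into matches and how MOTA is conventionally computed from error counts. Everything named here is an assumption for illustration: the record does not say which assignment procedure the authors use, so greedy matching is swapped in as a simple stand-in, and `associate`, `mota`, and the `min_affinity` threshold are hypothetical names and values, not taken from the paper.

```python
from typing import Dict, List, Tuple


def associate(affinity: List[List[float]], min_affinity: float = 0.5) -> Dict[int, int]:
    """Greedily match tracklets (rows) to detections (columns).

    Assumption: greedy matching is an illustrative stand-in. The catalog
    record does not state which assignment procedure the authors use.
    """
    # Keep every (score, tracklet, detection) candidate above the threshold.
    candidates: List[Tuple[float, int, int]] = [
        (score, t, d)
        for t, row in enumerate(affinity)
        for d, score in enumerate(row)
        if score >= min_affinity
    ]
    # Highest affinity first; ties are broken by the lower indices.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    matches: Dict[int, int] = {}
    used_detections = set()
    for _score, t, d in candidates:
        # Each tracklet and each detection may be matched at most once.
        if t in matches or d in used_detections:
            continue
        matches[t] = d
        used_detections.add(d)
    return matches


def mota(false_negatives: int, false_positives: int,
         id_switches: int, ground_truth: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if ground_truth <= 0:
        raise ValueError("ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / ground_truth


if __name__ == "__main__":
    # Two tracklets, three detections; the values are made up for the example.
    example_affinity = [
        [0.9, 0.2, 0.1],
        [0.3, 0.8, 0.4],
    ]
    print(associate(example_affinity))
    print(mota(false_negatives=10, false_positives=5, id_switches=1, ground_truth=100))
```

In this example the third detection stays unmatched, which an online tracker would typically treat as a candidate for a new track. An optimal assignment solver could replace the greedy loop without changing the interface.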