
Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network

Traffic target tracking is a core task in intelligent transportation systems because it supports scene understanding and autonomous driving. Most state-of-the-art (SOTA) multiple object tracking (MOT) methods adopt a two-step procedure: object detection followed by data association. The...


Bibliographic Details
Main Authors: Sun, Yamin, Zhao, Yue, Wang, Sirui
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9152393/
https://www.ncbi.nlm.nih.gov/pubmed/35655505
http://dx.doi.org/10.1155/2022/9693767
_version_ 1784717637031297024
author Sun, Yamin
Zhao, Yue
Wang, Sirui
author_facet Sun, Yamin
Zhao, Yue
Wang, Sirui
author_sort Sun, Yamin
collection PubMed
description Traffic target tracking is a core task in intelligent transportation systems because it supports scene understanding and autonomous driving. Most state-of-the-art (SOTA) multiple object tracking (MOT) methods adopt a two-step procedure: object detection followed by data association. Object detection has made great progress with the development of deep learning. However, data association still depends heavily on handcrafted constraints, such as appearance, shape, and motion, which need to be elaborately designed for each specific object type. In this study, a spatial-temporal encoder-decoder affinity network is proposed for multiple traffic target tracking, aiming to use deep learning to learn a robust spatial-temporal affinity feature of the detections and tracklets for data association. The proposed spatial-temporal affinity network contains a two-stage transformer encoder module that encodes the features of the detections and the tracked targets at the image level and the tracklet level, capturing spatial correlation and temporal history information. Then, a spatial transformer decoder module is designed to compute the association affinity, where the results from the two-stage transformer encoder module are fed back to fully capture and encode the spatial and temporal information from the detections and the tracklets of the tracked targets. Thus, efficient affinity computation can be applied to perform data association in online tracking. To validate the effectiveness of the proposed method, three popular multiple traffic target tracking datasets, KITTI, UA-DETRAC, and VisDrone, are used for evaluation. On the KITTI dataset, the proposed method is compared with 15 SOTA methods and achieves 86.9% multiple object tracking accuracy (MOTA) and 85.71% multiple object tracking precision (MOTP). On the UA-DETRAC dataset, the proposed method is compared with 12 SOTA methods and achieves 20.82% MOTA and 35.65% MOTP. On the VisDrone dataset, the proposed method is compared with 10 SOTA trackers and achieves 40.5% MOTA and 74.1% MOTP. These experimental results show that the proposed method is competitive with state-of-the-art methods and obtains superior tracking performance.
format Online
Article
Text
id pubmed-9152393
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9152393 2022-06-01 Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network Sun, Yamin Zhao, Yue Wang, Sirui Comput Intell Neurosci Research Article Hindawi 2022-05-23 /pmc/articles/PMC9152393/ /pubmed/35655505 http://dx.doi.org/10.1155/2022/9693767 Text en Copyright © 2022 Yamin Sun et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Sun, Yamin
Zhao, Yue
Wang, Sirui
Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
title Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
title_full Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
title_fullStr Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
title_full_unstemmed Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
title_short Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
title_sort multiple traffic target tracking with spatial-temporal affinity network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9152393/
https://www.ncbi.nlm.nih.gov/pubmed/35655505
http://dx.doi.org/10.1155/2022/9693767
work_keys_str_mv AT sunyamin multipletraffictargettrackingwithspatialtemporalaffinitynetwork
AT zhaoyue multipletraffictargettrackingwithspatialtemporalaffinitynetwork
AT wangsirui multipletraffictargettrackingwithspatialtemporalaffinitynetwork
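
Illustrative note on the data-association step named in the abstract. The record does not include the paper's network or code, so the following is only a minimal sketch of how an affinity matrix between tracklets and detections can drive online association. It makes two loud assumptions: plain cosine similarity stands in for the learned spatial-temporal affinity, and a greedy one-to-one matcher stands in for whatever assignment step the authors use. The names cosine_affinity, associate, and min_affinity are hypothetical and do not come from the article.

```python
import math


def cosine_affinity(a, b):
    """Cosine similarity of two feature vectors (0.0 if either has zero norm).

    Assumption: a stand-in for the learned affinity in the paper.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def associate(tracklets, detections, min_affinity=0.5):
    """Greedy one-to-one association of tracklets to detections.

    tracklets, detections: lists of feature vectors (lists of floats).
    Returns (matches, unmatched_tracklets, unmatched_detections), where
    matches is a sorted list of (tracklet_index, detection_index) pairs.
    """
    # Score every tracklet/detection pair, then visit pairs best-first.
    scored = sorted(
        (
            (cosine_affinity(t_feat, d_feat), t, d)
            for t, t_feat in enumerate(tracklets)
            for d, d_feat in enumerate(detections)
        ),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for score, t, d in scored:
        if score < min_affinity:
            break  # every remaining pair is below the threshold
        if t in used_t or d in used_d:
            continue  # each tracklet and detection is matched at most once
        matches.append((t, d))
        used_t.add(t)
        used_d.add(d)
    unmatched_t = [t for t in range(len(tracklets)) if t not in used_t]
    unmatched_d = [d for d in range(len(detections)) if d not in used_d]
    return sorted(matches), unmatched_t, unmatched_d


if __name__ == "__main__":
    # Toy 2-D features: two existing tracklets, three new detections.
    tracks = [[1.0, 0.0], [0.0, 1.0]]
    dets = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
    print(associate(tracks, dets))
```

In an online tracker, unmatched detections would typically start new tracklets and unmatched tracklets would be kept alive for a few frames before being dropped. An optimal assignment solver could replace the greedy loop without changing the interface.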