PointTransformer: Encoding Human Local Features for Small Target Detection

Improving small-target detection and occlusion handling is a key open problem in object detection. In field operations at chemical plants, the occlusion of construction workers and the long distance of surveillance shooting often lead to the phenomenon of...

Bibliographic Details
Main authors: Tang, Yudi; Wang, Bing; He, Wangli; Qian, Feng; Liu, Zhen
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9420563/
https://www.ncbi.nlm.nih.gov/pubmed/36045967
http://dx.doi.org/10.1155/2022/9640673
author Tang, Yudi
Wang, Bing
He, Wangli
Qian, Feng
Liu, Zhen
collection PubMed
description Improving small-target detection and occlusion handling is a key open problem in object detection. In field operations at chemical plants, the occlusion of construction workers and the long distance of surveillance shooting often lead to missed detections. Most existing work uses multiple feature-fusion strategies to extract features at different levels and then aggregates them into global features; this does not exploit local features and makes it difficult to improve small-target detection performance. To address this issue, this paper introduces Point Transformer, a transformer encoder, as the core backbone of the object detection framework. It first uses a priori information about human skeletal points to obtain local features and then uses both self-attention and cross-attention mechanisms to reconstruct the local features corresponding to each key point. In addition, because the target to be detected is highly correlated with the positions of human skeletal points, we propose a learnable positional encoding method that highlights the positional characteristics of each skeletal point and further boosts Point Transformer's performance. The proposed model is evaluated on a dataset of field operations in a chemical plant. The results are significantly better than those of classical algorithms, and the model outperforms the state of the art by 12 percent in mAP on the small-target detection task.
format Online
Article
Text
id pubmed-9420563
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-9420563 2022-08-30 PointTransformer: Encoding Human Local Features for Small Target Detection Tang, Yudi Wang, Bing He, Wangli Qian, Feng Liu, Zhen Comput Intell Neurosci Research Article Hindawi 2022-08-21 /pmc/articles/PMC9420563/ /pubmed/36045967 http://dx.doi.org/10.1155/2022/9640673 Text en Copyright © 2022 Yudi Tang et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title PointTransformer: Encoding Human Local Features for Small Target Detection
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9420563/
https://www.ncbi.nlm.nih.gov/pubmed/36045967
http://dx.doi.org/10.1155/2022/9640673