PointTransformer: Encoding Human Local Features for Small Target Detection
Improving small target detection and occlusion handling is a key problem in object detection. In chemical plant field operations, the occlusion of construction workers and the long distance of surveillance shooting often lead to the phenomenon of...
Main authors: | Tang, Yudi; Wang, Bing; He, Wangli; Qian, Feng; Liu, Zhen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9420563/ https://www.ncbi.nlm.nih.gov/pubmed/36045967 http://dx.doi.org/10.1155/2022/9640673 |
_version_ | 1784777418142121984 |
---|---|
author | Tang, Yudi; Wang, Bing; He, Wangli; Qian, Feng; Liu, Zhen |
author_facet | Tang, Yudi; Wang, Bing; He, Wangli; Qian, Feng; Liu, Zhen |
author_sort | Tang, Yudi |
collection | PubMed |
description | Improving small target detection and occlusion handling is a key problem in object detection. In chemical plant field operations, the occlusion of construction workers and the long distance of surveillance shooting often lead to missed detections. Most existing work uses multiple feature fusion strategies to extract features at different levels and then aggregate them into global features; this does not exploit local features and makes it difficult to improve small target detection performance. To address this issue, this paper introduces Point Transformer, a transformer encoder, as the core backbone of the object detection framework. It first uses a priori information about human skeletal points to obtain local features and then uses both self-attention and cross-attention mechanisms to reconstruct the local features corresponding to each key point. In addition, because the target to be detected is highly correlated with the position of human skeletal points, a learnable positional encoding method is proposed to highlight the positional characteristics of each skeletal point and further boost Point Transformer's performance. The proposed model is evaluated on a dataset of field operations in a chemical plant. The results are significantly better than those of classical algorithms, and the model outperforms the state of the art by 12 percentage points of mAP in the small target detection task. |
format | Online Article Text |
id | pubmed-9420563 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9420563 2022-08-30 PointTransformer: Encoding Human Local Features for Small Target Detection Tang, Yudi; Wang, Bing; He, Wangli; Qian, Feng; Liu, Zhen Comput Intell Neurosci Research Article Improving small target detection and occlusion handling is a key problem in object detection. In chemical plant field operations, the occlusion of construction workers and the long distance of surveillance shooting often lead to missed detections. Most existing work uses multiple feature fusion strategies to extract features at different levels and then aggregate them into global features; this does not exploit local features and makes it difficult to improve small target detection performance. To address this issue, this paper introduces Point Transformer, a transformer encoder, as the core backbone of the object detection framework. It first uses a priori information about human skeletal points to obtain local features and then uses both self-attention and cross-attention mechanisms to reconstruct the local features corresponding to each key point. In addition, because the target to be detected is highly correlated with the position of human skeletal points, a learnable positional encoding method is proposed to highlight the positional characteristics of each skeletal point and further boost Point Transformer's performance. The proposed model is evaluated on a dataset of field operations in a chemical plant. The results are significantly better than those of classical algorithms, and the model outperforms the state of the art by 12 percentage points of mAP in the small target detection task. Hindawi 2022-08-21 /pmc/articles/PMC9420563/ /pubmed/36045967 http://dx.doi.org/10.1155/2022/9640673 Text en Copyright © 2022 Yudi Tang et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Tang, Yudi; Wang, Bing; He, Wangli; Qian, Feng; Liu, Zhen PointTransformer: Encoding Human Local Features for Small Target Detection |
title | PointTransformer: Encoding Human Local Features for Small Target Detection |
title_full | PointTransformer: Encoding Human Local Features for Small Target Detection |
title_fullStr | PointTransformer: Encoding Human Local Features for Small Target Detection |
title_full_unstemmed | PointTransformer: Encoding Human Local Features for Small Target Detection |
title_short | PointTransformer: Encoding Human Local Features for Small Target Detection |
title_sort | pointtransformer: encoding human local features for small target detection |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9420563/ https://www.ncbi.nlm.nih.gov/pubmed/36045967 http://dx.doi.org/10.1155/2022/9640673 |
work_keys_str_mv | AT tangyudi pointtransformerencodinghumanlocalfeaturesforsmalltargetdetection AT wangbing pointtransformerencodinghumanlocalfeaturesforsmalltargetdetection AT hewangli pointtransformerencodinghumanlocalfeaturesforsmalltargetdetection AT qianfeng pointtransformerencodinghumanlocalfeaturesforsmalltargetdetection AT liuzhen pointtransformerencodinghumanlocalfeaturesforsmalltargetdetection |
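
The description field above outlines an encoder that adds a learnable positional encoding to per-key-point local features and then applies self-attention and cross-attention. The following is a minimal pure-Python sketch of that general idea, not the authors' implementation. Everything concrete here is an assumption made for illustration: the function names (`attention`, `encode_keypoints`), the choice of 17 key points, the feature width, the use of extra "context" tokens as the cross-attention target, and the omission of learned query/key/value projections. The positional table is only randomly initialized; in a trained model it would be updated by gradient descent.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Hypothetical helper: learned Q/K/V projections are omitted for brevity.
    """
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def add_vectors(xs, ys):
    """Element-wise sum of two equally shaped lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def encode_keypoints(local_feats, pos_table, context_feats):
    """One encoder step: positional encoding, self-attention, cross-attention.

    Assumption: cross-attention lets key-point tokens query context tokens;
    the record does not say what the cross-attention attends to.
    """
    tokens = add_vectors(local_feats, pos_table)
    # Self-attention among key-point tokens, with a residual connection.
    tokens = add_vectors(tokens, attention(tokens, tokens, tokens))
    # Cross-attention from key-point tokens to context tokens, with a residual.
    tokens = add_vectors(tokens, attention(tokens, context_feats, context_feats))
    return tokens


random.seed(0)
NUM_KEYPOINTS, DIM, NUM_CONTEXT = 17, 8, 5  # illustrative sizes, not from the paper


def random_matrix(rows, cols, std):
    return [[random.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


local_feats = random_matrix(NUM_KEYPOINTS, DIM, 1.0)    # one feature vector per key point
pos_table = random_matrix(NUM_KEYPOINTS, DIM, 0.02)     # stands in for the learnable encoding
context_feats = random_matrix(NUM_CONTEXT, DIM, 1.0)    # hypothetical context tokens

encoded = encode_keypoints(local_feats, pos_table, context_feats)
print(len(encoded), len(encoded[0]))  # 17 8
```

The sketch keeps one output vector per key point, which matches the description's statement that local features are reconstructed for each key point; how those vectors feed the detection head is not covered by this record.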