STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception
This article proposes a video prediction network called STMP-Net that addresses the problem of the inability of Recurrent Neural Networks (RNNs) to fully extract spatiotemporal information and motion change features during video prediction. STMP-Net combines spatiotemporal memory and motion perception to make more accurate predictions. Firstly, a spatiotemporal attention fusion unit (STAFU) is proposed as the basic module of the prediction network, which learns and transfers spatiotemporal features in both horizontal and vertical directions based on spatiotemporal feature information and contextual attention mechanism. Additionally, a contextual attention mechanism is introduced in the hidden state to focus attention on more important details and improve the capture of detailed features, thus greatly reducing the computational load of the network. Secondly, a motion gradient highway unit (MGHU) is proposed by combining motion perception modules and adding them between adjacent layers, which can adaptively learn the important information of input features and fuse motion change features to significantly improve the predictive performance of the model. Finally, a high-speed channel is provided between layers to quickly transmit important features and alleviate the gradient vanishing problem caused by back-propagation. The experimental results show that compared with mainstream video prediction networks, the proposed method can achieve better prediction results in long-term video prediction, especially in motion scenes.
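The record does not include the paper's code, but the MGHU idea described in the abstract — gating a hidden state with motion change features over a highway-style path between layers — can be illustrated with a minimal, hypothetical PyTorch sketch. The module name `MotionHighwayGate`, the 1×1-convolution gating form, and the use of a simple frame difference as the motion cue are assumptions for illustration, not the authors' STMP-Net implementation.

```python
# Hypothetical sketch (not the authors' code): a highway-style gate that fuses a
# hidden state with a motion cue (frame difference), in the spirit of the MGHU
# described in the abstract. Channel sizes and the gating form are assumptions.
import torch
import torch.nn as nn


class MotionHighwayGate(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolutions produce a per-pixel gate and a candidate feature
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.candidate = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, hidden: torch.Tensor, frame_t: torch.Tensor,
                frame_prev: torch.Tensor) -> torch.Tensor:
        motion = frame_t - frame_prev              # crude motion cue: frame difference
        x = torch.cat([hidden, motion], dim=1)     # fuse state and motion along channels
        g = torch.sigmoid(self.gate(x))            # highway gate in [0, 1]
        h_new = torch.tanh(self.candidate(x))      # candidate update
        # Highway mixing: the gated blend keeps a direct path for features/gradients
        return g * h_new + (1.0 - g) * hidden


# Usage: hidden state and frames are assumed to share shape (batch, channels, H, W)
if __name__ == "__main__":
    m = MotionHighwayGate(channels=8)
    h = torch.zeros(2, 8, 16, 16)
    f_t, f_p = torch.randn(2, 8, 16, 16), torch.randn(2, 8, 16, 16)
    print(m(h, f_t, f_p).shape)  # torch.Size([2, 8, 16, 16])
```

The gated blend `g * h_new + (1 - g) * hidden` is what gives a highway unit its direct path: when the gate is near zero, features and gradients pass through unchanged, which is the standard remedy for the vanishing-gradient problem the abstract mentions.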
Main authors: | Chen, Suting; Yang, Ning |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2023 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255356/ https://www.ncbi.nlm.nih.gov/pubmed/37299860 http://dx.doi.org/10.3390/s23115133 |
_version_ | 1785056851832864768 |
---|---|
author | Chen, Suting Yang, Ning |
author_facet | Chen, Suting Yang, Ning |
author_sort | Chen, Suting |
collection | PubMed |
description | This article proposes a video prediction network called STMP-Net that addresses the problem of the inability of Recurrent Neural Networks (RNNs) to fully extract spatiotemporal information and motion change features during video prediction. STMP-Net combines spatiotemporal memory and motion perception to make more accurate predictions. Firstly, a spatiotemporal attention fusion unit (STAFU) is proposed as the basic module of the prediction network, which learns and transfers spatiotemporal features in both horizontal and vertical directions based on spatiotemporal feature information and contextual attention mechanism. Additionally, a contextual attention mechanism is introduced in the hidden state to focus attention on more important details and improve the capture of detailed features, thus greatly reducing the computational load of the network. Secondly, a motion gradient highway unit (MGHU) is proposed by combining motion perception modules and adding them between adjacent layers, which can adaptively learn the important information of input features and fuse motion change features to significantly improve the predictive performance of the model. Finally, a high-speed channel is provided between layers to quickly transmit important features and alleviate the gradient vanishing problem caused by back-propagation. The experimental results show that compared with mainstream video prediction networks, the proposed method can achieve better prediction results in long-term video prediction, especially in motion scenes. |
format | Online Article Text |
id | pubmed-10255356 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-102553562023-06-10 STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception Chen, Suting Yang, Ning Sensors (Basel) Article This article proposes a video prediction network called STMP-Net that addresses the problem of the inability of Recurrent Neural Networks (RNNs) to fully extract spatiotemporal information and motion change features during video prediction. STMP-Net combines spatiotemporal memory and motion perception to make more accurate predictions. Firstly, a spatiotemporal attention fusion unit (STAFU) is proposed as the basic module of the prediction network, which learns and transfers spatiotemporal features in both horizontal and vertical directions based on spatiotemporal feature information and contextual attention mechanism. Additionally, a contextual attention mechanism is introduced in the hidden state to focus attention on more important details and improve the capture of detailed features, thus greatly reducing the computational load of the network. Secondly, a motion gradient highway unit (MGHU) is proposed by combining motion perception modules and adding them between adjacent layers, which can adaptively learn the important information of input features and fuse motion change features to significantly improve the predictive performance of the model. Finally, a high-speed channel is provided between layers to quickly transmit important features and alleviate the gradient vanishing problem caused by back-propagation. The experimental results show that compared with mainstream video prediction networks, the proposed method can achieve better prediction results in long-term video prediction, especially in motion scenes. MDPI 2023-05-28 /pmc/articles/PMC10255356/ /pubmed/37299860 http://dx.doi.org/10.3390/s23115133 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Chen, Suting Yang, Ning STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception |
title | STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception |
title_full | STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception |
title_fullStr | STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception |
title_full_unstemmed | STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception |
title_short | STMP-Net: A Spatiotemporal Prediction Network Integrating Motion Perception |
title_sort | stmp-net: a spatiotemporal prediction network integrating motion perception |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255356/ https://www.ncbi.nlm.nih.gov/pubmed/37299860 http://dx.doi.org/10.3390/s23115133 |
work_keys_str_mv | AT chensuting stmpnetaspatiotemporalpredictionnetworkintegratingmotionperception AT yangning stmpnetaspatiotemporalpredictionnetworkintegratingmotionperception |