NRVC: Neural Representation for Video Compression with Implicit Multiscale Fusion Network

Recently, end-to-end deep models for video compression have made steady advancements. However, this resulted in a lengthy and complex pipeline containing numerous redundant parameters. The video compression approaches based on implicit neural representation (INR) allow videos to be directly represen...

Bibliographic Details
Main Authors: Liu, Shangdong; Cao, Puming; Feng, Yujian; Ji, Yimu; Chen, Jiayuan; Xie, Xuedong; Wu, Longji
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453668/
https://www.ncbi.nlm.nih.gov/pubmed/37628197
http://dx.doi.org/10.3390/e25081167
author Liu, Shangdong
Cao, Puming
Feng, Yujian
Ji, Yimu
Chen, Jiayuan
Xie, Xuedong
Wu, Longji
collection PubMed
description Recently, end-to-end deep models for video compression have made steady advancements. However, this resulted in a lengthy and complex pipeline containing numerous redundant parameters. The video compression approaches based on implicit neural representation (INR) allow videos to be directly represented as a function approximated by a neural network, resulting in a more lightweight model, whereas the singularity of the feature extraction pipeline limits the network’s ability to fit the mapping function for video frames. Hence, we propose a neural representation approach for video compression with an implicit multiscale fusion network (NRVC), utilizing normalized residual networks to improve the effectiveness of INR in fitting the target function. We propose the multiscale representations for video compression (MSRVC) network, which effectively extracts features from the input video sequence to enhance the degree of overfitting in the mapping function. Additionally, we propose the feature extraction channel attention (FECA) block to capture interaction information between different feature extraction channels, further improving the effectiveness of feature extraction. The results show that compared to the NeRV method with similar bits per pixel (BPP), NRVC has a 2.16% increase in the decoded peak signal-to-noise ratio (PSNR). Moreover, NRVC outperforms the conventional HEVC in terms of PSNR.
format Online
Article
Text
id pubmed-10453668
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10453668 2023-08-26
Entropy (Basel) Article
MDPI 2023-08-04 /pmc/articles/PMC10453668/ /pubmed/37628197 http://dx.doi.org/10.3390/e25081167 Text en
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title NRVC: Neural Representation for Video Compression with Implicit Multiscale Fusion Network
topic Article