NRVC: Neural Representation for Video Compression with Implicit Multiscale Fusion Network
Recently, end-to-end deep models for video compression have made steady advancements. However, this resulted in a lengthy and complex pipeline containing numerous redundant parameters. The video compression approaches based on implicit neural representation (INR) allow videos to be directly represen...
Main Authors: | Liu, Shangdong; Cao, Puming; Feng, Yujian; Ji, Yimu; Chen, Jiayuan; Xie, Xuedong; Wu, Longji |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453668/ https://www.ncbi.nlm.nih.gov/pubmed/37628197 http://dx.doi.org/10.3390/e25081167 |
_version_ | 1785095993491980288 |
---|---|
author | Liu, Shangdong; Cao, Puming; Feng, Yujian; Ji, Yimu; Chen, Jiayuan; Xie, Xuedong; Wu, Longji |
author_sort | Liu, Shangdong |
collection | PubMed |
description | Recently, end-to-end deep models for video compression have made steady advancements. However, this resulted in a lengthy and complex pipeline containing numerous redundant parameters. The video compression approaches based on implicit neural representation (INR) allow videos to be directly represented as a function approximated by a neural network, resulting in a more lightweight model, whereas the singularity of the feature extraction pipeline limits the network’s ability to fit the mapping function for video frames. Hence, we propose a neural representation approach for video compression with an implicit multiscale fusion network (NRVC), utilizing normalized residual networks to improve the effectiveness of INR in fitting the target function. We propose the multiscale representations for video compression (MSRVC) network, which effectively extracts features from the input video sequence to enhance the degree of overfitting in the mapping function. Additionally, we propose the feature extraction channel attention (FECA) block to capture interaction information between different feature extraction channels, further improving the effectiveness of feature extraction. The results show that compared to the NeRV method with similar bits per pixel (BPP), NRVC has a 2.16% increase in the decoded peak signal-to-noise ratio (PSNR). Moreover, NRVC outperforms the conventional HEVC in terms of PSNR. |
format | Online Article Text |
id | pubmed-10453668 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10453668 2023-08-26 NRVC: Neural Representation for Video Compression with Implicit Multiscale Fusion Network Liu, Shangdong Cao, Puming Feng, Yujian Ji, Yimu Chen, Jiayuan Xie, Xuedong Wu, Longji Entropy (Basel) Article Recently, end-to-end deep models for video compression have made steady advancements. However, this resulted in a lengthy and complex pipeline containing numerous redundant parameters. The video compression approaches based on implicit neural representation (INR) allow videos to be directly represented as a function approximated by a neural network, resulting in a more lightweight model, whereas the singularity of the feature extraction pipeline limits the network’s ability to fit the mapping function for video frames. Hence, we propose a neural representation approach for video compression with an implicit multiscale fusion network (NRVC), utilizing normalized residual networks to improve the effectiveness of INR in fitting the target function. We propose the multiscale representations for video compression (MSRVC) network, which effectively extracts features from the input video sequence to enhance the degree of overfitting in the mapping function. Additionally, we propose the feature extraction channel attention (FECA) block to capture interaction information between different feature extraction channels, further improving the effectiveness of feature extraction. The results show that compared to the NeRV method with similar bits per pixel (BPP), NRVC has a 2.16% increase in the decoded peak signal-to-noise ratio (PSNR). Moreover, NRVC outperforms the conventional HEVC in terms of PSNR. MDPI 2023-08-04 /pmc/articles/PMC10453668/ /pubmed/37628197 http://dx.doi.org/10.3390/e25081167 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | NRVC: Neural Representation for Video Compression with Implicit Multiscale Fusion Network |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453668/ https://www.ncbi.nlm.nih.gov/pubmed/37628197 http://dx.doi.org/10.3390/e25081167 |