Deep Learning Post-Filtering Using Multi-Head Attention and Multiresolution Feature Fusion for Image and Intra-Video Quality Enhancement

Bibliographic Details
Main authors:	Schiopu, Ionut; Munteanu, Adrian
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8963040/ https://www.ncbi.nlm.nih.gov/pubmed/35214252 http://dx.doi.org/10.3390/s22041353

author	Schiopu, Ionut; Munteanu, Adrian
collection	PubMed
description	The paper proposes a novel post-filtering method based on convolutional neural networks (CNNs) for quality enhancement of RGB/grayscale images and video sequences. The images are compressed with common lossy image codecs, such as JPEG and JPEG2000. The video sequences are encoded using the previous and the ongoing video coding standards, high-efficiency video coding (HEVC) and versatile video coding (VVC), respectively. A novel deep neural network architecture is proposed to estimate refinement details at full-, half-, and quarter-patch resolutions. The architecture is built from a set of efficient processing blocks designed around the following concepts: (i) the multi-head attention mechanism for refining the feature maps, (ii) weight sharing for reducing the network complexity, and (iii) novel block designs of layer structures for multiresolution feature fusion. The proposed method provides substantial performance improvements over both common image codecs and video coding standards. Experimental results on high-resolution images and standard video sequences show that the proposed post-filtering method provides average BD-rate savings of [Formula: see text] over JPEG and [Formula: see text] over HEVC (x265) for RGB images, Y-BD-rate savings of [Formula: see text] over JPEG and [Formula: see text] over VVC (VTM) for grayscale images, and [Formula: see text] over HEVC and [Formula: see text] over VVC for video sequences.
format	Online Article Text
id	pubmed-8963040
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8963040 2022-03-30 Sensors (Basel) Article MDPI 2022-02-10 /pmc/articles/PMC8963040/ /pubmed/35214252 http://dx.doi.org/10.3390/s22041353 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Deep Learning Post-Filtering Using Multi-Head Attention and Multiresolution Feature Fusion for Image and Intra-Video Quality Enhancement
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8963040/ https://www.ncbi.nlm.nih.gov/pubmed/35214252 http://dx.doi.org/10.3390/s22041353