Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network

As demand grows for network-dependent services such as Internet of Things (IoT) applications, autonomous driving, and augmented and virtual reality (AR/VR), the fifth-generation (5G) network is expected to become a key communication technology. The latest video coding standard, versati...

Bibliographic Details
Main Authors: Choi, Young-Ju; Lee, Young-Woon; Kim, Jongho; Jeong, Se Yoon; Choi, Jin Soo; Kim, Byung-Gyu
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007134/
https://www.ncbi.nlm.nih.gov/pubmed/36904838
http://dx.doi.org/10.3390/s23052631
_version_ 1784905443393404928
author Choi, Young-Ju
Lee, Young-Woon
Kim, Jongho
Jeong, Se Yoon
Choi, Jin Soo
Kim, Byung-Gyu
author_facet Choi, Young-Ju
Lee, Young-Woon
Kim, Jongho
Jeong, Se Yoon
Choi, Jin Soo
Kim, Byung-Gyu
author_sort Choi, Young-Ju
collection PubMed
description As demand grows for network-dependent services such as Internet of Things (IoT) applications, autonomous driving, and augmented and virtual reality (AR/VR), the fifth-generation (5G) network is expected to become a key communication technology. The latest video coding standard, versatile video coding (VVC), can contribute to providing high-quality services by achieving superior compression performance. In video coding, inter bi-prediction significantly improves coding efficiency by producing a precise fused prediction block. Although block-wise methods, such as bi-prediction with CU-level weight (BCW), are applied in VVC, it is still difficult for a linear fusion-based strategy to represent diverse pixel variations inside a block. In addition, a pixel-wise method called bi-directional optical flow (BDOF) has been proposed to refine the bi-prediction block. However, the non-linear optical flow equation in BDOF mode holds only under certain assumptions, so this method still cannot accurately compensate all kinds of bi-prediction blocks. In this paper, we propose an attention-based bi-prediction network (ABPN) to replace all existing bi-prediction methods. The proposed ABPN is designed to learn efficient representations of the fused features by utilizing an attention mechanism. Furthermore, a knowledge distillation (KD)-based approach is employed to compress the proposed network while keeping its output comparable to that of the large model. The proposed ABPN is integrated into the VTM-11.0 NNVC-1.0 standard reference software. Compared with the VTM anchor, the lightweight ABPN achieves BD-rate reductions of up to 5.89% and 4.91% on the Y component under the random access (RA) and low delay B (LDB) configurations, respectively.
format Online
Article
Text
id pubmed-10007134
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10007134 2023-03-12 Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network Choi, Young-Ju Lee, Young-Woon Kim, Jongho Jeong, Se Yoon Choi, Jin Soo Kim, Byung-Gyu Sensors (Basel) Article MDPI 2023-02-27 /pmc/articles/PMC10007134/ /pubmed/36904838 http://dx.doi.org/10.3390/s23052631 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Choi, Young-Ju
Lee, Young-Woon
Kim, Jongho
Jeong, Se Yoon
Choi, Jin Soo
Kim, Byung-Gyu
Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network
title Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network
title_full Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network
title_fullStr Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network
title_full_unstemmed Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network
title_short Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network
title_sort attention-based bi-prediction network for versatile video coding (vvc) over 5g network
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007134/
https://www.ncbi.nlm.nih.gov/pubmed/36904838
http://dx.doi.org/10.3390/s23052631
work_keys_str_mv AT choiyoungju attentionbasedbipredictionnetworkforversatilevideocodingvvcover5gnetwork
AT leeyoungwoon attentionbasedbipredictionnetworkforversatilevideocodingvvcover5gnetwork
AT kimjongho attentionbasedbipredictionnetworkforversatilevideocodingvvcover5gnetwork
AT jeongseyoon attentionbasedbipredictionnetworkforversatilevideocodingvvcover5gnetwork
AT choijinsoo attentionbasedbipredictionnetworkforversatilevideocodingvvcover5gnetwork
AT kimbyunggyu attentionbasedbipredictionnetworkforversatilevideocodingvvcover5gnetwork