Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network
As the demand for network-dependent services such as Internet of Things (IoT) applications, autonomous driving, and augmented and virtual reality (AR/VR) increases, the fifth-generation (5G) network is expected to become a key communication technology. The latest video coding standard, versati...
Main Authors: | Choi, Young-Ju; Lee, Young-Woon; Kim, Jongho; Jeong, Se Yoon; Choi, Jin Soo; Kim, Byung-Gyu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007134/ https://www.ncbi.nlm.nih.gov/pubmed/36904838 http://dx.doi.org/10.3390/s23052631 |
_version_ | 1784905443393404928 |
---|---|
author | Choi, Young-Ju Lee, Young-Woon Kim, Jongho Jeong, Se Yoon Choi, Jin Soo Kim, Byung-Gyu |
author_sort | Choi, Young-Ju |
collection | PubMed |
description | As the demand for network-dependent services such as Internet of Things (IoT) applications, autonomous driving, and augmented and virtual reality (AR/VR) increases, the fifth-generation (5G) network is expected to become a key communication technology. The latest video coding standard, versatile video coding (VVC), can contribute to providing high-quality services by achieving superior compression performance. In video coding, inter bi-prediction significantly improves coding efficiency by producing a precise fused prediction block. Although block-wise methods, such as bi-prediction with CU-level weight (BCW), are applied in VVC, it is still difficult for a linear fusion-based strategy to represent diverse pixel variations inside a block. In addition, a pixel-wise method called bi-directional optical flow (BDOF) has been proposed to refine the bi-prediction block. However, the non-linear optical flow equation in BDOF mode is applied under assumptions, so this method is still unable to accurately compensate various kinds of bi-prediction blocks. In this paper, we propose an attention-based bi-prediction network (ABPN) to replace all existing bi-prediction methods. The proposed ABPN is designed to learn efficient representations of the fused features by utilizing an attention mechanism. Furthermore, a knowledge distillation (KD)-based approach is employed to compress the size of the proposed network while keeping its output comparable to that of the large model. The proposed ABPN is integrated into the VTM-11.0 NNVC-1.0 standard reference software. Compared with the VTM anchor, the lightweight ABPN achieves BD-rate reductions of up to 5.89% and 4.91% on the Y component under random access (RA) and low delay B (LDB), respectively. |
format | Online Article Text |
id | pubmed-10007134 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10007134 2023-03-12 Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network Choi, Young-Ju Lee, Young-Woon Kim, Jongho Jeong, Se Yoon Choi, Jin Soo Kim, Byung-Gyu Sensors (Basel) Article MDPI 2023-02-27 /pmc/articles/PMC10007134/ /pubmed/36904838 http://dx.doi.org/10.3390/s23052631 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Attention-Based Bi-Prediction Network for Versatile Video Coding (VVC) over 5G Network |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007134/ https://www.ncbi.nlm.nih.gov/pubmed/36904838 http://dx.doi.org/10.3390/s23052631 |
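The description field contrasts block-wise linear fusion (BCW) with pixel-wise and learned alternatives. As a rough illustration of what "linear fusion" means there, the following minimal sketch blends two prediction blocks with a single CU-level weight. It assumes the commonly documented VVC BCW form, ((8 - w) * P0 + w * P1 + 4) >> 3 with w drawn from {-2, 3, 4, 5, 10}; the function name `bcw_fuse` and the 8-bit sample clipping are illustrative choices, not taken from the record or from the VTM source, and the higher intermediate precision used by a real codec is ignored.

```python
# Candidate CU-level weights for the second prediction block (assumed VVC BCW set).
BCW_WEIGHTS = (-2, 3, 4, 5, 10)


def bcw_fuse(p0, p1, w=4, bit_depth=8):
    """Blend two equally sized prediction blocks (lists of rows) with one weight.

    Hypothetical helper: every sample in the block shares the same weight w,
    which is why a block-wise linear fusion cannot follow pixel-level variation.
    """
    if w not in BCW_WEIGHTS:
        raise ValueError(f"w must be one of {BCW_WEIGHTS}")
    max_val = (1 << bit_depth) - 1
    fused = []
    for row0, row1 in zip(p0, p1):
        fused_row = []
        for a, b in zip(row0, row1):
            # Weighted sum with rounding offset 4, then divide by 8 via shift.
            value = ((8 - w) * a + w * b + 4) >> 3
            # Clip to the valid sample range for the given bit depth.
            fused_row.append(min(max(value, 0), max_val))
        fused.append(fused_row)
    return fused


if __name__ == "__main__":
    block0 = [[100, 200]]
    block1 = [[120, 100]]
    for weight in BCW_WEIGHTS:
        print(weight, bcw_fuse(block0, block1, w=weight))
```

With w = 4 the result is the plain rounded average of the two blocks; the other weights tilt the blend toward one reference, but always uniformly across the block.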