Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning
In this paper, we propose an intra-picture prediction method for depth video that clusters blocks through a neural network. The proposed method addresses the problem that a block containing two or more clusters degrades the performance of intra prediction for depth video. The proposed neural...
Main Authors: | Lee, Dong-seok, Kwon, Soon-kak |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9787791/ https://www.ncbi.nlm.nih.gov/pubmed/36560023 http://dx.doi.org/10.3390/s22249656 |
_version_ | 1784858597747851264 |
---|---|
author | Lee, Dong-seok Kwon, Soon-kak |
author_facet | Lee, Dong-seok Kwon, Soon-kak |
author_sort | Lee, Dong-seok |
collection | PubMed |
description | In this paper, we propose an intra-picture prediction method for depth video that clusters blocks through a neural network. The proposed method addresses the problem that a block containing two or more clusters degrades the performance of intra prediction for depth video. The proposed neural network consists of a spatial feature prediction network and a clustering network. The spatial feature prediction network uses spatial features in the vertical and horizontal directions and contains a 1D CNN layer and a fully connected layer. The 1D CNN layer extracts the spatial features for the vertical and horizontal directions from the top block and the left block of the reference pixels, respectively. A 1D CNN is designed to handle time-series data, but it can also be applied to find spatial features by treating the pixel order in a given direction as a timestamp. The fully connected layer predicts the spatial features of the block to be coded from the extracted features. The clustering network finds clusters from the spatial features output by the spatial feature prediction network. It consists of 4 CNN layers: the first 3 combine the two spatial features in the vertical and horizontal directions, and the last outputs the probabilities that pixels belong to the clusters. The pixels of the block are predicted by the representative values of the clusters, each of which is the average of the reference pixels belonging to that cluster. To support intra prediction for various block sizes, the block is scaled to the size of the network input, and the prediction result from the proposed network is scaled back to the original size. In network training, the mean square error between the original block and the predicted block is used as the loss function. A penalty for output values far from both ends is added to the loss function so that the network clusters clearly. In the simulation results, the bit rate is reduced by up to 12.45% at the same distortion compared with the latest video coding standard. |
format | Online Article Text |
id | pubmed-9787791 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9787791 2022-12-24 Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning Lee, Dong-seok Kwon, Soon-kak Sensors (Basel) Article MDPI 2022-12-09 /pmc/articles/PMC9787791/ /pubmed/36560023 http://dx.doi.org/10.3390/s22249656 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Lee, Dong-seok Kwon, Soon-kak Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning |
title | Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning |
title_full | Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning |
title_fullStr | Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning |
title_full_unstemmed | Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning |
title_short | Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning |
title_sort | intra prediction method for depth video coding by block clustering through deep learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9787791/ https://www.ncbi.nlm.nih.gov/pubmed/36560023 http://dx.doi.org/10.3390/s22249656 |
work_keys_str_mv | AT leedongseok intrapredictionmethodfordepthvideocodingbyblockclusteringthroughdeeplearning AT kwonsoonkak intrapredictionmethodfordepthvideocodingbyblockclusteringthroughdeeplearning |
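
The description above says that block pixels are predicted from cluster representative values (the average of the reference pixels in each cluster) and that training uses a mean square error plus a penalty on outputs far from both ends. The following is a minimal NumPy sketch of those two steps, not the authors' implementation. It assumes exactly two clusters, a hard 0.5 threshold, a hypothetical `ref_probs` input giving cluster membership for the reference pixels, a penalty of the form p(1 - p), and a hypothetical `weight` parameter; none of these details are stated in the record.

```python
import numpy as np


def predict_block(block_probs, ref_pixels, ref_probs):
    """Predict a block from two-cluster membership probabilities.

    block_probs: (H, W) probability that each block pixel is in cluster 1.
    ref_pixels:  (N,) reference pixel values (top and left neighbours).
    ref_probs:   (N,) probability that each reference pixel is in cluster 1.
                 Hypothetical input: the record does not say how reference
                 pixels are assigned to clusters.
    """
    ref_pixels = np.asarray(ref_pixels, dtype=np.float64)
    in_c1 = np.asarray(ref_probs) >= 0.5  # assumed hard assignment

    # Representative value of a cluster: mean of its reference pixels.
    # Fall back to the overall mean if a cluster has no reference pixels.
    fallback = ref_pixels.mean()
    rep1 = ref_pixels[in_c1].mean() if in_c1.any() else fallback
    rep0 = ref_pixels[~in_c1].mean() if (~in_c1).any() else fallback

    # Each block pixel takes the representative value of its cluster.
    return np.where(np.asarray(block_probs) >= 0.5, rep1, rep0)


def training_loss(original, predicted, probs, weight=0.1):
    """MSE plus a penalty that is zero when every probability is 0 or 1.

    The p * (1 - p) form and the weight are assumptions for illustration.
    """
    mse = np.mean((np.asarray(original) - np.asarray(predicted)) ** 2)
    probs = np.asarray(probs, dtype=np.float64)
    penalty = np.mean(probs * (1.0 - probs))
    return mse + weight * penalty


# Usage: a 2x2 block whose left column lies on one depth surface and whose
# right column lies on another.
block_probs = np.array([[0.1, 0.9], [0.2, 0.7]])
pred = predict_block(block_probs, [10, 10, 50, 50], [0.1, 0.2, 0.9, 0.8])
```

The hard threshold is used here only to show the prediction rule; an actual training setup would need a differentiable formulation, which the record does not describe.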