
Design of GPU Network-on-Chip for Real-Time Video Super-Resolution Reconstruction

Deep learning has a better output quality compared with traditional algorithms for video super-resolution (SR), but the network model needs large resources and has poor real-time performance. This paper focuses on solving the speed problem of SR; it achieves real-time SR by the collaborative design...


Bibliographic Details
Main Authors:	Peng, Zhiyong; Du, Jiang; Qiao, Yulong
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10223162/ https://www.ncbi.nlm.nih.gov/pubmed/37241678 http://dx.doi.org/10.3390/mi14051055

author	Peng, Zhiyong; Du, Jiang; Qiao, Yulong
collection	PubMed
description	Deep learning yields better output quality than traditional algorithms for video super-resolution (SR), but the network models require large resources and have poor real-time performance. This paper addresses the speed problem of SR: it achieves real-time SR through the collaborative design of a deep learning video SR algorithm and GPU parallel acceleration. An algorithm combining deep learning networks with a lookup table (LUT) is proposed for video SR, which ensures both SR quality and ease of GPU parallel acceleration. To ensure real-time performance, the computational efficiency of the GPU network-on-chip algorithm is improved by three major GPU optimization strategies: storage access optimization, conditional branching function optimization, and threading optimization. Finally, the network-on-chip was implemented on an RTX 3090 GPU, and the validity of the algorithm was demonstrated through ablation experiments. In addition, SR performance is compared with existing classical algorithms on standard datasets. The new algorithm was found to be more efficient than the SR-LUT algorithm. The average PSNR was 0.61 dB higher than that of the SR-LUT-V algorithm and 0.24 dB higher than that of the SR-LUT-S algorithm. The speed of SR on real video was also tested. For a real video with a resolution of [Formula: see text], the proposed GPU network-on-chip achieved a speed of 42 FPS. The new method is 9.1 times faster than the original SR-LUT-S fast method imported directly into the GPU for processing.
format	Online Article Text
id	pubmed-10223162
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10223162 2023-05-28 Design of GPU Network-on-Chip for Real-Time Video Super-Resolution Reconstruction. Peng, Zhiyong; Du, Jiang; Qiao, Yulong. Micromachines (Basel). Article. MDPI 2023-05-16 /pmc/articles/PMC10223162/ /pubmed/37241678 http://dx.doi.org/10.3390/mi14051055 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Design of GPU Network-on-Chip for Real-Time Video Super-Resolution Reconstruction
topic	Article

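The figures quoted in the description can be put in more concrete terms with a little arithmetic. The sketch below is not taken from the paper; it relies on two assumptions that the abstract does not confirm: that a difference in average PSNR can be read as a single mean-squared-error ratio at the same peak signal value, and that the 9.1x speed-up was measured on the same video as the 42 FPS figure. The function names are hypothetical.

```python
def mse_ratio_from_psnr_gain(gain_db: float) -> float:
    """MSE_new / MSE_old implied by a PSNR gain, given PSNR = 10*log10(MAX^2 / MSE)."""
    return 10.0 ** (-gain_db / 10.0)


def implied_baseline_fps(fps: float, speedup: float) -> float:
    """Frame rate of the slower method, ASSUMING the speed-up applies to the same video."""
    return fps / speedup


# Values quoted in the record's description field.
ratio_vs_lut_v = mse_ratio_from_psnr_gain(0.61)   # versus SR-LUT-V
ratio_vs_lut_s = mse_ratio_from_psnr_gain(0.24)   # versus SR-LUT-S
baseline_fps = implied_baseline_fps(42.0, 9.1)    # assumption-laden estimate
frame_budget_ms = 1000.0 / 42.0                   # time available per frame at 42 FPS

print(f"MSE ratio vs SR-LUT-V: {ratio_vs_lut_v:.3f}")
print(f"MSE ratio vs SR-LUT-S: {ratio_vs_lut_s:.3f}")
print(f"Implied baseline speed: {baseline_fps:.1f} FPS")
print(f"Per-frame budget at 42 FPS: {frame_budget_ms:.1f} ms")
```

Under those assumptions, a 0.61 dB gain corresponds to roughly 13% lower mean squared error, a 0.24 dB gain to roughly 5% lower, and the directly imported SR-LUT-S method would run at roughly 4.6 FPS.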