Diffusion Probabilistic Modeling for Video Generation

Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose...
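
The full description in this record states that the model generates each future frame by correcting a deterministic next-frame prediction with a stochastic residual produced by an inverse diffusion process. The following Python sketch illustrates only that sampling loop and is not the authors' implementation. Everything in it is an assumption made for illustration: frames are flat lists of floats, `predict_next_frame` and `predict_noise` are hypothetical stand-ins for trained networks, and the linear noise schedule with variance `beta_t` at each reverse step is one common DDPM choice rather than something this record specifies.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def predict_next_frame(context):
    """Hypothetical deterministic predictor: simply repeats the last frame."""
    return list(context[-1])


def predict_noise(noisy_residual, t, context):
    """Hypothetical noise estimator. A trained network conditioned on the
    context would go here; this placeholder returns zeros so the loop runs."""
    return [0.0 for _ in noisy_residual]


def sample_residual(context, size, betas, rng):
    """Draw a residual by running DDPM-style ancestral sampling in reverse."""
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    # Start from pure Gaussian noise at the last diffusion step.
    x = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for t in reversed(range(len(betas))):
        eps = predict_noise(x, t, context)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # assumed reverse-step std deviation
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x


def generate_video(context_frames, num_new_frames, num_steps=10, seed=0):
    """Autoregressively extend a video: prediction plus sampled residual."""
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    frames = [list(f) for f in context_frames]
    for _ in range(num_new_frames):
        prediction = predict_next_frame(frames)
        residual = sample_residual(frames, len(prediction), betas, rng)
        # The new frame is fed back as context for the next iteration.
        frames.append([p + r for p, r in zip(prediction, residual)])
    return frames[len(context_frames):]
```

Calling `generate_video([[0.0, 0.1], [0.1, 0.2]], num_new_frames=3)` returns three new two-pixel frames. With the placeholder noise estimator the residuals are essentially noise, so the sketch shows the control flow (predict, sample a residual, add, feed back) rather than the quality of any trained model.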

Bibliographic Details
Main Authors: Yang, Ruihan; Srivastava, Prakhar; Mandt, Stephan
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606505/
https://www.ncbi.nlm.nih.gov/pubmed/37895590
http://dx.doi.org/10.3390/e25101469
_version_ 1785127332366778368
author Yang, Ruihan
Srivastava, Prakhar
Mandt, Stephan
author_facet Yang, Ruihan
Srivastava, Prakhar
Mandt, Stephan
author_sort Yang, Ruihan
collection PubMed
description Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame prediction using a stochastic residual generated by an inverse diffusion process. We compare this approach against six baselines on four datasets involving natural and simulation-based videos. We find significant improvements in terms of perceptual quality and probabilistic frame forecasting ability for all datasets.
format Online
Article
Text
id pubmed-10606505
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10606505 2023-10-28 Diffusion Probabilistic Modeling for Video Generation Yang, Ruihan Srivastava, Prakhar Mandt, Stephan Entropy (Basel) Article Denoising diffusion probabilistic models are a promising new class of generative models that mark a milestone in high-quality image generation. This paper showcases their ability to sequentially generate video, surpassing prior methods in perceptual and probabilistic forecasting metrics. We propose an autoregressive, end-to-end optimized video diffusion model inspired by recent advances in neural video compression. The model successively generates future frames by correcting a deterministic next-frame prediction using a stochastic residual generated by an inverse diffusion process. We compare this approach against six baselines on four datasets involving natural and simulation-based videos. We find significant improvements in terms of perceptual quality and probabilistic frame forecasting ability for all datasets. MDPI 2023-10-20 /pmc/articles/PMC10606505/ /pubmed/37895590 http://dx.doi.org/10.3390/e25101469 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Yang, Ruihan
Srivastava, Prakhar
Mandt, Stephan
Diffusion Probabilistic Modeling for Video Generation
title Diffusion Probabilistic Modeling for Video Generation
title_full Diffusion Probabilistic Modeling for Video Generation
title_fullStr Diffusion Probabilistic Modeling for Video Generation
title_full_unstemmed Diffusion Probabilistic Modeling for Video Generation
title_short Diffusion Probabilistic Modeling for Video Generation
title_sort diffusion probabilistic modeling for video generation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606505/
https://www.ncbi.nlm.nih.gov/pubmed/37895590
http://dx.doi.org/10.3390/e25101469
work_keys_str_mv AT yangruihan diffusionprobabilisticmodelingforvideogeneration
AT srivastavaprakhar diffusionprobabilisticmodelingforvideogeneration
AT mandtstephan diffusionprobabilisticmodelingforvideogeneration