Disentangled Motion Modeling for Video Frame Interpolation

Bibliographic Details
Published in: arXiv.org (Dec 19, 2024), p. n/a
Main author: Lew, Jaihyun
Other authors: Choi, Jooyoung, Shin, Chaehun, Jung, Dahuin, Yoon, Sungroh
Published:
Cornell University Library, arXiv.org
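Subjects: Computing costs; Pixels; Smoothness; Frames (data processing); Optical flow (image analysis); Interpolation; Computational efficiency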
Online access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3072356345
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3072356345 
045 0 |b d20241219 
100 1 |a Lew, Jaihyun 
245 1 |a Disentangled Motion Modeling for Video Frame Interpolation 
260 |b Cornell University Library, arXiv.org  |c Dec 19, 2024 
513 |a Working Paper 
520 3 |a Video Frame Interpolation (VFI) aims to synthesize intermediate frames between existing frames to enhance visual smoothness and quality. Beyond the conventional methods based on the reconstruction loss, recent works have employed generative models for improved perceptual quality. However, they require complex training and large computational costs for pixel space modeling. In this paper, we introduce disentangled Motion Modeling (MoMo), a diffusion-based approach for VFI that enhances visual quality by focusing on intermediate motion modeling. We propose a disentangled two-stage training process. In the initial stage, frame synthesis and flow models are trained to generate accurate frames and flows optimal for synthesis. In the subsequent stage, we introduce a motion diffusion model, which incorporates our novel U-Net architecture specifically designed for optical flow, to generate bi-directional flows between frames. By learning the simpler low-frequency representation of motions, MoMo achieves superior perceptual quality with reduced computational demands compared to the generative modeling methods on the pixel space. MoMo surpasses state-of-the-art methods in perceptual metrics across various benchmarks, demonstrating its efficacy and efficiency in VFI. 
653 |a Computing costs 
653 |a Pixels 
653 |a Smoothness 
653 |a Frames (data processing) 
653 |a Optical flow (image analysis) 
653 |a Interpolation 
653 |a Computational efficiency 
700 1 |a Choi, Jooyoung 
700 1 |a Shin, Chaehun 
700 1 |a Jung, Dahuin 
700 1 |a Yoon, Sungroh 
773 0 |t arXiv.org  |g (Dec 19, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3072356345/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2406.17256
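
Illustrative note on the abstract (field 520): the abstract describes generating bi-directional flows between frames and then synthesizing the intermediate frame from them. The sketch below shows only the generic idea of flow-based interpolation: each input frame is backward-warped with a flow that points from the intermediate frame to that input, and the two results are blended. It is not the authors' method. MoMo uses a learned synthesis network and a motion diffusion model, neither of which is specified in this record. The function names `backward_warp` and `interpolate_frame`, the flow layout (x displacement first), and the linear blend are all assumptions made for illustration.

```python
import numpy as np


def backward_warp(frame, flow):
    """Sample `frame` at (x + u, y + v) using bilinear interpolation.

    frame: (H, W) or (H, W, C) float array.
    flow:  (H, W, 2) array; flow[..., 0] is the x displacement u and
           flow[..., 1] is the y displacement v (assumed layout).
    """
    h, w = frame.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp sampling positions to the image border.
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0
    wy = y - y0
    if frame.ndim == 3:
        # Broadcast the interpolation weights over the channel axis.
        wx = wx[..., None]
        wy = wy[..., None]
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom


def interpolate_frame(frame0, frame1, flow_t0, flow_t1, t=0.5):
    """Blend two backward-warped frames into an intermediate frame.

    flow_t0 / flow_t1 point from the intermediate frame to frame0 / frame1.
    The linear blend is a stand-in for a learned synthesis model.
    """
    warped0 = backward_warp(frame0, flow_t0)
    warped1 = backward_warp(frame1, flow_t1)
    return (1 - t) * warped0 + t * warped1


if __name__ == "__main__":
    # A bright pixel moves two columns to the right between the frames.
    frame0 = np.zeros((5, 5))
    frame1 = np.zeros((5, 5))
    frame0[2, 1] = 1.0
    frame1[2, 3] = 1.0
    flow_t0 = np.zeros((5, 5, 2))
    flow_t1 = np.zeros((5, 5, 2))
    flow_t0[..., 0] = -1.0  # look one pixel left in frame0
    flow_t1[..., 0] = 1.0   # look one pixel right in frame1
    mid = interpolate_frame(frame0, frame1, flow_t0, flow_t1)
    print(np.argwhere(mid > 0.5))
```

In this toy case the flows are given by hand. In a system like the one the abstract describes, they would be produced by a model, and the quality of the result would depend on how well those flows match the true motion.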