【PAPER-introduction】VideoGPT

Personal study notes on VideoGPT.

paper: VideoGPT: Video Generation using VQ-VAE and Transformers
code: github

Background

Several types of deep generative models have shown great potential and made remarkable progress in the last few years.
High-fidelity natural video, however, is one notable modality that has not seen the same level of progress in generative modeling as images, audio, and text.
This gap is understandable, because video is high-dimensional data with spatio-temporal correlations.

Structure

Reference

Su Jianlin (苏剑林)'s blog 科学空间 (Scientific Spaces)