Generative masked transformers have demonstrated remarkable success across various content generation tasks, primarily due to their ability to effectively model large-scale dataset distributions with high consistency. However, in the animation …
A novel framework for human interaction generation using collaborative masked modeling in the discrete space, which explicitly models spatio-temporal dependencies within and between the interacting individuals.
Human motion stylization aims to revise the style of an input motion while keeping its content unaltered. Unlike existing works that operate directly in pose space, we leverage the extit{latent space} of pretrained autoencoders as a more expressive …
Growing interests in RGB-D salient object detection (RGB-D SOD) have been witnessed in recent years, owing partly to the popularity of depth sensors and the rapid progress of deep learning techniques. Unfortunately, existing RGB-D SOD methods …