Generative Human Motion Stylization in Latent Space

Abstract

Human motion stylization aims to revise the style of an input motion while keeping its content unaltered. Unlike existing works that operate directly in pose space, we leverage the latent space of pretrained autoencoders as a more expressive and robust representation for style extraction and infusion. Building upon this, we present a novel generative model that produces diverse stylization results for a single motion (latent) code. During training, a motion code is decomposed into two components: a deterministic content code and a probabilistic style code adhering to a prior distribution; a generator then reconstructs the corresponding motion codes from random combinations of content and style codes. Our approach is versatile: the probabilistic style space can be learned from either style-labeled or unlabeled motions, which also affords notable flexibility in stylization. At inference, users can stylize a motion using style cues from a reference motion or a style label. Even in the absence of explicit style input, our model enables novel re-stylization by sampling from the unconditional style prior distribution. Experimental results show that our proposed stylization models, despite their lightweight design, outperform the state of the art in style reenactment, content preservation, and generalization across various applications and settings.
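To make the described decomposition concrete, below is a minimal PyTorch sketch of the idea: a motion latent code is split into a deterministic content code and a probabilistic style code regularized toward a Gaussian prior, and a generator reassembles a motion code from any (content, style) pair. This is an illustrative sketch, not the authors' implementation; the module names, dimensions, and the standard-normal prior with reparameterized sampling are assumptions.

```python
# Illustrative sketch of the content/style decomposition in latent space.
# All names and dimensions are hypothetical assumptions.
import torch
import torch.nn as nn


class LatentStylizer(nn.Module):
    """Splits a motion latent code into a deterministic content code and a
    probabilistic style code, then regenerates a motion code from any
    combination of the two."""

    def __init__(self, motion_dim=256, content_dim=128, style_dim=32):
        super().__init__()
        # Deterministic content branch.
        self.content_enc = nn.Sequential(
            nn.Linear(motion_dim, 256), nn.ReLU(), nn.Linear(256, content_dim)
        )
        # Probabilistic style branch: mean and log-variance of a Gaussian
        # posterior, regularized toward a standard-normal prior during training.
        self.style_enc = nn.Sequential(nn.Linear(motion_dim, 256), nn.ReLU())
        self.style_mu = nn.Linear(256, style_dim)
        self.style_logvar = nn.Linear(256, style_dim)
        # Generator fusing content and style back into a motion code.
        self.generator = nn.Sequential(
            nn.Linear(content_dim + style_dim, 256), nn.ReLU(),
            nn.Linear(256, motion_dim),
        )

    def encode(self, motion_code):
        content = self.content_enc(motion_code)
        h = self.style_enc(motion_code)
        mu, logvar = self.style_mu(h), self.style_logvar(h)
        # Reparameterization trick keeps style sampling differentiable.
        style = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return content, style, mu, logvar

    def forward(self, motion_code, style_code=None):
        content, style, mu, logvar = self.encode(motion_code)
        if style_code is not None:  # style cue taken from a reference motion
            style = style_code
        recon = self.generator(torch.cat([content, style], dim=-1))
        return recon, mu, logvar


# Usage: re-stylize a motion code with a style drawn from the unconditional prior.
model = LatentStylizer()
motion_code = torch.randn(1, 256)   # latent code from a pretrained autoencoder
prior_style = torch.randn(1, 32)    # sample from the style prior
stylized, _, _ = model(motion_code, style_code=prior_style)
```

Under this reading, supplying `style_code` from a reference motion gives exemplar-based stylization, while sampling it from the prior yields the diverse, unconditioned re-stylization described in the abstract.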

Publication
International Conference on Learning Representations, 2024
Yuxuan Mu, M.Sc. student
Xinxin Zuo, Postdoctoral Fellow
Li Cheng, Professor