Speech-driven full-body animation using deep learning
Dnr: NAISS 2024/6-138
Type: NAISS Medium Storage
Principal Investigator: Christopher Peters
Affiliation: Kungliga Tekniska högskolan
Start Date: 2024-04-29
End Date: 2025-05-01
Primary Classification: 10207: Computer Vision and Robotics (Autonomous Systems)
Secondary Classification: 10208: Language Technology (Computational Linguistics)
Tertiary Classification: 10102: Geometry


Abstract

Automatically generating coherent gestures for 3D avatars, so that the gestures reinforce the meaning of what is being said, is an essential task in the animation community. Well-generated gestures visually enliven a conversation and appropriately emphasize the spoken words; automatic speech-driven gestures and facial expressions thus complement human speech in communication. Previous research focuses on generating full-body animations that are temporally in sync with the speech, but the generated animation does not correspond strongly to the emotional context of the driving speech. In [1], we present speech-driven 3D facial animation with accurate lip-sync and user-driven emotion control. In [2], we present an automatic method for generating emotional 3D body gestures from emotional monologue speech, based on generative AI. However, motion-capture (MOCAP) recordings of real humans remain more expressive and convey emotional context better than these methods. We intend to further investigate deep learning methods to bring the generated gesticulation closer to the ground-truth MOCAP data, and to extend this research to speech-driven multi-human interaction scenarios.

[1] Daněček, R., Chhatre, K., Tripathi, S., Wen, Y., Black, M. J., & Bolkart, T. (2023). Emotional Speech-Driven Animation with Content-Emotion Disentanglement. arXiv, abs/2306.08990.
[2] Chhatre, K. et al. (2023). Anonymous work under review, on 3D human gesture generation from emotional monologue speech.

This work is a collaboration between KTH (SE), the Max Planck Institute in Tübingen (DE), and Google Research in Zurich (CH).
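As a rough illustration of the general technique the abstract describes (speech-driven, emotion-conditioned gesture generation), the minimal PyTorch sketch below maps per-frame audio features to per-frame body-pose parameters, conditioned on a discrete emotion label. All names, feature dimensions, and the emotion vocabulary here are hypothetical placeholders and do not reflect the actual architectures of [1] or [2].

    import torch
    import torch.nn as nn

    class SpeechToGestureSketch(nn.Module):
        # Hypothetical simplification: a recurrent encoder over audio frames,
        # with an additive emotion embedding and a linear pose head.
        def __init__(self, n_audio=80, n_pose=165, n_emotions=8, hidden=256):
            super().__init__()
            self.emotion_emb = nn.Embedding(n_emotions, hidden)
            self.encoder = nn.GRU(n_audio, hidden, num_layers=2, batch_first=True)
            self.head = nn.Linear(hidden, n_pose)

        def forward(self, audio, emotion):
            # audio: (batch, frames, n_audio), e.g. log-mel spectrogram frames
            # emotion: (batch,) integer emotion labels
            h, _ = self.encoder(audio)                      # (batch, frames, hidden)
            h = h + self.emotion_emb(emotion).unsqueeze(1)  # broadcast over time
            return self.head(h)                             # (batch, frames, n_pose)

    model = SpeechToGestureSketch()
    mel = torch.randn(2, 120, 80)   # two clips, 120 audio frames, 80 mel bins
    emo = torch.tensor([0, 3])      # hypothetical labels, e.g. neutral and angry
    poses = model(mel, emo)         # (2, 120, 165) pose parameters per frame

In practice, a model of this kind would be trained with a reconstruction loss against MOCAP pose sequences; the actual models in [1] and [2] are generative and considerably more elaborate.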