PhonemeNet: A Transformer Pipeline for Text-Driven Facial Animation
A transformer pipeline for text-driven facial animation exploiting phoneme-level speech structure, achieving real-time performance and best-in-class lip synchronization accuracy. …
A full-pipeline platform for interactive AI character experiences, demonstrated through Digital Einstein and deployed at scientific conferences, technology events, and public …
SIGGRAPH Asia 2024 Emerging Technologies demonstration describing the physical installation and AI integration of Digital Einstein at the Tokyo venue.
EmoSpaceTime decouples emotion and content in 3D speech animation through contrastive learning, enabling fine-grained control over emotional expressivity independent of spoken …
A multimodal dialog act classifier integrating text and acoustic features for real-time classification in conversations with digital characters. CUI 2024.