Abstract:
In multimodal human-computer interaction, generating co-speech gestures is crucial for enhancing interaction naturalness and user experience. However, achieving synchronized and natural gesture sequences remains a significant challenge due to the complexity of modeling temporal dependencies across different modalities. Existing methods often rely on simple concatenation, which limits how effectively multimodal information can be exploited. To address this issue, we propose XDGesture, a diffusion-based framework that integrates a Cross-Modal Fusion module and xLSTM. The Cross-Modal Fusion module efficiently merges information from different modalities, providing the model with rich contextual conditions. Meanwhile, xLSTM, with its enhanced memory structure and exponential gating mechanism, processes the fused multimodal data, capturing long-range dependencies between speech and gestures. This enables the generation of high-quality gesture sequences that are naturally synchronized with speech. Experimental results demonstrate that XDGesture markedly outperforms existing baselines on multiple datasets, particularly in gesture quality, naturalness, and synchronization with speech.
Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 06-11 April 2025
Date Added to IEEE Xplore: 07 March 2025
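The abstract describes the architecture only at a high level: speech-derived features are merged by a Cross-Modal Fusion module, and the fused sequence conditions an xLSTM-driven diffusion denoiser that predicts gesture poses. The paper text here includes no code, so the sketch below is a minimal, hypothetical PyTorch illustration of that conditioning pipeline. All module names, feature dimensions, the cross-attention fusion, and the use of a standard nn.LSTM as a stand-in for xLSTM are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CrossModalFusion(nn.Module):
    """Hypothetical fusion of two speech-side modalities via cross-attention."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, audio: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # Audio frames attend to text tokens; residual + norm keeps the fused
        # sequence aligned frame-by-frame with the gesture timeline.
        fused, _ = self.attn(query=audio, key=text, value=text)
        return self.norm(audio + fused)


class GestureDenoiser(nn.Module):
    """Predicts the noise added to a gesture sequence, given fused speech features.

    A plain nn.LSTM stands in for xLSTM here; the paper's xLSTM adds exponential
    gating and an enhanced memory structure on top of this recurrent backbone.
    """

    def __init__(self, pose_dim: int = 165, cond_dim: int = 256, hidden: int = 512):
        super().__init__()
        self.in_proj = nn.Linear(pose_dim + cond_dim + 1, hidden)  # +1 for the timestep
        self.rnn = nn.LSTM(hidden, hidden, num_layers=2, batch_first=True)
        self.out_proj = nn.Linear(hidden, pose_dim)

    def forward(self, noisy_pose, cond, t):
        # Broadcast the (normalized) diffusion step over the sequence length.
        t_feat = t.view(-1, 1, 1).expand(-1, noisy_pose.size(1), 1)
        x = torch.cat([noisy_pose, cond, t_feat], dim=-1)
        h, _ = self.rnn(self.in_proj(x))
        return self.out_proj(h)


# Toy forward pass with made-up shapes: 2 clips, 120 gesture frames, 40 text tokens.
audio = torch.randn(2, 120, 256)
text = torch.randn(2, 40, 256)
noisy_pose = torch.randn(2, 120, 165)
t = torch.rand(2)  # diffusion timestep in [0, 1]

fusion = CrossModalFusion()
denoiser = GestureDenoiser()
eps_hat = denoiser(noisy_pose, fusion(audio, text), t)
print(eps_hat.shape)  # torch.Size([2, 120, 165])
```

In a full diffusion training loop, eps_hat would be regressed against the injected noise with an MSE loss at randomly sampled timesteps, and the stand-in LSTM would be replaced by xLSTM blocks to obtain the long-range speech-gesture dependencies the abstract attributes to the exponential gating and enhanced memory.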