TFS: Revisiting Temporal Language Grounding from Frequency Spiking Perspective

Abstract:

Temporal Language Grounding (TLG) aims to localize moments in untrimmed videos that are most relevant to natural language queries. While existing weakly-supervised methods have achieved significant success in exploring cross-modal relationships, they still face a critical bottleneck: the interference of task-irrelevant information in query embeddings. To address this issue, we propose TLG Frequency Spiking (TFS), a dimensional mask derived from the frequency domain that models the varying importance specific to different queries. By enhancing the understanding of queries, TFS effectively optimizes the cross-modal alignment of visual and textual modalities. Experimental results show that TFS significantly outperforms state-of-the-art baselines on both the Charades-STA and ActivityNet-Captions datasets.

Published in: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Date of Conference: 06-11 April 2025

Date Added to IEEE Xplore: 07 March 2025

DOI: 10.1109/ICASSP49660.2025.10889584

Conference Location: Hyderabad, India
