Looking for the Signs: Identifying Isolated Sign Instances in Continuous Video Footage | IEEE Conference Publication | IEEE Xplore

Abstract:

In this paper, we focus on the task of one-shot sign spotting, i.e. given an example of an isolated sign (query), we want to identify whether and where this sign appears in a continuous, co-articulated sign language video (target). To achieve this goal, we propose a transformer-based network called Sign-Lookup. We employ 3D Convolutional Neural Networks (CNNs) to extract spatio-temporal representations from video clips. To resolve the temporal scale discrepancies between the query and target videos, we construct multiple queries from a single video clip using different frame-level strides. Self-attention is applied across these query clips to simulate a continuous scale space. We also apply a second self-attention module to the target video to learn the contextual information within the sequence. Finally, mutual attention is used to match the temporal scales and localize the query within the target sequence. Extensive experiments demonstrate that the proposed approach can not only reliably identify isolated signs in continuous videos, regardless of the signers' appearance, but can also generalize to different sign languages. By taking advantage of the attention mechanism and the adaptive features, our model achieves state-of-the-art performance on the sign spotting task, with accuracy as high as 96% on challenging benchmark datasets, significantly outperforming other approaches.
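
As a rough illustration of the multi-stride query construction described in the abstract, the sketch below sub-samples a single query clip at several frame-level strides. It is not the authors' implementation: the function name `multi_stride_queries`, the stride values (1, 2, 3), and the use of integer frame indices as stand-ins for video frames are all assumptions made for this example.

```python
def multi_stride_queries(frames, strides=(1, 2, 3)):
    """Return one sub-sampled copy of the query clip per stride.

    `frames` is any sequence of per-frame items. The stride values are
    hypothetical; the abstract does not state which strides are used.
    """
    return {s: list(frames[::s]) for s in strides}


# Stand-in for a 12-frame isolated-sign clip (indices instead of images).
query = list(range(12))
queries = multi_stride_queries(query)

for stride, clip in queries.items():
    # A larger stride yields a shorter clip covering the same time span,
    # which mimics the same sign performed at a faster pace.
    print(f"stride={stride}: {len(clip)} frames -> {clip}")
```

Each sub-sampled clip would then be passed through the feature extractor, so the attention modules see the same sign at several temporal scales.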
Date of Conference: 15-18 December 2021
Date Added to IEEE Xplore: 12 January 2022
Conference Location: Jodhpur, India

I. Introduction

Sign languages are the native languages of Deaf communities. They are languages in their own right, distinct from spoken languages but just as linguistically complex. Each country has its own sign language, often with regional variations. They incorporate manual (including handshape and motion) and non-manual (facial expression and body posture) channels, or articulators, which are combined via grammatical constructs that use both direction and space to convey meaning. As such, the grammar and word order of a sign language are very different from those of spoken language.
