Abstract:
Sign language is used for communication among deaf and hard-of-hearing people and between them and hearing people. It has its own grammatical system, which differs from that of spoken or written languages. A sign language sentence consists of glosses, which act as morphemes, and its meaning depends on body movements, hand movements, finger shapes, and facial expressions. Previous methods can translate sign language sentences into a sequence of glosses or into written language, but they handle out-of-vocabulary cases poorly, such as variations in word order. This is because these models treat a sign language sentence as an inseparable sequence, even though the sentence consists of multiple glosses. In this paper, we propose a method that uses transformers to predict the gloss for every video frame of Korean sign language. The predicted frame-by-frame gloss information can be used to transform a video of a sign language sentence into a gloss sequence, even if the model has not learned that pattern.
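One way to picture the last step, turning frame-by-frame gloss predictions into a gloss sequence, is a simple run-length collapse. The abstract does not specify how the paper performs this step, so the following is only a minimal sketch under stated assumptions: the function name `frames_to_glosses`, the `<blank>` label for frames without a gloss, the `min_run` noise threshold, and the English placeholder glosses are all hypothetical.

```python
from itertools import groupby


def frames_to_glosses(frame_labels, blank="<blank>", min_run=1):
    """Collapse per-frame gloss labels into an ordered gloss sequence.

    Assumptions (not taken from the paper): frames without a gloss carry
    the `blank` label, and runs shorter than `min_run` frames are noise.
    """
    sequence = []
    last = None  # last gloss emitted since the most recent blank run
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label == blank:
            last = None  # a blank run separates genuine repetitions
            continue
        if length < min_run:
            continue  # skip runs too short to be a real gloss
        if label != last:
            sequence.append(label)
            last = label
    return sequence


# Hypothetical per-frame predictions for one video.
frames = [
    "<blank>", "HELLO", "HELLO", "HELLO", "<blank>",
    "NAME", "NAME", "WHAT", "NAME", "NAME", "<blank>",
]
glosses = frames_to_glosses(frames, min_run=2)
```

Because the sequence is assembled from whatever runs appear in the frames, the gloss order is read directly off the video rather than matched against sentence patterns seen in training, which is the property the abstract emphasizes.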
Published in: 2021 International Conference on Information and Communication Technology Convergence (ICTC)
Date of Conference: 20-22 October 2021
Date Added to IEEE Xplore: 07 December 2021
ISBN Information:
Print on Demand(PoD) ISSN: 2162-1233