1. INTRODUCTION
Unlike spoken communication among hearing people, sign language plays an irreplaceable role in the deaf community: it conveys meaning visually through sequences of coherent gestural movements and facial expressions. Computer vision-based continuous Sign Language Recognition (SLR) extracts visual features from the raw input video and recognizes the corresponding sign language glosses [1], [2]. Deep learning, the de facto tool for such tasks, allows multi-layer networks fed with raw or lightly preprocessed data to automatically discover the representations needed for recognition, and it has proven highly effective on images, video, speech, and audio [3]. It is therefore unsurprising that Deep Neural Networks (DNNs) have brought dramatic breakthroughs in continuous SLR [4].
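To make the pipeline sketched above concrete, the following is a minimal illustrative sketch, not the method proposed in this paper, of one common continuous SLR design from the literature: a per-frame CNN feature extractor, a bidirectional LSTM temporal model, and CTC loss for alignment-free gloss recognition. All layer sizes, the class count, and the input shapes are assumptions chosen for brevity.

```python
# A minimal sketch (not the authors' method) of a typical continuous SLR
# pipeline: 2D-CNN frame encoder -> BiLSTM temporal model -> CTC loss.
# All sizes are illustrative assumptions, not values from this paper.
import torch
import torch.nn as nn

class ContinuousSLR(nn.Module):
    def __init__(self, num_glosses: int, hidden: int = 256):
        super().__init__()
        # Per-frame visual feature extractor (stand-in for a deeper CNN).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Bidirectional LSTM models the temporal structure of the sign stream.
        self.rnn = nn.LSTM(64, hidden, batch_first=True, bidirectional=True)
        # One extra output class for the CTC blank symbol (index 0).
        self.head = nn.Linear(2 * hidden, num_glosses + 1)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        out, _ = self.rnn(feats)
        # Per-frame log-probabilities over glosses, as expected by CTC.
        return self.head(out).log_softmax(dim=-1)

# CTC aligns frame-level predictions with the unsegmented gloss sequence.
model = ContinuousSLR(num_glosses=100)
video = torch.randn(2, 16, 3, 64, 64)        # two dummy clips of 16 frames
log_probs = model(video).permute(1, 0, 2)    # (time, batch, classes) for CTC
targets = torch.randint(1, 101, (2, 5))      # dummy gloss label sequences
loss = nn.CTCLoss(blank=0)(
    log_probs,
    targets,
    torch.full((2,), 16),                    # input (frame) lengths
    torch.full((2,), 5),                     # target (gloss) lengths
)
```

Because CTC marginalizes over all monotonic alignments between frames and glosses, such models can be trained from sentence-level gloss annotations alone, without frame-level segmentation labels.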