Real-time, two-way transmission of American Sign Language (ASL) video over cellular networks provides natural communication among members of the Deaf community. As a communication tool, compressed ASL video must be evaluated according to the intelligibility of the conversation, not according to conventional definitions of video quality. Guided by linguistic principles and human perception of ASL, this paper proposes a full-reference computational model of intelligibility for ASL (CIM-ASL) that is suitable for evaluating compressed ASL video. The CIM-ASL measures distortions only in regions relevant for ASL communication, using spatial and temporal pooling mechanisms that vary the contribution of distortions according to their relative impact on the intelligibility of the compressed video. The model is trained and evaluated using ground-truth experimental data collected in three separate studies. The CIM-ASL provides accurate estimates of subjective intelligibility and demonstrates statistically significant improvements over computational models traditionally used to estimate video quality. The CIM-ASL is incorporated into an H.264-compliant video coding framework, creating a closed-loop encoding system optimized explicitly for ASL intelligibility. The ASL-optimized encoder achieves bitrate reductions between 10% and 42%, without reducing intelligibility, when compared to a general-purpose H.264 encoder.
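The idea of measuring distortion only in ASL-relevant regions and then pooling it spatially and temporally can be illustrated with a minimal sketch. This is not the CIM-ASL itself: the region names (`face`, `hands`), the region weights, the use of mean squared error as the distortion measure, and the Minkowski temporal pooling are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Sequence

Frame = List[List[float]]  # grayscale frame: rows of pixel values
Mask = List[List[bool]]    # True where a pixel belongs to a region


def region_mse(ref: Frame, dist: Frame, mask: Mask) -> float:
    """Mean squared error over the pixels selected by mask (0.0 if none)."""
    total, count = 0.0, 0
    for ref_row, dist_row, mask_row in zip(ref, dist, mask):
        for r, d, m in zip(ref_row, dist_row, mask_row):
            if m:
                total += (r - d) ** 2
                count += 1
    return total / count if count else 0.0


def frame_distortion(ref: Frame, dist: Frame,
                     masks: Dict[str, Mask],
                     weights: Dict[str, float]) -> float:
    """Spatial pooling (assumed form): weighted average of per-region MSE.

    Pixels outside every mask do not contribute at all.
    """
    weight_sum = sum(weights[name] for name in masks)
    weighted = sum(weights[name] * region_mse(ref, dist, masks[name])
                   for name in masks)
    return weighted / weight_sum


def temporal_pool(per_frame: Sequence[float], p: float = 2.0) -> float:
    """Temporal pooling (assumed form): Minkowski mean over frames.

    With p > 1, frames with larger distortion dominate the pooled score.
    """
    return (sum(v ** p for v in per_frame) / len(per_frame)) ** (1.0 / p)


# Hypothetical 2x2 frames: one "face" pixel and one "hands" pixel.
reference = [[0.0, 0.0], [0.0, 0.0]]
compressed = [[2.0, 9.0], [9.0, 4.0]]
masks = {
    "face": [[True, False], [False, False]],
    "hands": [[False, False], [False, True]],
}
weights = {"face": 3.0, "hands": 1.0}  # illustrative weights, not from the paper

score = frame_distortion(reference, compressed, masks, weights)
pooled = temporal_pool([score, score])
```

In this toy input the two background pixels are heavily distorted but do not affect the score, which is the behavior the abstract describes for regions that are not relevant to ASL communication. A full-reference model in the paper's sense would additionally map the pooled distortion to an intelligibility estimate fitted to subjective data, which this sketch does not attempt.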