Drawing on knowledge of speech, language, and hearing, a new technology has arisen to serve users of complex information systems. This technology aims for a natural communication environment that captures the attributes humans favor in face-to-face exchange. Conversational interaction bears the central burden, with visual and manual signaling simultaneously supplementing the communication process. In addition to instrumenting sensors for each mode, the interface must incorporate context-aware algorithms for fusing and interpreting the multiple sensory channels. The ultimate objective is a reliable estimate of the user's intent, from which actionable responses can be made. Current research therefore addresses multimodal interfaces that can transcend the limitations of the mouse and keyboard. This report describes the early status of multimodal interfaces and identifies emerging opportunities for enhanced usability and naturalness. It concludes by advocating focused research on a frontier issue: the formulation of a quantitative language framework for multimodal communication.