Abstract:
Summary form only given. "Multimodal" refers to the different senses (visual, audio, tactile, etc.) used in human-computer interfaces. "Multimedia" refers to the different ways of representing information (text, graphics, audio, images, video, etc.). A signal processing, analysis, or understanding task is called multimedia/multimodal if it involves two or more modalities or media interacting in nontrivial ways. We give a range of examples of multimedia/multimodal signal processing, analysis, and understanding, including audio/visual speech recognition and audio/visual emotion recognition. We also present a stable and robust facial movement tracking algorithm that is used in both tasks.
Date of Conference: 21-24 March 2004
Date Added to IEEE Xplore: 27 September 2004
Print ISBN: 0-7803-8379-6