We present an extensive analytical evaluation of a multimedia (text and speech) meeting browser that combines automatic speech recognition with a novel content indexing technique, called temporal mapping, to uncover contextual relations between audio and text segments in recorded remote meetings. Results show that even simple temporal mapping can effectively support browsing and retrieval of recorded audio segments, and can improve retrieval performance in situations where speech recognition alone would exhibit prohibitively high word error rates.
Published in: First International Workshop on Semantic Media Adaptation and Personalization (SMAP '06), 2006
Date of Conference: Dec. 2006