2002 IEEE Workshop on Multimedia Signal Processing.

9-11 Dec. 2002

  • Proceedings of 2002 IEEE Workshop on Multimedia Signal Processing (Cat. No.02TH8661)

    Publication Year: 2002
    Freely Available from IEEE
  • Author index

    Publication Year: 2002, Page(s):469 - 471
    Freely Available from IEEE
  • Wipe effect detection for video sequences

    Publication Year: 2002, Page(s):161 - 164
    Cited by:  Papers (7)

    The design of automatic tools to allow content-based analysis, browsing, and retrieval is of paramount importance due to the wide spread of multimedia databases and to the enormous amount of information they contain. In this paper we present an algorithm tailored to the detection of editing effects such as wipes, which are widely used in television and movie production to emphasize scene changes....
  • Video genre verification using both acoustic and visual modes

    Publication Year: 2002, Page(s):157 - 160
    Cited by:  Papers (9)

    This paper reports on the verification of the video genre: sport, cartoon, news, commercial and music. Results for the two modes, acoustic and visual, and for combined modes show an average equal error rate (EER) of 16%, 15% and 10%, respectively. These reflect verification accuracy and as such are believed to be the first of their kind; previously published work has focused on closed set identifi...
  • Recovery of lost VQ indexes in packet transmission

    Publication Year: 2002, Page(s):65 - 68
    Cited by:  Papers (1)

    We consider the problem of robust transmission of VQ-coded image/video via noisy packet networks. In the event of packet loss, some VQ index bits will be absent at the receiver side. But the very knowledge of lost packets identifies the spatial locations of affected VQ blocks, which is powerful information for the decoder to estimate the missing VQ index bits. This is possible because of the stat...
  • An open source development tool for anthropomorphic dialog agent: face image synthesis and lip synchronization

    Publication Year: 2002, Page(s):272 - 275
    Cited by:  Papers (1)

    We describe the design and report the development of an open-source software toolkit for building an easily customizable anthropomorphic dialog agent. This toolkit consists of four modules for multi-modal dialog integration, speech recognition, speech synthesis, and face image synthesis. In this paper, we focus on the construction of an agent's face image synthesis.
  • Musical query-by-description as a multiclass learning problem

    Publication Year: 2002, Page(s):153 - 156
    Cited by:  Papers (9)  |  Patents (4)

    We present the query-by-description (QBD) component of "Kandem", a time-aware music retrieval system. The QBD system we describe learns a relation between descriptive text concerning a musical artist and their actual acoustic output, making such queries as "Play me something loud with an electronic beat" possible by merely analyzing the audio content of a database. We show a novel machine learning...
  • Adaptive shape-texture intra coding refreshment for error resilient object-based video

    Publication Year: 2002, Page(s):113 - 116

    Video encoders may use several techniques to improve error resilience. In particular, for video encoders that rely on predictive (inter) coding to remove temporal redundancy, intra coding refreshment is especially useful to stop error propagation when errors occur in the transmission or storage of the coded streams, which can cause the decoded quality to decay very rapidly. In object-based video c...
  • Reverse link analysis of cellular packet data networks with multiple receive antennas

    Publication Year: 2002, Page(s):400 - 403

    The reverse link of a wireless packet data system with a varying number of users is considered. A time scale separation approximation is used to justify the analysis of this system based on a processor sharing model, and to compute the tradeoff between the offered load and the throughput seen by a typical user. When multiple receive antennas are available at the base station, it is shown that simp...
  • Stream-weighted HMM for audio-visual ASR: a study on connected digit recognition

    Publication Year: 2002, Page(s):1 - 4

    We present some new results on connected digit recognition in noisy environments by audio-visual speech recognition. We derive hybrid (geometric- and appearance-based) visual lip features using a real-time lip-tracking algorithm that we proposed previously. Using a single-speaker corpus modeled after the TIDIGITS database, we build whole-word HMMs using both single-stream and 2-stream modeling str...
  • Unequal error protection of embedded multimedia objects for packet-erasure channels

    Publication Year: 2002, Page(s):61 - 64
    Cited by:  Papers (2)

    The application of forward-error-correcting codes to data organized as multiple, independent multimedia objects and encoded with modern embedded coders is investigated. Capitalizing on the strict importance-ordering characteristics of embedded encodings, the strength of the error protection is optimized such that data that is more important to the reconstructed quality of the dataset is assigned stronger pr...
  • Noise robust hands-free speech recognition using microphone array and Kalman filter as front-end system of conversational TV

    Publication Year: 2002, Page(s):268 - 271

    In this paper, we investigate hands-free speech recognition as the front-end system of a conversational TV. A conversational TV is a machine conversation system that retrieves information of interest in response to spoken inquiries to the TV. To realize natural machine conversation without the user being conscious of a microphone, hands-free speech recognition is required. In the hands-free speech recognition system, t...
  • A new hybrid error concealment scheme for MPEG-2 video transmission

    Publication Year: 2002, Page(s):29 - 32
    Cited by:  Papers (6)

    For entropy-coded MPEG-2 video frames, a transmission error will not only affect the underlying codeword but also may affect subsequent codewords, resulting in a great degradation of the received video frames. In this study, transmission errors in MPEG-2 video frames are first detected and located by the error detection scheme proposed by Shyu and Leou [1999], and then the corrupted blocks are con...
  • Entropy- and complexity-constrained classified quantizer design for distributed image classification

    Publication Year: 2002, Page(s):77 - 80
    Cited by:  Papers (2)

    In this paper, we address the issue of feature encoding for distributed image classification systems. Such systems often extract a set of features such as color, texture and shape from the raw multimedia data automatically and store them as content descriptors. This content-based metadata supports a wider variety of queries than text-based metadata and thus provides a promising approach for effici...
  • Realtime object extraction and tracking with an active camera using image mosaics

    Publication Year: 2002, Page(s):149 - 152
    Cited by:  Papers (3)

    Moving object extraction plays a key role in applications such as object-based videoconferencing and surveillance. The difficulties of moving object segmentation lie in the fact that physical objects are normally not homogeneous with respect to low-level features, and it is usually hard to segment them accurately and efficiently. Object segmentation based on prestored background information h...
  • Comparing the quality of multiple descriptions of multimedia documents

    Publication Year: 2002, Page(s):241 - 244
    Cited by:  Papers (1)

    With the definition of the MPEG-7 standard, thanks to its interoperability, it is now possible for applications to use content descriptions of the same document coming from different sources. This implies that the overall information available at the application can be highly redundant and mechanisms for filtering the information are hence required. In this work, a general approach to de...
  • Comparison of residual compression methods in motion compensated video

    Publication Year: 2002, Page(s):109 - 112
    Cited by:  Papers (3)  |  Patents (19)

    In this paper, we compare the objective performance of several algorithms for coding motion compensated residuals, including the wavelet transform, matching pursuits and an improved embedded DCT-based coding method. A motion-compensated prediction-based approach with overlapped block matching compensation (OBMC) is used in all systems in the evaluation. The results show that matching pursuits outp...
  • Signal combining techniques for video watermarking extraction

    Publication Year: 2002, Page(s):347 - 350
    Cited by:  Patents (3)

    This paper analyses the effect of signal combining techniques in video watermark extraction. A spread-spectrum-like discrete cosine transform domain (DCT domain) watermarking technique is used as the embedding method, together with common error correction codes (BCH, Reed-Solomon with multilevel signaling, and binary convolutional codes with Viterbi decoding). Besides an analytical evaluation of the s...
  • A DSP based MPEG-2 video decoder for HDTV or multichannel SDTV

    Publication Year: 2002, Page(s):134 - 137
    Cited by:  Papers (1)  |  Patents (1)

    This paper describes DSP-based MPEG-2 video decoding software. The proposed decoder is able to reconstruct with full quality, in real-time, a sequence in HDTV format (corresponding to a subset of the MP@HL configuration) or up to three SDTV sequences (corresponding to the MP@ML configuration). The developed implementation is based on a single DSP, reducing the cost and enabling an easy upgrading o...
  • Eyeball Video Communications Platform

    Publication Year: 2002, Page(s):396 - 399

    Eyeball Video Communications Platform (VCP) provides a comprehensive solution for video communications, instant messaging, remote collaboration and application development. Eyeball VCP supports one-to-one and many-to-many video communications and collaboration utilizing peer-to-peer data transport without employing any reflector service. This structure is not only cost effective but also provides ...
  • Video personalization and summarization system

    Publication Year: 2002, Page(s):424 - 427
    Cited by:  Papers (9)  |  Patents (4)

    A video personalization and summarization system is designed and implemented to dynamically generate a personalized video summary. The personalization system adopts the three-tier server-middleware-client architecture in order to select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. These semant...
  • Algorithm for summarization and key extraction in athletic video

    Publication Year: 2002, Page(s):229 - 232
    Cited by:  Papers (1)

    In this paper, we present an effective framework for feature extraction from an athletic sport sequence. The extracted features are the start and finish of the race and the type of competition. Our approach is based on camera movement detection and it processes MPEG-2 video sequences.
  • Aerial communications using piano, clarinet, and bells

    Publication Year: 2002, Page(s):460 - 463
    Cited by:  Papers (2)

    This work explores novel mechanisms for aerial acoustic machine-machine communications. It builds on previous work by some of the authors, as well as others. In this paper we describe aerial acoustic communication systems that sound like musical instruments. The sound primitives come from simple models for the sound of the piano, the clarinet, and the bells. The messages are coded by combining the...
  • The distributed Karhunen-Loeve transform

    Publication Year: 2002, Page(s):57 - 60
    Cited by:  Papers (19)  |  Patents (2)

    The Karhunen-Loeve transform is a key element of many signal processing tasks, including classification and compression. In this paper, we consider distributed signal processing scenarios with limited communication between correlated sources, and we investigate a distributed Karhunen-Loeve transform (KLT). In particular, a partial (where only a subset of sources are observed) and a conditional KLT...
  • Characterization of abrupt/gradual video shot transitions as unsmoothed/smoothed singularity

    Publication Year: 2002, Page(s):202 - 205

    The multitude of high-level video operations demands sophisticated low-level video techniques. Detection (or segmentation) of video shot transitions (or boundaries) is one of the crucial low-level operations towards automatic video indexing, video editing, video abstracting or preview, and so on. Since content-based multimedia processing has been the focus of MPEG-7, we shall develop a scheme to pr...