
2002 IEEE Workshop on Multimedia Signal Processing.

9-11 Dec. 2002


Displaying Results 1 - 25 of 116
  • Proceedings of 2002 IEEE Workshop on Multimedia Signal Processing (Cat. No.02TH8661)

    Publication Year: 2002
    Freely Available from IEEE
  • Author index

    Publication Year: 2002, Page(s):469 - 471
    Freely Available from IEEE
  • Adaptive shape-texture intra coding refreshment for error resilient object-based video

    Publication Year: 2002, Page(s):113 - 116

    Video encoders may use several techniques to improve error resilience. In particular, for video encoders that rely on predictive (inter) coding to remove temporal redundancy, intra coding refreshment is especially useful to stop error propagation when errors occur in the transmission or storage of the coded streams, which can cause the decoded quality to decay very rapidly. In object-based video c...

  • Realtime object extraction and tracking with an active camera using image mosaics

    Publication Year: 2002, Page(s):149 - 152
    Cited by:  Papers (3)

    Moving object extraction plays a key role in applications such as object-based videoconferencing and surveillance. The difficulty of moving object segmentation lies in the fact that physical objects are normally not homogeneous with respect to low-level features, so it is usually hard to segment them accurately and efficiently. Object segmentation based on prestored background information h...

  • Comparison of residual compression methods in motion compensated video

    Publication Year: 2002, Page(s):109 - 112
    Cited by:  Papers (3)  |  Patents (19)

    In this paper, we compare the objective performance of several algorithms for coding motion compensated residuals, including the wavelet transform, matching pursuits and an improved embedded DCT-based coding method. A motion-compensated prediction-based approach with overlapped block matching compensation (OBMC) is used in all systems in the evaluation. The results show that matching pursuits outp...

  • Face recognition by incremental learning for robotic interaction

    Publication Year: 2002, Page(s):280 - 283
    Cited by:  Papers (2)  |  Patents (2)

    One of the important features for human-robot interaction is the ability to recognize human faces. This paper presents a novel architecture suitable for real-time robotic face recognition by learning a person's face incrementally. The Gabor features at respective feature locations of a face are used to derive a similarity measurement. Face tracking followed by a clustering technique is used to l...

  • Texturing and line art rendering using patch-based image analogies

    Publication Year: 2002, Page(s):142 - 148

    We present a simple patch-based matching scheme for generating novel visual appearance in which a new image is synthesized by the optimal pasting of small patches of an input sample texture image. First, we use this patch-based matching method as an efficient and simple texture synthesis algorithm to produce a wide range of textures with superb visual appearance. Second, we extend the algorithm for ...

  • Factorization with missing data for 3D structure recovery

    Publication Year: 2002, Page(s):105 - 108
    Cited by:  Papers (1)

    Matrix factorization methods are now widely used to recover 3D structure from 2D projections [C. Tomasi and T. Kanade. International Journal of Computer Vision, 9(2), 1992]. In practice, the observation matrix to be factored out has missing data, due to the limited field of view and the occlusions that occur in real video sequences. In opposition to the optimality of the SVD to factor out mat...

  • Recent progress in spontaneous speech recognition and understanding

    Publication Year: 2002, Page(s):253 - 258
    Cited by:  Papers (9)

    How to recognize and understand spontaneous speech is one of the most important issues in state-of-the-art speech recognition technology. In this context, a five-year large scale national project entitled "Spontaneous speech: corpus and processing technology" started in Japan in 1999. This paper gives an overview of the project and reports on the major results of experiments that have been conduct...

  • Speaker recognition using least squares IOHMMs

    Publication Year: 2002, Page(s):276 - 279

    The purpose of speaker recognition is to determine a speaker's identity from his/her speech utterances. Every speaker has his/her own physiological as well as behavioral characteristics embedded in his/her speech utterances. These characteristics can be extracted from utterances and statistically modeled. Through pattern recognition of unseen test speech with statistically trained models, a sp...

  • Conceptual interface and memory-modeling for real-time image processing systems

    Publication Year: 2002, Page(s):138 - 141
    Cited by:  Papers (1)  |  Patents (2)

    Most operations invoked in video processing systems are neighborhood oriented. For a video system designer, this limited spatio-temporal collection of pixels represents a natural abstraction. In this paper, we present a basic set of object-oriented design entities, which can be combined to capture an interface and memory model at a conceptual level, with the neighborhood as an abstractio...

  • Detecting corrupted intra macroblocks in H.263 video

    Publication Year: 2002, Page(s):33 - 36
    Cited by:  Papers (5)  |  Patents (5)

    Corrupted low-frequency data of intra coded macroblocks can significantly degrade the quality of video in error-prone wireless networks. Therefore, a new method for detecting the corrupted blocks is presented. The method exploits temporal smoothness of video by computing the absolute difference between subsequent video frames. A threshold function is used to highlight the block differences, and a heur...

  • Context based coding of quantized alpha planes for video objects

    Publication Year: 2002, Page(s):101 - 104
    Cited by:  Papers (4)  |  Patents (1)

    In object based video, each frame is a composition of objects that are coded separately. The composition is performed through the alpha plane that represents the transparency of the object. We present an alternative to MPEG-4 for coding of alpha planes that considers their specific properties. Comparisons in terms of rate and distortion are provided, showing that the proposed coding scheme for sti...

  • Similarity matching of continuous melody contours for humming querying of melody databases

    Publication Year: 2002, Page(s):249 - 252

    Music query-by-humming is a challenging problem since the humming query inevitably contains much variation and inaccuracy. In this paper, we present a novel melody similarity matching technique, which is based on continuous melody contour. We introduce a contour alignment technique, which addresses the robustness and efficiency issues. We also present a new melody similarity metric, which is perfo...

  • An open source development tool for anthropomorphic dialog agent: face image synthesis and lip synchronization

    Publication Year: 2002, Page(s):272 - 275
    Cited by:  Papers (1)

    We describe the design and report the development of an open-source software toolkit for building an easily customizable anthropomorphic dialog agent. This toolkit consists of four modules for multi-modal dialog integration, speech recognition, speech synthesis, and face image synthesis. In this paper, we focus on the construction of an agent's face image synthesis.

  • Streaming agent for wired network/wireless link rate-mismatch environment

    Publication Year: 2002, Page(s):388 - 391
    Cited by:  Patents (2)

    It has been shown that an agent located at the junction of wired and wireless links can help streaming media systems identify where packet losses occur and therefore maintain proper end-to-end congestion control. In this paper, we further expand the functionality of such agents in two ways. First, they allow streaming servers to identify the allowed transmission rate in both the wired and wireless...

  • Content enhancement for e-learning lecture video using foreground/background separation

    Publication Year: 2002, Page(s):436 - 439
    Cited by:  Papers (2)  |  Patents (1)

    With the widespread adoption of e-learning in many universities, lectures are sometimes distributed online in the form of real-time streaming videos. In these videos, the contents of the chalkboard become the main source of study material for the students. However, the videos are usually compressed to minimize the bitrate for online streaming, and the process blurs the contents on the chalkboard, ma...

  • A DSP based MPEG-2 video decoder for HDTV or multichannel SDTV

    Publication Year: 2002, Page(s):134 - 137
    Cited by:  Papers (1)  |  Patents (1)

    This paper describes DSP-based MPEG-2 video decoding software. The proposed decoder is able to reconstruct with full quality, in real time, a sequence in HDTV format (corresponding to a subset of the MP@HL configuration) or up to three SDTV sequences (corresponding to the MP@ML configuration). The developed implementation is based on a single DSP, reducing the cost and enabling an easy upgrading o...

  • Recovery of lost VQ indexes in packet transmission

    Publication Year: 2002, Page(s):65 - 68
    Cited by:  Papers (1)

    We consider the problem of robust transmission of VQ-coded image/video via noisy packet networks. In the event of packet loss, some VQ index bits will be absent at the receiver side. But the very knowledge of lost packets identifies the spatial locations of affected VQ blocks, which is powerful information for the decoder to estimate the missing VQ index bits. This is possible because of the stat...

  • Watermarking for 3D NURBS graphic data

    Publication Year: 2002, Page(s):304 - 307
    Cited by:  Papers (3)  |  Patents (3)

    In this paper, two watermarking algorithms for nonuniform rational B-spline (NURBS) data are proposed. One is suitable for steganography, and the other for watermarking. Neither algorithm embeds data directly into the parameters of NURBS; instead, both embed it into the 2D virtual images extracted from the sampled points of the 3D model. As a result, the proposed algorithm for steganography preserves the data size of the...

  • Two novel schemes for opportunistic multi-access

    Publication Year: 2002, Page(s):412 - 415
    Cited by:  Papers (3)

    We study opportunistic multiuser communications, and propose two novel schemes to address scheduling in asymmetric channels and admission control in such systems, respectively. We first devise a relay-aided multiuser diversity (RAMD) scheme, in which a user can choose to communicate with the base station either directly or using relay transmission. We show that the RAMD scheme performs significant...

  • A new hybrid error concealment scheme for MPEG-2 video transmission

    Publication Year: 2002, Page(s):29 - 32
    Cited by:  Papers (6)

    For entropy-coded MPEG-2 video frames, a transmission error will not only affect the underlying codeword but also may affect subsequent codewords, resulting in a great degradation of the received video frames. In this study, transmission errors in MPEG-2 video frames are first detected and located by the error detection scheme proposed by Shyu and Leou [1999], and then the corrupted blocks are con...

  • 3D rigid structure from video: what are "easy" shapes and "good" motions?

    Publication Year: 2002, Page(s):97 - 100

    Factorization algorithms are increasingly popular for recovering 3D rigid structure from video. In this paper, we analyze the rank 1 factorization algorithm to determine which 3D shapes are most suitable and which 3D motions are best for recovering the 3D structure from the 2D trajectories of the features. We show that the shape is best retrieved from orthogonal views aligned with the longest and smallest axe...

  • Video indexing and retrieval based on recognized text

    Publication Year: 2002, Page(s):245 - 248
    Cited by:  Papers (2)

    In this paper we present our experiments on text-based video indexing and retrieval. Due to expected OCR errors and the lack of semantic breadth in video text, we proposed two solutions: 1) expanding the semantics of the query word, and 2) using Glimpse to perform approximate matching instead of exact matching. The results we achieved showed that semantic expansion and Glimpse can play important r...

  • Noise robust hands-free speech recognition using microphone array and Kalman filter as front-end system of conversational TV

    Publication Year: 2002, Page(s):268 - 271

    In this paper, we investigate hands-free speech recognition as a front-end system for conversational TV. Conversational TV is a machine conversation system that retrieves information of interest in response to spoken inquiries to the TV. To realize natural machine conversation in which the user is not conscious of the microphone, hands-free speech recognition is required. In the hands-free speech recognition system, t...
