Multimedia Signal Processing, 1999 IEEE 3rd Workshop on

Date 13-15 Sept. 1999

  • 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No.99TH8451)

    Publication Year: 1999
    Freely Available from IEEE
  • Author index

    Publication Year: 1999, Page(s): 697 - 700
    Freely Available from IEEE
  • An approach towards mapping quality of perception to quality of service in multimedia communications

    Publication Year: 1999, Page(s): 497 - 502
  • A probability based model for subjective quality evaluation of MPEG-2 coded sequences

    Publication Year: 1999, Page(s): 521 - 525
  • Visual quality assessment using a contrast gain control model

    Publication Year: 1999, Page(s): 527 - 532
    Cited by: Patents (4)
  • Development of an avatar language chatting system and its communication experiment using a Japanese communications satellite

    Publication Year: 1999, Page(s): 605 - 610
    Cited by: Papers (1)

    We propose a chatting system that uses avatar language to communicate across linguistic barriers. We developed a CG system for generating CG animations of words from avatar language. A communication experiment was performed using a Japanese communications satellite to confirm the validity of the developed chatting system.
  • Robust speech recognition for mobile applications

    Publication Year: 1999, Page(s): 227 - 232

    This paper proposes a robust, speaker-independent, connected digit recognition system for mobile applications. The system needs only a small amount of ROM and has a low computational cost while maintaining high recognition accuracy. In addition, it can be efficiently implemented on most currently available 32-bit fixed-point DSP chips. To reach these goals, we combined robust speech parameter processing technologies with dual matrix quantization (MQ) and vector quantization (VQ) pairs, which supply discrete gender-dependent HMMs to increase the performance of the HMMs. The dual MQ/VQ pairs exploit the “evolution” of the speech short-term spectral envelopes, with one pair providing error compensation using LSP mean compensated coefficients. In a car noise environment, the system attains an 80% average connected digit recognition accuracy around 10 dB. A digit accuracy of 93% is obtained at 5 dB.
  • Architecture for a Windows NT wireless LAN multimedia terminal

    Publication Year: 1999, Page(s): 535 - 540

    Support for multimedia services in wireless access networks is challenging to implement because the wireless medium is unreliable and bandwidth-limited. In addition, widely used applications and protocols generally lack support for quality-of-service (QoS) parameters. This paper presents the architecture of a multimedia wireless LAN terminal. The terminal is implemented using a custom network demonstrator platform connected to a Windows NT workstation. The system provides a separate management plane for configuring the service parameters of a proprietary wireless medium access control (MAC) protocol. Native applications that can access the MAC QoS parameters directly are also supported.
  • A visual interactive environment for image retrieval by subjective parameters

    Publication Year: 1999, Page(s): 559 - 564
    Cited by: Papers (1)

    Recently, in designing computing systems, much focus has been given to supporting the user's subjectivity (Kansei in Japanese) by providing the system with a mostly explicit user model. We argue against this explicit nature of the model, since there is evidence in neuroscience that the human brain does not have monolithic control and static internal models. We suggest that the user model should rather emerge from a prolonged interaction between the user and the system. We describe our interactive visual environment dedicated to image retrieval based on subjective parameters. It is endowed with a learning agent and an active interface allowing multi-model interaction between the user and the system. This interaction takes the form of symbols, examples or externalization of internal processes.
  • Fast 3D modeling from video

    Publication Year: 1999, Page(s): 289 - 294
    Cited by: Papers (1)

    We build 3D models of rigid bodies from video sequences. The algorithm we use is simple and robust. It recovers the 3D shape parameters and the 3D motion parameters by first estimating the parameters of the induced optical flow representation. To estimate the 3D shape and 3D motion from the optical flow, we use a fast algorithm based on the factorization of a matrix that is rank 1 in a noiseless situation. We demonstrate our approach with a piecewise planar object shape built from a real-life video clip. We highlight some of the potential applications of the 3D models obtained.
  • MPEG encoding algorithm with scene adaptive dynamic GOP structure

    Publication Year: 1999, Page(s): 297 - 302
    Cited by: Papers (5) | Patents (3)

    Although the GOP (group of pictures) length is often fixed in MPEG, this may not guarantee the best picture quality from the point of view of coding efficiency. Furthermore, it is also well known that a P-picture interval of M=3 does not provide the best quality for all sequences. We propose an MPEG encoding algorithm with a scene adaptive dynamic GOP structure to enhance the picture quality in the cases mentioned above. Using macroblock characteristics such as luminance activity and simple motion compensation on the activity domain, the N and M values are dynamically determined.
  • Echo and noise reduction methods for multimedia communication systems

    Publication Year: 1999, Page(s): 239 - 244

    New concepts for echo cancellation and for the reduction of non-stationary noise affecting audio signals transmitted in telecommunication channels are proposed in the paper. In both cases, methods originating from the artificial intelligence domain (genetic algorithms, neural networks and rough sets) are applied. Moreover, the noise reduction method exploits some features of the human auditory system. A number of experiments have been carried out, and a brief discussion of some of them is included.
  • Design and implementation of IMAT (Internet Multimedia Authoring Tool) using a unified spatio-temporal relationship model

    Publication Year: 1999, Page(s): 617 - 622

    In this paper, we examine the problems of conventional multimedia data models, which deal mainly with temporal relationships among the component media of multimedia data, and propose a unified spatio-temporal relationship model for representing and presenting multimedia data dynamically. We then design and implement IMAT (Internet Multimedia Authoring Tool) based on the proposed model. IMAT is developed with Java technologies, so it can be ported to and run on any platform on which a JVM runs.
  • Embedded lossless wavelet-based image coding algorithm with successive partitioning and hybrid bit scanning

    Publication Year: 1999, Page(s): 383 - 388
    Cited by: Papers (1)

    A new embedded lossless wavelet-based image coding algorithm, called the successive partition zero coder (SPZC), which uses hybrid bit scanning, is proposed. By successively partitioning the wavelet coefficients in the space-frequency domain and applying non-causal adaptive context modeling, SPZC outperforms other state-of-the-art coders such as SPIHT, CREW and LJPEG in terms of coding efficiency, even without zerotree analysis.
  • Progressive browsing of 3D models

    Publication Year: 1999, Page(s): 71 - 76
    Cited by: Papers (2)

    High-speed desktop computers, capable of rendering complex 3D models, are becoming more commonplace, resulting in increasing interest in virtual reality systems. Virtual Reality Modelling Language (VRML) has become the de facto standard for the description of virtual worlds over the Internet, yet the current version (VRML97) lacks capabilities for progressive transfer. This paper describes the implementation of a browser that allows the progressive transfer and simultaneous viewing of 3D models, stored in a progressive format on a Web server, using a codec based on MPEG-4 verification model source code. The authors then implement their own codec that is progressive per-vertex rather than per-level. This approach also has the benefit that it does not alter the vertex coordinates. Finally, a watermarking scheme based on the second codec is proposed.
  • Preprocessing tool for compressed video editing

    Publication Year: 1999, Page(s): 283 - 288

    This paper presents a simple and effective preprocessing method developed for editing compressed video sequences. The proposed method involves extracting information about different video segments from the compressed bitstream. The algorithm is not designed to distinguish among types of segments but rather to indicate their position and duration. Since no decoding of the bitstream is done, the computational load of the algorithm is very low. Although the experimental results are shown on MPEG compressed video sequences, the algorithm can easily be applied to MJPEG or MPEG4 sequences given the header information.
  • A fractals-inspired approach to data embedding in digital images for authentication services

    Publication Year: 1999, Page(s): 565 - 566

    The aim of this demonstration is to present the current performance of our research and development watermarking software for owner and image authentication. The proposed illustrations cover a large set of original images (in grey levels and colors), watermarks and attacks. Evaluation is performed according to ratio, visibility and robustness.
  • WebMessenger: a new framework to produce multimedia content by combining synthesized speech and moving pictures in the WWW environment

    Publication Year: 1999, Page(s): 611 - 616
    Cited by: Papers (1)

    To facilitate multimedia content production for Internet distribution, this paper proposes a new framework for generating multimedia content. The advantages of the framework are a minimized amount of transmitted data and easy-to-update multimedia content. The key points are (1) to produce speech messages with a text-to-speech (TTS) system and (2) to produce moving pictures by concatenating templates from a set of moving pictures. To support the proposed framework, a speech design tool and a movie assign tool are introduced. Three kinds of content have been produced using the proposed framework and informally evaluated. The important results are: (1) the transmitted data of WebMessenger is 120 times smaller than the full movie version of the content, (2) the quality of speech synthesized by TTS is not sufficient and a speech design tool is necessary, (3) WebMessenger is 50% cheaper than human speech recording, and (4) speech synthesis combining manual and automatic methods is effective.
  • An approach towards mapping quality of perception to quality of service in multimedia communications

    Publication Year: 1999, Page(s): 497 - 501

    An approach to the design of user-centred communications protocols is presented. Quality of perception (QoP) is a novel term which includes not only users' enjoyment and satisfaction with a multimedia presentation, but also their ability to assimilate its informational content. We show that a relation of proportionality can be obtained between QoP and low-level quality of service (QoS) parameters if the multimedia presentation content is taken into account. This relation could provide the basis for a preferential QoS parameter management scheme which is also user-oriented.
  • Efficient use of content-based retrieval in quick browsing: a realization

    Publication Year: 1999, Page(s): 553 - 558

    Due to the recent growth of interest in multimedia applications, an increasing demand has emerged for efficient storage, management and browsing in multimedia databases. This has made the content-based retrieval (CBR) concept very popular during the past decade. However, it has been shown in practice that CBR tools very rarely respond successfully to queries from inexperienced users. For this reason, queries in commercial systems are supported by annotations or manual indexing. In this work, content-based query, retrieval and indexing capabilities have been combined with an intelligent agent framework over a simple database architecture. The proposed system is based on the ideas presented in (Xirouhakis et al., 1998), which are implemented and further extended in this work.
  • Automatic music score recognition/play system based on decision based neural network

    Publication Year: 1999, Page(s): 183 - 184
    Cited by: Papers (1) | Patents (23)

    This paper proposes an automatic music score recognition system based on a hierarchically structured decision based neural network (DBNN), which can classify patterns with nonlinear decision boundaries. Currently, this system yields around a 97% recognition rate for printed music scores.
  • Distributed processing of multimedia extended documents

    Publication Year: 1999, Page(s): 623 - 628

    In this paper we present a new approach to the processing of multimedia documents. We use intelligence derived from media and processing dependencies and distribute the processing according to the logical progression of the media processing tasks. To achieve this, we devised a management strategy which takes as input a user-defined problem with pre-coded processing tools and maps it to distributable tasks with associated resource requirements. In the last step of the processing, the available processing nodes, which vary in available resources, are matched to the processes. An example demonstrating the capabilities of this system is presented.
  • Software MPEG-2 video decoder on a DSP Enhanced Memory Module

    Publication Year: 1999, Page(s): 661 - 666

    This paper describes our effort in implementing a software MPEG-2 video decoder using the TI 'C6201 DSP and Basava Technology. The hardware device, the DSP Enhanced Memory Module (DSP-MM), leverages the high computation performance of the 'C6201 DSP, as well as the high-bandwidth memory access and efficient memory usage of Basava Technology. A prototype with a 'C6201 DSP and 32 MB of SDRAM shared memory is functioning in a PC environment running the Windows 95 operating system. We describe the video decoder in detail.
  • Robust coding of images using EBCOT and RVLC

    Publication Year: 1999, Page(s): 395 - 400
    Cited by: Papers (1)

    EBCOT (embedded block coding with optimized truncation) is an efficient image coding technique adopted by JPEG-2000 VM. EBCOT divides each subband into independently coded 64×64 blocks, and is therefore inherently more robust to errors than many other wavelet-based schemes. However, loss of data in any lower frequency subband block in EBCOT can still damage the image beyond repair. This work discusses the use of reversible variable length codes (RVLC) for the coding of lower frequency subbands in EBCOT, instead of arithmetic codes. It is demonstrated that RVLCs demand a very low bit-rate overhead.
  • Hardware implementation of four-step genetic search algorithm

    Publication Year: 1999, Page(s): 643 - 648
    Cited by: Papers (3)

    The genetic algorithm (GA) has been applied to the block matching algorithm (BMA) and has demonstrated its capability there. The four-step genetic search (4GS) was proposed recently (So and Wu, 1998). The mean square error (MSE) performance of 4GS is close to that of full search (FS), while its computational cost is close to that of the well-known three-step search (3SS). A realization of 4GS can be applied in video encoding hardware. Practical issues in implementing 4GS on an FPGA are discussed. Since an FPGA is a reconfigurable device, the configuration of the 4GS module can be changed as the frame size changes.
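
    The last abstract above measures 4GS against the three-step search (3SS), so a rough illustration of that baseline may help. The following is a minimal sketch of a generic 3SS block matcher, not the paper's 4GS and not its FPGA design. The function names, the 8×8 block size, the sum-of-absolute-differences cost and the list-of-lists frame layout are all assumptions made for this example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def three_step_search(cur, ref, bx, by, n=8, step=4):
    """Return ((dx, dy), cost) for the block at (bx, by).

    Starting from zero displacement, each round tests the 3 x 3 grid of
    candidates spaced `step` pixels around the current best vector, keeps
    the cheapest one, then halves the step (4, 2, 1 by default).
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # centre of this round's search grid
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                mx, my = cx + dx, cy + dy
                # Skip candidates whose block would fall outside the frame.
                if bx + mx < 0 or by + my < 0:
                    continue
                if bx + mx + n > width or by + my + n > height:
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                if cost < best_cost:
                    best, best_cost = (mx, my), cost
        step //= 2
    return best, best_cost
```

    Calling `three_step_search(cur, ref, 12, 12)` on two frames of equal size returns the displacement it settled on and the matching cost. With the default steps it evaluates at most 28 candidates per block (the start point plus three rounds of nine), whereas a full search over the same ±7 window tests 225. Because it follows a coarse-to-fine path, 3SS can stop at a local minimum that a full search would avoid.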