
Multimedia Signal Processing, 1997., IEEE First Workshop on

Date 23-25 June 1997

  • Proceedings of First Signal Processing Society Workshop on Multimedia Signal Processing

    Publication Year: 1997
    Freely Available from IEEE
  • Author index

    Publication Year: 1997, Page(s):593 - 596
    Freely Available from IEEE
  • Objective speech quality assessment of compounded digital telecommunication systems

    Publication Year: 1997, Page(s):137 - 142
    Cited by:  Papers (2)  |  Patents (2)

    Digital telecommunication networks involve multiple public switched telephone networks (PSTNs), cellular and mobile systems and, to some extent, satellite systems. Most of these networks contain non-linear speech coders which may degrade the overall end-to-end quality of speech. An important problem is how to assess the speech quality of such compounded systems. The object of this pape...

  • High performance CELP coder utilizing a novel adaptive forward-backward LPC quantization

    Publication Year: 1997, Page(s):131 - 136
    Cited by:  Patents (3)

    A highly efficient algorithm termed adaptive forward-backward vector quantization (AFBVQ) is developed for variable bit rate quantization of linear predictive coding (LPC) coefficients and integrated with the FS1016 Federal Standard Code Excited Linear Predictive (CELP) coder. This results in a high-performance, low-bit-rate speech coder called AFBVQ-CELP which brings in two-fold bit rate reduct...

  • Nonlinear adaptive prediction of nonstationary signals with application to speech coding

    Publication Year: 1997, Page(s):125 - 130

    The purpose of this contribution is to present a new approach for the prediction of speech signals that is appropriate to speech coding. The procedure is based upon the principles of blind equalisation. In an earlier publication we examined these principles from the prediction point of view as a general method. The present contribution examines the approach in relation to speech signal representat...

  • Frame rate and viseme analysis for multimedia applications

    Publication Year: 1997, Page(s):13 - 18
    Cited by:  Papers (6)  |  Patents (1)

    In the future, multimedia technology will be able to provide video frame rates equal to or better than 30 frames per second (FPS). Until that time the hearing impaired community will be using band limited communication systems over unshielded twisted pair copper wiring. As a result, multimedia communication systems will use a coder/decoder (CODEC) to compress the video and audio signals for transm...

  • A variable-rate CELP coder for fast remote voicemail retrieval using a notebook computer

    Publication Year: 1997, Page(s):119 - 124
    Cited by:  Papers (1)  |  Patents (21)

    Remote retrieval of compressed voicemail data over a telephone line is one of several emerging applications of speech coding. Using a notebook computer equipped with a modem, a user can remotely access a networked desktop unit located at their office or home to retrieve various types of information such as email, FAX, electronic documents as well as voicemail. By compressing the speech data, we re...

  • Acoustic driven viseme identification for face animation

    Publication Year: 1997, Page(s):7 - 12
    Cited by:  Papers (2)

    Unlike other image templates, visemes have identities in two different media. In the audio domain, they are often related to basic linguistic units such as phonemes. In the image domain, they are defined by the images of human articulators, such as mouth shapes, chin movements, etc. In this paper, an approach to extracting visemes from both image and acoustic domains is presented. In the image domain, the mou...

  • Joint optimization of lattice vector quantizer and entropy coder for a Laplacian source

    Publication Year: 1997, Page(s):113 - 118

    This paper presents a joint optimization algorithm for lattice vector quantization (LVQ) and entropy coding for a Laplacian source at all ranges of bit rates. Entropy-constrained lattice vector quantizers (ECLVQs) are often used in practical coding systems. In order to develop an ECLVQ design algorithm, we derive estimation expressions for both distortion and entropy. From these estimations, we de...

  • Highly realistic modeling of persons for 3D videoconferencing systems

    Publication Year: 1997, Page(s):286 - 291
    Cited by:  Papers (2)  |  Patents (2)

    This contribution describes the creation of highly realistic 3D models of participants for distributed 3D videoconferencing systems. These models consist of a flexible triangle mesh surrounding an interior skeleton structure, which is based on a simplified human skeleton. The vertices of the predefined mesh template are arranged in rigid rings along the bones of the skeleton. Using 3D data obtaine...

  • Multiple description image coding for noisy channels by pairing transform coefficients

    Publication Year: 1997, Page(s):419 - 424
    Cited by:  Papers (101)  |  Patents (9)

    Multiple description coding (MDC) is a way of trading off coding gain with robustness to channel errors. This paper presents a new method for MDC using the framework of transform coding. Instead of using the Karhunen-Loeve transform (KLT) that decorrelates all the coefficients, we choose the transform bases so that the coefficients are correlated pair-wise. This is accomplished by rotating every t...

  • Lip parameters extraction based on projection of raw images onto reference shapes

    Publication Year: 1997, Page(s):1 - 6

    This paper presents a method for the extraction of articulatory parameters from direct processing of raw images of the lips. After an overview of speechreading and the existing lipreading systems, a set of 23 phonetically labelled reference lip shapes, called visemes, is presented. Our system architecture can be seen as comprising three independent parts. First, a new greyscale mouth image is centre...

  • Asynchronous rate conversion

    Publication Year: 1997, Page(s):107 - 112
    Cited by:  Papers (1)  |  Patents (5)

    A new approach for sampling rate conversion of digitally sampled signals is presented. The approach enables conversion from any given source sampling rate to any desired target sampling rate even if the source and destination clocks are not synchronous. The importance of the algorithm is in a communication system where the sampling is done on one system and the playback is done on another; in such cas...

  • Block-based depth estimation from image triples with unrestricted camera setup

    Publication Year: 1997, Page(s):280 - 285
    Cited by:  Papers (11)

    A block-based depth estimation approach is presented that evaluates image triples captured with an unrestricted calibrated camera setup. From each image triple, three image pairs are formed which are evaluated independently after respective rectification. The two resulting depth maps per image, one with respect to each of the other images, are averaged according to their reliability. Compared to b...

  • Compact multimedia players with PC memory cards-silicon view and shopping navigation

    Publication Year: 1997, Page(s):451 - 456
    Cited by:  Patents (2)

    Two kinds of compact and motor-less multimedia players are developed employing PC memory cards as the storage device. Playback of VHS-quality video and compact-disc-quality sound is supported. MPEG-1 coding is incorporated for video and audio compression. Since the compressed information is stored in memory LSIs and moving mechanical components are not required, lightweight, shock proof and h...

  • On performance of the wavelet domain multirate transmission for wireless multimedia

    Publication Year: 1997, Page(s):413 - 418

    Inter-symbol interference (ISI) is often regarded as an important issue for high speed wireless applications. Because the fast fading characteristic combines with the ISI, channel equalization in a wireless system often requires a dedicated design for each specific application. Multi-carrier modulation (MCM), which converts a single high-rate symbol stream into parallel low-rate streams, prolongs ...

  • A hierarchical algorithm for image retrieval by sketch

    Publication Year: 1997, Page(s):564 - 569
    Cited by:  Papers (1)

    In this paper, we introduce a hierarchical algorithm for image retrieval by sketch. The application scenario is that the user inputs a rough sketch depicting the prominent edges or contours of objects and wishes to retrieve database images that have similar shapes. We can only expect to get a rough query sketch from the user, which is likely a distorted version of the intended database image, henc...

  • Region of interest priority coding for sign language videoconferencing

    Publication Year: 1997, Page(s):531 - 536
    Cited by:  Papers (7)  |  Patents (11)

    This paper investigates compression of sign-language video sequences for transmission over low bitrate channels. The motion of hands and arms inherent in sign language requires high temporal resolution and greater compression for band-limited channels. Most existing compression schemes treat the entire image uniformly. However, the background is much less critical than the foreground for sign lang...

  • Audio coding for conversion to MIDI

    Publication Year: 1997, Page(s):101 - 106
    Cited by:  Papers (1)

    We concentrate on the problem of converting audio samples into MIDI data, and describe a technique for processing music signals that attempts to extract note onsets and pitches, for a small class of music signals. Segmentation of the signals is based upon common onsets of tones and their proximity to edges in the signal. Pitch detection is performed by decomposing the short-time spectra of the seg...

  • Animated talking head with personalized 3D head model

    Publication Year: 1997, Page(s):274 - 279
    Cited by:  Papers (5)

    A natural human-computer interface requires integration of realistic audio and visual information for perception and display. An example of such an interface is an animated talking head displayed on the computer screen in the form of a human-like computer agent. This system converts text to acoustic speech with synchronized animation of mouth movements. The talking head is based on a generic 3D huma...

  • Packet delays in multiplexed video streams

    Publication Year: 1997, Page(s):395 - 400

    A model for multiplexed video sources and the resulting packet delay distribution for a fluid buffer is given. A measurement driven Markov chain based traffic model for a variable bit rate (VBR) source is used in the analysis. A characterization of the queueing delays of a multiplexed stream comprised of statistically identical sources is undertaken. It is shown that inadequate spectral content in...

  • Real time full-duplex H.263 video codec system

    Publication Year: 1997, Page(s):445 - 450
    Cited by:  Papers (5)  |  Patents (1)

    A real-time H.263 video codec system is presented in this paper. The TMS320C80, which consists of four parallel DSPs, is used in the system. A few strategies are used to reduce the computational complexity and hence increase the frame rate. The parallel implementation of the H.263 algorithm on the parallel DSPs is described and the performance of the real-time codec is discussed.

  • Visual text reader for virtual image communication on networks

    Publication Year: 1997, Page(s):495 - 500
    Cited by:  Patents (16)

    This paper presents a media conversion system from text to video, which can be used as a virtual image communication tool over narrow-band networks. The proposed system analyzes plain text, such as e-mail, and generates a video sequence of a human bust shot which includes actions and facial expressions related to the contents. The voice sounds are also generated using a text-to-speech system. B...

  • New fast motion estimation algorithm for video coding

    Publication Year: 1997, Page(s):201 - 206

    Block motion estimation using full search is computationally intensive. We present a new fast algorithm for block motion estimation that produces performance similar to that of full search but with computational time reduced to 7%. In the experimental results, the proposed algorithm is superior to TSS in both computational time and accuracy of motion vectors.

  • Optimal reduced pyramid interpolation for lossless and progressive image coding

    Publication Year: 1997, Page(s):151 - 156
    Cited by:  Papers (6)

    Reduced pyramids, including in particular pyramids without analysis filters, are known to produce excellent results when used for lossless signal and image compression. This paper presents a methodology for the optimal construction of such pyramids by selecting the interpolation synthesis post-filters so as to minimize the error variance at each level of the pyramid. This establishes optimal...
