Multimedia Signal Processing, 1997, IEEE First Workshop on

Date 23-25 June 1997

  • Proceedings of First Signal Processing Society Workshop on Multimedia Signal Processing

    Publication Year: 1997
    Freely Available from IEEE
  • Author index

    Publication Year: 1997 , Page(s): 593 - 596
    Freely Available from IEEE
  • New fast motion estimation algorithm for video coding

    Publication Year: 1997 , Page(s): 201 - 206

    Block motion estimation using full search is computationally intensive. We present a new fast algorithm for block motion estimation that produces performance similar to that of full search but with computational time reduced to 7%. Experimental results show that the proposed algorithm is superior to TSS (three-step search) in both computational time and motion-vector accuracy.
  • Multimodal interaction in multimedia applications

    Publication Year: 1997 , Page(s): 25 - 30
    Cited by:  Papers (1)

    Multimedia utilizes many different channels to convey information to a user. For an optimum interaction these channels must be coordinated in a clear and informative way. Research has produced several rules-of-thumb which facilitate this process. The paper outlines two software applications, AudioScript and WebSpy, which use these guidelines to investigate multimodal integration in multimedia appli...
  • Frame rate and viseme analysis for multimedia applications

    Publication Year: 1997 , Page(s): 13 - 18
    Cited by:  Papers (6)  |  Patents (1)

    In the future, multimedia technology will be able to provide video frame rates equal to or better than 30 frames per second (FPS). Until that time the hearing impaired community will be using band limited communication systems over unshielded twisted pair copper wiring. As a result, multimedia communication systems will use a coder/decoder (CODEC) to compress the video and audio signals for transm...
  • Media integration in multimodal interfaces

    Publication Year: 1997 , Page(s): 31 - 36
    Cited by:  Papers (3)  |  Patents (2)

    Combining several modalities in the same interface requires certain characteristics from input and output devices and the ability to provide some specific information which is important at the technical level. Unfortunately, several of the current devices do not provide such information. The reason is simple: they have been designed keeping in mind that they will be used in an isolated way, not in...
  • Estimating sinusoidal parameters of musical tones based on global waveform fitting

    Publication Year: 1997 , Page(s): 95 - 100
    Cited by:  Papers (3)

    A novel approach to the analysis of musical tones using a quadratic-polynomial-phase sinusoids-plus-residual data model is presented. Taking advantage of the fact that musical signals are usually analyzed off-line, the proposed approach estimates the sinusoidal parameters of a musical tone such that its waveform fits the data model in a least-squares sense, and has obvious advantages over the exist...
  • Scalable QOS control for VBR video servers

    Publication Year: 1997 , Page(s): 570 - 575
    Cited by:  Papers (2)  |  Patents (2)

    This paper accompanies a demonstration of a variable bit-rate (VBR) video server with scalable quality-of-service (QoS) control. Multimedia applications dynamically demand different QoS grades from servers and networks during a session. Since video is a critical component of distributed multimedia applications, video servers have to be responsive to the applications' requirements and to the availa...
  • Perceptual watermarking of still images

    Publication Year: 1997 , Page(s): 363 - 368
    Cited by:  Papers (24)  |  Patents (3)

    Content providers on the Internet are faced with the problem of how to secure electronic data. This problem has generated research activity in the area of digital watermarking of electronic content. The challenge is to introduce a digital watermark that is both transparent and highly robust to common signal processing and possible attacks. The two basic requirements for an effective watermarking s...
  • Audio coding for conversion to MIDI

    Publication Year: 1997 , Page(s): 101 - 106
    Cited by:  Papers (1)

    We concentrate on the problem of converting audio samples into MIDI data, and describe a technique for processing music signals that attempts to extract note onsets and pitches, for a small class of music signals. Segmentation of the signals is based upon common onsets of tones and their proximity to edges in the signal. Pitch detection is performed by decomposing the short-time spectra of the seg...
  • Multi-criteria video segmentation for TV news

    Publication Year: 1997 , Page(s): 319 - 324
    Cited by:  Papers (5)  |  Patents (1)

    In the near future, personalized TV news programs should become a relevant method to access TV news. They will be produced by selecting and reordering “stories” contained in news programs. Automatic segmentation of news programs into stories is an important step toward this goal. We present a set of rules for such segmentation. Methods for implementing these rules are based on image and...
  • Dynamic search range motion estimation for video coding

    Publication Year: 1997 , Page(s): 207 - 212
    Cited by:  Papers (2)  |  Patents (1)

    A fast algorithm for motion estimation in interframe video coding is proposed in this paper. In contrast to previously proposed fast algorithms, which use a limited number of check points in a constant search range, the proposed algorithm performs the search in a dynamic search range. It provides better estimation accuracy than previously proposed fast algorithms that use limited search points. ...
  • A recursively structured solution for handwriting and speech recognition

    Publication Year: 1997 , Page(s): 587 - 592
    Cited by:  Papers (2)  |  Patents (1)

    This paper extends the basic theory of DAGs (Directed Acyclic Graphs) and their DAG-Compare operation to produce a recursive architecture for language recognition systems. Building upon theory and practical implementation, we treat the cases of multiple interacting levels of language recognition. We propose that the DAG data structure and its complementary comparison operation are a structural inducti...
  • SaFe: a general framework for integrated spatial and feature image search

    Publication Year: 1997 , Page(s): 301 - 306
    Cited by:  Papers (4)  |  Patents (5)

    We present a system for querying for images by the spatial and feature attributes of regions. The system enables the user to find the images that contain an arrangement of regions similar to that diagrammed in a query image. We propose a general framework which allows for different types of features (e.g., color, texture, shape, motion) to be integrated with spatial information in the query proces...
  • Development of a media server system based on the DAVIC standard

    Publication Year: 1997 , Page(s): 457 - 462

    Interactive multimedia systems with guaranteed service quality require complex integration of servers, broadband networks and terminals. In the future such systems might become common, even reaching individual homes, but this requires development of new standards in order to achieve interoperability comparable to the Internet. The task of devising standardized end-to-end solutions for interactive ...
  • Models for audiovisual fusion in a noisy-vowel recognition task

    Publication Year: 1997 , Page(s): 37 - 44

    This paper presents a comparison of four basic architectures dealing with audiovisual speech in a noisy-vowel recognition task. Provided with contextual input (signal-to-noise ratio), three of the four architectures respect the “synergy” criterion, which means that audiovisual (AV) recognition is better than audio-alone (A) or visual-alone (V) recognition, both in global terms and for each i...
  • An algorithm-hardware-system approach to VLIW multimedia processors

    Publication Year: 1997 , Page(s): 433 - 438
    Cited by:  Papers (1)

    A number of recently published DSPs and multimedia processors emphasize Very Long Instruction Word (VLIW) architectures to achieve the flexibility, processing power and high-level language programmability needed for future multimedia applications. In this paper we show that exclusive exploitation of instruction level parallelism decreases in efficiency as the degree of parallelism increases. This i...
  • Object-oriented framework for audio compression research

    Publication Year: 1997 , Page(s): 77 - 82

    We have developed an object-oriented software framework for development and experimentation in generic audio compression research. The framework is designed to minimize the time required to develop and evaluate new audio coding algorithms. The high-level requirements and benefits of this framework are discussed. In addition, a few framework designs that emerged as a result of taking an object ...
  • Acoustic driven viseme identification for face animation

    Publication Year: 1997 , Page(s): 7 - 12
    Cited by:  Papers (2)

    Unlike other image templates, visemes have identities in two different media. In the audio domain, they are often related to basic linguistic units such as phonemes. In the image domain, they are defined by the images of human articulators, such as mouth shapes, chin movements, etc. In this paper, an approach to extracting visemes from both image and acoustic domains is presented. In the image domain, the mou...
  • Significance-linked connected component analysis for high performance low bit rate wavelet coding

    Publication Year: 1997 , Page(s): 145 - 150
    Cited by:  Papers (6)

    Recent success in wavelet image coding is mainly attributed to the recognition of the importance of data organization and representation. Several very competitive wavelet coders have been developed, namely, embedded zerotree wavelets (EZW), morphological representation of wavelet data (MRWD), and set partitioning in hierarchical trees (SPIHT). In this paper, we develop a novel wavelet image code...
  • A hierarchical algorithm for image retrieval by sketch

    Publication Year: 1997 , Page(s): 564 - 569

    In this paper, we introduce a hierarchical algorithm for image retrieval by sketch. The application scenario is that the user inputs a rough sketch depicting the prominent edges or contours of objects and wishes to retrieve database images that have similar shapes. We can only expect to get a rough query sketch from the user, which is likely a distorted version of the intended database image, henc...
  • Using HMMs in audio-to-visual conversion

    Publication Year: 1997 , Page(s): 19 - 24
    Cited by:  Papers (2)

    One emerging application which exploits the correlation between audio and video is speech-driven facial animation. The goal of speech-driven facial animation is to synthesize realistic video sequences from acoustic speech. Much of the previous research has implemented this audio-to-visual conversion strategy with existing techniques such as vector quantization and neural networks. We examine how t...
  • Joint optimization of lattice vector quantizer and entropy coder for a Laplacian source

    Publication Year: 1997 , Page(s): 113 - 118

    This paper presents a joint optimization algorithm for lattice vector quantization (LVQ) and entropy coding for a Laplacian source at all ranges of bit rates. Entropy-constrained lattice vector quantizers (ECLVQs) are often used in practical coding systems. In order to develop an ECLVQ design algorithm, we derive estimation expressions for both distortion and entropy. From these estimations, we de...
  • A low cost multimedia tool for telemedicine network rapid design

    Publication Year: 1997 , Page(s): 525 - 530
    Cited by:  Patents (1)

    In a hospital, the improvement of patient care requires the use of reliable (fault-tolerant) multimedia communication technologies between the patient's bed and the nurse's room. Today's solutions are inflexible or expensive. To answer these needs, we propose MediaFlow, a flexible, inexpensive solution based on a versatile software/hardware tool.
  • Rate optimization by true motion estimation

    Publication Year: 1997 , Page(s): 187 - 194
    Cited by:  Papers (8)  |  Patents (2)

    We propose a rate-optimized motion estimation based on a “true” motion tracker. We observe that the piecewise continuous motion field reduces the bit rate for differentially encoded motion vectors. Hence, a neighborhood relaxation method is proposed. In addition, in current MPEG-4 video VM, each video-object-plane (VOP) is individually coded by a block-based approach. The bit rate can ...