IEEE Transactions on Speech and Audio Processing

Issue 5 • Date Sep 1998

  • A structural model for binaural sound synthesis

    Publication Year: 1998, Page(s):476 - 488
    Cited by: Papers (60) | Patents (10)

    A structural model is presented for synthesizing binaural sound from a monaural source. The model produces well-controlled vertical as well as horizontal effects. The model is based on a simplified time-domain description of the physics of wave propagation and diffraction. The components of the model have a one-to-one correspondence with the physical sources of sound diffraction, delay, and reflec...
  • A multispan language modeling framework for large vocabulary speech recognition

    Publication Year: 1998, Page(s):456 - 467
    Cited by: Papers (29) | Patents (61)

    A new framework is proposed to construct multispan language models for large vocabulary speech recognition, by exploiting both local and global constraints present in the language. While statistical n-gram modeling can readily take local constraints into account, global constraints have been more difficult to handle within a data-driven formalism. In this work, they are captured via a paradigm fir...
  • An auditory-based distortion measure with application to concatenative speech synthesis

    Publication Year: 1998, Page(s):489 - 495
    Cited by: Papers (6)

    This study presents a new auditory-based distance measure with application to concatenative speech synthesis. This measure employs the Carney auditory model to produce a feature vector related to auditory perception. For concatenative synthesis, the new measure is employed to assess perceived discontinuities at segment transitions. Evaluations using a restricted data base environment show that the...
  • A hybrid model for text-to-speech synthesis

    Publication Year: 1998, Page(s):426 - 434
    Cited by: Papers (2) | Patents (2)

    This paper describes a hybrid model developed for high-quality, concatenation-based, text-to-speech synthesis. The speech signal is submitted to a pitch-synchronous analysis and decomposed into a harmonic component, with a variable maximum frequency, plus a noise component. The harmonic component is modeled as a sum of sinusoids with frequencies that are multiples of the pitch. The noise component...
  • A hybrid mono/stereo acoustic echo canceler

    Publication Year: 1998, Page(s):468 - 475
    Cited by: Papers (11) | Patents (20)

    In many applications, such as teleconferencing, multimedia workstations, televideo gaming, etc., stereo sound is already, or will soon be, implemented to give spatial realism that mono systems cannot offer. In such hands-free systems, stereophonic acoustic echo cancelers are absolutely necessary for full-duplex communication. We propose a new acoustic echo canceler (AEC) based on a fundamental exp...
  • HMM-based strategies for enhancement of speech signals embedded in nonstationary noise

    Publication Year: 1998, Page(s):445 - 455
    Cited by: Papers (70) | Patents (22)

    An improved hidden Markov model-based (HMM-based) speech enhancement system designed using the minimum mean square error principle is implemented and compared with a conventional spectral subtraction system. The improvements to the system are: (1) incorporation of mixture components in the HMM for noise in order to handle noise nonstationarity in a more flexible manner, (2) two efficient methods i...
  • A new phase model for sinusoidal transform coding of speech

    Publication Year: 1998, Page(s):495 - 501
    Cited by: Papers (7) | Patents (4)

    A phase modeling algorithm for sinusoidal analysis-synthesis of speech is presented, where short-time sinusoidal phases are approximated using a combination of linear prediction, spectral sampling, delay compensation, and phase correction techniques. The algorithm is different to phase compensation methods proposed for source-system LPC in that it has been tailored to sinusoidal representation of ...
  • Multistep coding of speech parameters for compression

    Publication Year: 1998, Page(s):435 - 444

    This paper presents specific new techniques for coding of speech representations and a new general approach to coding for compression that directly utilizes the multidimensional nature of the input data. Many methods of speech analysis yield a two-dimensional (2-D) pattern, with time as one of the dimensions. Various such speech representations, and power spectrum sequences in particular, are show...

Aims & Scope

Covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language.

This Transactions ceased publication in 2005. It continues under the new title IEEE/ACM Transactions on Audio, Speech, and Language Processing.