IEEE Transactions on Speech and Audio Processing

Issue 2 • Date March 2004

  • Table of contents

    Publication Year: 2004, Page(s): c1
    Freely Available from IEEE
  • IEEE Transactions on Speech and Audio Processing publication information

    Publication Year: 2004, Page(s): c2
    Freely Available from IEEE
  • Linear predictive method for improved spectral modeling of lower frequencies of speech with small prediction orders

    Publication Year: 2004, Page(s):93 - 99
    Cited by:  Papers (8)

    An all-pole modeling technique, Linear Prediction with Low-frequency Emphasis (LPLE), which emphasizes the lower frequency range of the input signal, is presented. The method is based on first interpreting conventional linear predictive (LP) analyses of successive prediction orders with parallel structures using the concept of symmetric linear prediction. In these implementations, symmetric linear...
  • Singing voice identification using spectral envelope estimation

    Publication Year: 2004, Page(s):100 - 109
    Cited by:  Papers (18)

    In this paper, we present a spectrum-based system for singer identification that operates for the ideal case in which audio samples contain only the singer's voice. Our method begins with the computation of a robust estimate of the spectral envelope called the composite transfer function (CTF). The CTF is derived from the instantaneous amplitude and frequency of the sinusoidal partials which make ...
  • Audio modeling based on delayed sinusoids

    Publication Year: 2004, Page(s):110 - 120
    Cited by:  Papers (19)

    In this work, we present an evolution of the Damped and Delayed Sinusoidal (DDS) model introduced within the framework of general signal modeling. This model, named the Partial Damped and Delayed Sinusoidal (PDDS) model, takes into account a single time delay parameter for a set (sum) of damped sinusoids. The proposed modification is more consistent with the transient audio modeling problem. T...
  • A perceptual subspace approach for modeling of speech and audio signals with damped sinusoids

    Publication Year: 2004, Page(s):121 - 132
    Cited by:  Papers (18)

    The problem of modeling a signal segment as a sum of exponentially damped sinusoidal components arises in many different application areas, including speech and audio processing. Often, model parameters are estimated using subspace-based techniques which arrange the input signal in a structured matrix and exploit the so-called shift-invariance property related to certain vector spaces of the input...
  • Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise

    Publication Year: 2004, Page(s):133 - 143
    Cited by:  Papers (65)  |  Patents (2)

    This paper presents a novel speech feature enhancement technique based on a probabilistic, nonlinear acoustic environment model that effectively incorporates the phase relationship (hence phase-sensitive) between the clean speech and the corrupting noise in the acoustic distortion process. The core of the enhancement algorithm is the MMSE (minimum mean square error) estimator for the log Mel power...
  • Efficient perceptual tuning of hearing aids with genetic algorithms

    Publication Year: 2004, Page(s):144 - 155
    Cited by:  Papers (15)  |  Patents (5)

    We describe a system for integrating a genetic algorithm (GA) with perceptual feedback to perform an efficient search in a perceptual space. The main system components are an efficient method for estimating perceptual rank order and genetic operators that take advantage of the types of parameters found in certain classes of audio processing systems. Preference judgments are used, resulting in a li...
  • Audio textures: theory and applications

    Publication Year: 2004, Page(s):156 - 167
    Cited by:  Papers (15)

    In this paper, we introduce a new audio medium, called audio texture, as a means of synthesizing a long audio stream according to a given short example audio clip. The example clip is first analyzed to extract its basic building patterns. An audio stream of arbitrary length is then synthesized using a sequence of extracted building patterns. The patterns can be varied in the synthesis process to add...
  • Speaker adaptation using constrained transformation

    Publication Year: 2004, Page(s):168 - 174
    Cited by:  Papers (3)

    In speech recognition research, transformation-based adaptation algorithms provide an effective way of adapting acoustic models to improve the recognition accuracy. However, when only limited amounts of adaptation data are available, the transformation is often poorly estimated, which may cause performance degradation. This paper presents the Markov Random Field Linear Regression (MRFLR) algorithm...
  • Estimation of articulatory movements from speech acoustics using an HMM-based speech production model

    Publication Year: 2004, Page(s):175 - 185
    Cited by:  Papers (42)

    We present a method that determines articulatory movements from speech acoustics using a Hidden Markov Model (HMM)-based speech production model. The model statistically generates speech spectrum and articulatory parameters from a given phonemic string. It consists of HMMs of articulatory parameters for each phoneme and an articulatory-to-acoustic mapping for each HMM state. For a given speech spe...
  • IEEE Transactions on Speech and Audio Processing EDICS

    Publication Year: 2004, Page(s): 186
    Freely Available from IEEE
  • IEEE Transactions on Speech and Audio Processing Information for authors

    Publication Year: 2004, Page(s):187 - 188
    Freely Available from IEEE
  • IEEE Signal Processing Society Information

    Publication Year: 2004, Page(s): c3
    Freely Available from IEEE
  • Blank page [back cover]

    Publication Year: 2004, Page(s): c4
    Freely Available from IEEE

Aims & Scope

Covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language.

This Transactions ceased publication in 2005. The current retitled publication is IEEE/ACM Transactions on Audio, Speech, and Language Processing.
