IEEE Transactions on Audio, Speech, and Language Processing

Issue 4 • Date May 2008

  • Table of contents

    Publication Year: 2008, Page(s): C1 - C4
    PDF (104 KB)
    Freely Available from IEEE
  • IEEE Transactions on Audio, Speech, and Language Processing publication information

    Publication Year: 2008, Page(s): C2
    PDF (38 KB)
    Freely Available from IEEE
  • Compression Artifacts in Perceptual Audio Coding

    Publication Year: 2008, Page(s): 681 - 695
    Cited by: Papers (9) | Patents (1)
    PDF (4439 KB) | HTML

    Perceptual audio coding achieves a high compression ratio by exploiting the perceptual irrelevance and data redundancies. By using advanced and sophisticated signal processing methods, perceptual coding has generated artifacts that are quite different from the traditional distortions. A new audio technology becomes mature through the successful modeling, measuring, and control on the artifacts inc...
  • Efficient Acoustic Echo Cancellation With Reduced-Rank Adaptive Filtering Based on Selective Decimation and Adaptive Interpolation

    Publication Year: 2008, Page(s): 696 - 710
    Cited by: Papers (4) | Patents (1)
    PDF (1044 KB) | HTML

    This paper presents a new approach to efficient acoustic echo cancellation (AEC) based on reduced-rank adaptive filtering equipped with selective-decimation and adaptive interpolation. We propose a novel structure of an AEC scheme that jointly optimizes an interpolation filter, a decimation unit, and a reduced-rank filter. With a practical choice of parameters in AEC, the total computational compl...
  • Dual-Source Transfer-Function Generalized Sidelobe Canceller

    Publication Year: 2008, Page(s): 711 - 727
    Cited by: Papers (23) | Patents (1)
    PDF (1331 KB) | HTML

    Full-duplex hands-free man/machine interface often suffers from directional nonstationary interference, such as a competing speaker, as well as stationary interferences which may comprise both directional and nondirectional signals. The transfer-function generalized sidelobe canceller (TF-GSC) exploits the nonstationarity of the speech signal to enhance it when the undesired interfering signals ar...
  • Binaural Tracking of Multiple Moving Sources

    Publication Year: 2008, Page(s): 728 - 739
    Cited by: Papers (36)
    PDF (865 KB) | HTML

    This paper addresses the problem of tracking multiple moving sources using binaural input. We observe that binaural cues are strongly correlated with source locations in time-frequency regions dominated by only one source. Based on this observation, we propose a novel tracking algorithm that integrates probabilities across reliable frequency channels in order to produce a likelihood function in th...
  • The Spherical-Shell Microphone Array

    Publication Year: 2008, Page(s): 740 - 747
    Cited by: Papers (28)
    PDF (392 KB) | HTML

    Spherical microphone arrays have been recently studied for a wide range of applications. In particular, microphones arranged around an open or virtual sphere are useful in scanning microphone arrays for sound field analysis. However, open-sphere spherical arrays have been shown to have poor robustness at frequencies related to the zeros of the spherical Bessel functions. This paper presents a fram...
  • Acoustic Source Separation of Convolutive Mixtures Based on Intensity Vector Statistics

    Publication Year: 2008, Page(s): 748 - 756
    Cited by: Papers (23) | Patents (2)
    PDF (615 KB) | HTML

    Various techniques have previously been proposed for the separation of convolutive mixtures. These techniques can be classified as stochastic, adaptive, and deterministic. Stochastic methods are computationally expensive since they require an iterative process for the calculation of the demixing filters based on a separation criterion that usually assumes that the source signals are statistically ...
  • On the Importance of the Pearson Correlation Coefficient in Noise Reduction

    Publication Year: 2008, Page(s): 757 - 765
    Cited by: Papers (16) | Patents (2)
    PDF (853 KB) | HTML

    Noise reduction, which aims at estimating a clean speech from noisy observations, has attracted a considerable amount of research and engineering attention over the past few decades. In the single-channel scenario, an estimate of the clean speech can be obtained by passing the noisy signal picked up by the microphone through a linear filter/transformation. The core issue, then, is how to find an o...
  • Unsupervised Single-Channel Music Source Separation by Average Harmonic Structure Modeling

    Publication Year: 2008, Page(s): 766 - 778
    Cited by: Papers (31)
    PDF (2040 KB) | HTML

    Source separation of musical signals is an appealing but difficult problem, especially in the single-channel case. In this paper, an unsupervised single-channel music source separation algorithm based on average harmonic structure modeling is proposed. Under the assumption of playing in narrow pitch ranges, different harmonic instrumental sources in a piece of music often have different but stable...
  • A Flexible Classifier Design Framework Based on Multiobjective Programming

    Publication Year: 2008, Page(s): 779 - 789
    Cited by: Papers (4)
    PDF (743 KB) | HTML

    We propose a multiobjective programming (MOP) framework for finding compromise solutions that are satisfactory for each of multiple competing performance criteria in a pattern classification task. The fundamental idea for our formulation of classifier learning, which we refer to as iterative constrained optimization (ICO), evolves around improving one objective while allowing the rest to degrade. ...
  • Temporal Compression Of Speech: An Evaluation

    Publication Year: 2008, Page(s): 790 - 796
    Cited by: Papers (1)
    PDF (271 KB) | HTML

    Efficient browsing of speech recordings is problematic. The linear nature of speech, coupled with the lack of abstraction that the medium affords, means that listeners have to listen to long segments of a recording to locate points of interest. We explore temporal compression algorithms that attempt to reduce the amount of time users require to listen to speech recordings, while retaining the impo...
  • Exploiting Acoustic and Syntactic Features for Automatic Prosody Labeling in a Maximum Entropy Framework

    Publication Year: 2008, Page(s): 797 - 811
    Cited by: Papers (30) | Patents (2)
    PDF (646 KB) | HTML

    In this paper, we describe a maximum entropy-based automatic prosody labeling framework that exploits both language and speech information. We apply the proposed framework to both prominence and phrase structure detection within the Tones and Break Indices (ToBI) annotation scheme. Our framework utilizes novel syntactic features in the form of supertags and a quantized acoustic-prosodic feature re...
  • Fast Tracing of Acoustic Beams and Paths Through Visibility Lookup

    Publication Year: 2008, Page(s): 812 - 824
    Cited by: Papers (23)
    PDF (1165 KB) | HTML

    The beam tracing method can be used for the fast tracing of a large number of acoustic paths through a direct lookup of a special tree-like data structure (beam tree) that describes the iterated visibility information from one specific position. This structure describes the branching of bundles of rays (beams) as they encounter reflectors in their paths. For this reason, beam tracing is suitable f...
  • Environment-Optimized Speech Enhancement

    Publication Year: 2008, Page(s): 825 - 834
    Cited by: Papers (18)
    PDF (610 KB) | HTML

    In this paper, we present a training-based approach to speech enhancement that exploits the spectral statistical characteristics of clean speech and noise in a specific environment. In contrast to many state-of-the-art approaches, we do not model the probability density function (pdf) of the clean speech and the noise spectra. Instead, subband-individual weighting rules for noisy speech spectral a...
  • Online Noise Estimation Using Stochastic-Gain HMM for Speech Enhancement

    Publication Year: 2008, Page(s): 835 - 846
    Cited by: Papers (6)
    PDF (932 KB) | HTML

    We propose a noise estimation algorithm for single-channel noise suppression in dynamic noisy environments. A stochastic-gain hidden Markov model (SG-HMM) is used to model the statistics of nonstationary noise with time-varying energy. The noise model is adaptive and the model parameters are estimated online from noisy observations using a recursive estimation algorithm. The parameter estimation i...
  • Towards Link Characterization From Content: Recovering Distributions From Classifier Output

    Publication Year: 2008, Page(s): 847 - 858
    Cited by: Papers (1)
    PDF (776 KB) | HTML

    In processing large volumes of speech and language data, we are often interested in the distribution of languages, speakers, topics, etc. For large data sets, these distributions are typically estimated at a given point in time using pattern classification technology. It is well known that such estimates can be highly biased, especially for rare classes. While these biases have been addressed in s...
  • Histogram-Based Quantization for Robust and/or Distributed Speech Recognition

    Publication Year: 2008, Page(s): 859 - 873
    Cited by: Papers (5)
    PDF (1995 KB) | HTML

    In a distributed speech recognition (DSR) framework, the speech features are quantized and compressed at the client and recognized at the server. However, recognition accuracy is degraded by environmental noise at the input, quantization distortion, and transmission errors. In this paper, histogram-based quantization (HQ) is proposed, in which the partition cells for quantization are dynamically d...
  • IEEE Transactions on Audio, Speech, and Language Processing EDICS

    Publication Year: 2008, Page(s): 874 - 875
    PDF (31 KB)
    Freely Available from IEEE
  • IEEE Transactions on Audio, Speech, and Language Processing Information for authors

    Publication Year: 2008, Page(s): 876 - 877
    PDF (46 KB)
    Freely Available from IEEE
  • Special issue on processing morphologically rich languages

    Publication Year: 2008, Page(s): 878
    PDF (173 KB)
    Freely Available from IEEE
  • IEEE International Conference on Acoustics, Speech, and Signal Processing

    Publication Year: 2008, Page(s): 879
    PDF (708 KB)
    Freely Available from IEEE
  • Special issue on DSP techniques for RF/analog circuit impairments

    Publication Year: 2008, Page(s): 880
    PDF (132 KB)
    Freely Available from IEEE
  • IEEE Signal Processing Society Information

    Publication Year: 2008, Page(s): C3
    PDF (33 KB)
    Freely Available from IEEE

Aims & Scope

IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language.

This journal ceased publication in 2013. It continues under the new title IEEE/ACM Transactions on Audio, Speech, and Language Processing.

Meet Our Editors

Editor-in-Chief
Li Deng
Microsoft Research