Speech and Audio Processing, IEEE Transactions on

Issue 6 • Date Nov. 2005

  • Table of contents

    Publication Year: 2005, Page(s): c1 - c4
    Freely Available from IEEE
  • IEEE Transactions on Speech and Audio Processing publication information

    Publication Year: 2005, Page(s): c2
    Freely Available from IEEE
  • Comparative analysis of linear and nonlinear speech signals predictors

    Publication Year: 2005, Page(s): 1093 - 1097

    The paper presents a new approach to speech production modeling based on nonlinear signal predictors. The coefficients of the latter are found by solving a system of linear algebraic equations using the least squares method. Comparative experiments were carried out to demonstrate the absolute superiority of the nonlinear models over the linear one in terms of normalized mean-square error.
  • Nonlinear speech analysis using models for chaotic systems

    Publication Year: 2005, Page(s): 1098 - 1109
    Cited by: Papers (14)

    In this paper, we use concepts and methods from chaotic systems to model and analyze nonlinear dynamics in speech signals. The modeling is done not on the scalar speech signal, but on its reconstructed multidimensional attractor by embedding the scalar signal into a phase space. We have analyzed and compared a variety of nonlinear models for approximating the dynamics of complex systems using a sm...
  • Processing of reverberant speech for time-delay estimation

    Publication Year: 2005, Page(s): 1110 - 1118
    Cited by: Papers (25)

    In this paper, we present a method of extracting the time-delay between speech signals collected at two microphone locations. Time-delay estimation from microphone outputs is the first step for many sound localization algorithms, and also for enhancement of speech. For time-delay estimation, speech signals are normally processed using short-time spectral information (either magnitude or phase or b...
  • An effective subband OSF-based VAD with noise reduction for robust speech recognition

    Publication Year: 2005, Page(s): 1119 - 1129
    Cited by: Papers (27)

    An effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The approach is based on the determination of the speech/nonspeech divergence by means of specialized order statistics filters (OSFs) working on the subband log-energies. This algorithm differs from many others in the way the decision rule is formulated. Instead of ...
  • Fast QRD-lattice-based unconstrained optimal filtering for acoustic noise reduction

    Publication Year: 2005, Page(s): 1130 - 1143
    Cited by: Papers (4)

    We derive a fast QRD-least-squares lattice (QRD-LSL) based unconstrained optimal filtering algorithm for multichannel acoustic noise reduction. As known from the literature, the unconstrained optimal filtering approach is an alternative to the popular GSC beamforming, which does not rely on a priori information and hence possesses improved robustness. The optimal filtering problem involved is spec...
  • Subspace constrained Gaussian mixture models for speech recognition

    Publication Year: 2005, Page(s): 1144 - 1160
    Cited by: Papers (18)

    A standard approach to automatic speech recognition uses hidden Markov models whose state dependent distributions are Gaussian mixture models. Each Gaussian can be viewed as an exponential model whose features are linear and quadratic monomials in the acoustic vector. We consider here models in which the weight vectors of these exponential models are constrained to lie in an affine subspace shared...
  • Noise robust speech recognition using feature compensation based on polynomial regression of utterance SNR

    Publication Year: 2005, Page(s): 1161 - 1172
    Cited by: Papers (20) | Patents (3)

    A feature compensation (FC) algorithm based on polynomial regression of utterance signal-to-noise ratio (SNR) for noise robust automatic speech recognition (ASR) is proposed. In this algorithm, the bias between clean and noisy speech features is approximated by a set of polynomials which are estimated from adaptation data from the new environment by the expectation-maximization (EM) algorithm unde...
  • Automatic transcription of conversational telephone speech

    Publication Year: 2005, Page(s): 1173 - 1185
    Cited by: Papers (6) | Patents (1)

    This paper discusses the Cambridge University HTK (CU-HTK) system for the automatic transcription of conversational telephone speech. A detailed discussion of the most important techniques in front-end processing, acoustic modeling and model training, and language and pronunciation modeling is presented. These include the use of conversation side based cepstral normalization, vocal tract length norma...
  • Recognizing GSM digital speech

    Publication Year: 2005, Page(s): 1186 - 1205
    Cited by: Papers (3)

    The Global System for Mobile (GSM) environment encompasses three main problems for automatic speech recognition (ASR) systems: noisy scenarios, source coding distortion, and transmission errors. The first one has already received much attention; however, source coding distortion and transmission errors must be explicitly addressed. In this paper, we propose an alternative front-end for speech reco...
  • An efficient method of Huffman decoding for MPEG-2 AAC and its performance analysis

    Publication Year: 2005, Page(s): 1206 - 1209
    Cited by: Papers (15) | Patents (1)

    This paper presents a new method for Huffman decoding specially designed for MPEG-2 AAC audio. The method significantly enhances the processing efficiency of the conventional Huffman decoding realized with the ordinary binary tree search method. A data structure of one-dimensional array is newly designed based on the numerical interpretation of the incoming bit stream and its utilization for t...
  • Improved noise reduction in audio signals using spectral resolution enhancement with time-domain signal extrapolation

    Publication Year: 2005, Page(s): 1210 - 1216
    Cited by: Papers (1) | Patents (2)

    In this paper, we present a significant improvement to frame-by-frame noise reduction methods which are based on spectral domain processing. This work mainly focuses on the analysis stage of the noise reduction process. It is common knowledge that better performance is obtained by increasing the spectral resolution. However, effective spectral resolution depends directly on the number of samples in ...
  • Leaky-FXLMS algorithm: stochastic analysis for Gaussian data and secondary path modeling error

    Publication Year: 2005, Page(s): 1217 - 1230
    Cited by: Papers (8)

    This paper presents a stochastic analysis of the leaky filtered-X least-mean-square (LFXLMS) algorithm. The version with leakage of the adaptive algorithm is used in practical implementations aiming to reduce undesirable effects due to numerical errors in finite-precision machines, overload of the secondary source, among others. Based on new analysis assumptions, instead of the ordinary independen...
  • Acoustic echo cancellation and doubletalk detection using estimated loudspeaker impulse responses

    Publication Year: 2005, Page(s): 1231 - 1237
    Cited by: Papers (4)

    In this paper, we present a new approach to acoustic echo cancellation and doubletalk detection for a teleconferencing system including a loudspeaker for which an estimate of the loudspeaker impulse response is available. The approach is general in the sense that it may be applied to most existing acoustic echo cancellation and doubletalk detection algorithms. We show that the new approach reduces...
  • List of reviewers

    Publication Year: 2005, Page(s): 1238 - 1240
    Freely Available from IEEE
  • IEEE Transactions on Speech and Audio Processing EDICS

    Publication Year: 2005, Page(s): 1241
    Freely Available from IEEE
  • IEEE Transactions on Speech and Audio Processing Information for authors

    Publication Year: 2005, Page(s): 1242 - 1243
    Freely Available from IEEE
  • Special issue on objective quality assessment of speech and audio

    Publication Year: 2005, Page(s): 1244
    Freely Available from IEEE
  • Special issue on blind signal processing for speech and audio applications

    Publication Year: 2005, Page(s): 1245
    Freely Available from IEEE
  • IEEE Odyssey 2006: The Speaker and Language Recognition Workshop

    Publication Year: 2005, Page(s): 1246
    Freely Available from IEEE
  • 2005 Index

    Publication Year: 2005, Page(s): 1247 - 1260
    Freely Available from IEEE
  • IEEE Signal Processing Society Information

    Publication Year: 2005, Page(s): c3
    Freely Available from IEEE

Aims & Scope

Covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language.

 

This Transactions ceased publication in 2005. The current retitled publication is IEEE/ACM Transactions on Audio, Speech, and Language Processing.
