IEEE Transactions on Speech and Audio Processing

Issue 6 • Date Nov. 2000

  • Comments on "Efficient training algorithms for HMMs using incremental estimation"

    Publication Year: 2000, Page(s):751 - 754
    Cited by:  Papers (5)

    The paper entitled "Efficient training algorithms for HMMs using incremental estimation" by Gotoh et al. (IEEE Trans. Speech Audio Processing, vol. 6, pp. 539-548, Nov. 1998) investigated expectation maximization (EM) procedures that increase training speed. The claim of Gotoh et al. that these procedures are generalized EM (Dempster et al. 1977) procedures is shown to be incorrect in the present pape...
  • List of reviewers

    Publication Year: 2000, Page(s):757 - 758
    Freely Available from IEEE
  • A DCT-based fast signal subspace technique for robust speech recognition

    Publication Year: 2000, Page(s):747 - 751
    Cited by:  Papers (14)

    In this correspondence, a fast computational method is proposed to approximate the Karhunen-Loeve transform (KLT) for the covariance matrix of the autoregressive process. A fast algorithm which reduces the computation of eigenvalues of an N×N symmetric Toeplitz matrix from O(N^3) in KLT to N^2 is further developed. Experimental results demonstrate that the performance of ...
  • Efficient distance measure for quantization of LSF and its Karhunen-Loeve transformed parameters

    Publication Year: 2000, Page(s):744 - 746
    Cited by:  Papers (5)

    This paper presents a new distance measure that is based on the spectral sensitivity of the line spectrum frequency parameters (LSFs) and its Karhunen-Loeve (KL) transformed coefficients. It is shown that the proposed distance measure achieves better performance of vector quantization (VQ) compared to other methods in the field of LSF coding. In most cases, the percentage of outliers is reduced wh...
  • Cell-based beamforming (CE-BABE) for speech acquisition with microphone arrays

    Publication Year: 2000, Page(s):738 - 743
    Cited by:  Papers (5)  |  Patents (1)

    This paper introduces a microphone array processing method that possesses the robustness of fixed beamforming along with the ability to be dynamically reconfigured to limit interference and reverberation. The basic approach is to partition the environment into two regions: an interior region (containing sources that are physically present within the room enclosure), and an exterior region (contain...
  • Rapid speaker adaptation in eigenvoice space

    Publication Year: 2000, Page(s):695 - 707
    Cited by:  Papers (274)  |  Patents (8)

    This paper describes a new model-based speaker adaptation algorithm called the eigenvoice approach. The approach constrains the adapted model to be a linear combination of a small number of basis vectors obtained offline from a set of reference speakers, and thus greatly reduces the number of free parameters to be estimated from adaptation data. These “eigenvoice” basis vectors are ort...
  • Nonminimum-phase equalization and its subjective importance in room acoustics

    Publication Year: 2000, Page(s):728 - 737
    Cited by:  Papers (40)  |  Patents (6)

    This paper investigates the perceptual significance of residual phase distortion due to an approximate equalization of the nonminimum-phase room response from a sound source to a microphone in a reverberant room. It is shown that disrupted phase relationships introduced by a minimum-phase equalization filter may have a detrimental effect on perceived sound quality. The subjective assessment of pha...
  • A minimax search algorithm for robust continuous speech recognition

    Publication Year: 2000, Page(s):688 - 694
    Cited by:  Papers (10)

    In this paper, we propose a novel implementation of a minimax decision rule for continuous density hidden Markov-model-based robust speech recognition. By combining the idea of the minimax decision rule with a normal Viterbi search, we derive a recursive minimax search algorithm, where the minimax decision rule is repetitively applied to determine the partial paths during the search procedure. Bec...
  • Fractionally addressed delay lines

    Publication Year: 2000, Page(s):717 - 727
    Cited by:  Papers (4)

    While traditional implementations of variable-length digital delay lines are based on a circular buffer accessed by two pointers, we propose an implementation where a single fractional pointer is used both for read and write operations. On modern general-purpose architectures, the proposed method is nearly as efficient as the popular interpolated circular buffer, and it behaves well for delay-leng...
  • The time-conditioned approach in dynamic programming search for LVCSR

    Publication Year: 2000, Page(s):676 - 687
    Cited by:  Papers (3)

    This paper presents the time-conditioned approach in dynamic programming search for large-vocabulary continuous-speech recognition. The following topics are presented: the baseline algorithm, a time-synchronous beam search version, a comparison with the word-conditioned approach, a comparison with stack decoding. The approach has been successfully tested on the NAB task using a vocabulary of 64000...
  • Double-talk robust fast converging algorithms for network echo cancellation

    Publication Year: 2000, Page(s):656 - 663
    Cited by:  Papers (81)  |  Patents (4)

    There is a need for echo cancelers for echo paths with long impulse responses (⩾64 ms). This in turn creates a need for more rapidly converging algorithms in order to meet the specifications for network echo cancelers. Faster convergence, however, in general implies a higher sensitivity to near-end disturbances, especially “double-talk.” Previously, a fast converging algorithm has ...
  • A good read

    Publication Year: 2000, Page(s): 645
  • A computationally efficient multipitch analysis model

    Publication Year: 2000, Page(s):708 - 716
    Cited by:  Papers (135)  |  Patents (9)

    A computationally efficient model for multipitch and periodicity analysis of complex audio signals is presented. The model essentially divides the signal into two channels, below and above 1000 Hz, computes a “generalized” autocorrelation of the low-channel signal and of the envelope of the high-channel signal, and sums the autocorrelation functions. The summary autocorrelation functio...
  • GA-based noisy speech recognition using two-dimensional cepstrum

    Publication Year: 2000, Page(s):664 - 675
    Cited by:  Papers (11)  |  Patents (6)

    Among various kinds of speech features, the two-dimensional (2-D) cepstrum (TDC) is a special one, which can simultaneously represent several types of information contained in the speech waveform: static and dynamic features, as well as global and fine frequency structures. Analysis results show that the coefficients located at lower indexes portion of the TDC matrix seem to be more significant th...
  • R/D optimal linear prediction

    Publication Year: 2000, Page(s):646 - 655
    Cited by:  Papers (21)  |  Patents (2)

    A common technique to extend linear prediction to nonstationary signals is time segmentation: the signal is split into small portions and the modelization is carried out locally. The accuracy of the analysis is, however, dependent on the window size and on the signal characteristics, so that the problem of finding a good segmentation is crucial to the entire modeling scheme. In this paper, we pres...
  • Control of feedback in hearing aids - a robust filter design approach

    Publication Year: 2000, Page(s):754 - 756
    Cited by:  Papers (4)

    A bound on the variability of the feedback path is employed in the design of fixed FIR hearing aid filters that are robust to the specified variability, thus avoiding instability and howling in everyday use. A design example is presented for a linear gain hearing aid filter with a given maximal mismatch of the feedback cancellation filter.

Aims & Scope

Covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language.

 

This Transactions ceased publication in 2005. The current retitled publication is IEEE/ACM Transactions on Audio, Speech, and Language Processing.
