
IEEE Transactions on Speech and Audio Processing

Issue 4 • Date Jul 2000


  • Low peak amplitudes for wavetable synthesis

    Publication Year: 2000, Page(s):467 - 470
    Cited by:  Papers (5)

    The peak amplitude of a waveform for a particular spectrum depends on the phases of its harmonic components. Previous work on peak amplitude reduction has only considered individual spectra. This paper compares various phase selection methods, and shows that genetic algorithm optimization gives results 10%-25% lower than the other methods.

  • Speech visualization by integrating features for the hearing impaired

    Publication Year: 2000, Page(s):454 - 466
    Cited by:  Papers (14)  |  Patents (2)

    Describes development of a new speech visualization system that creates readable patterns by integrating different speech features into a single picture. The system extracts the phonemic and prosodic features from speech signals and converts them into a visual image using neither speech segmentation nor speech recognition. We used four time-delay neural networks (TDNNs) to generate phonemic featur...

  • Steady-state analysis of continuous adaptation in acoustic feedback reduction systems for hearing-aids

    Publication Year: 2000, Page(s):443 - 453
    Cited by:  Papers (65)  |  Patents (2)

    Acoustic feedback is a problem in hearing aids that contain a substantial amount of gain, hearing aids that are used in conjunction with vented or open molds, and in-the-ear hearing aids. Acoustic feedback is both annoying and reduces the maximum usable gain of hearing-aid devices. This paper studies analytically the steady-state convergence behavior of LMS-based adaptive algorithms when used in c...

  • A comparative study of traditional and newly proposed features for recognition of speech under stress

    Publication Year: 2000, Page(s):429 - 442
    Cited by:  Papers (80)

    It is well known that the performance of speech recognition algorithms degrades in the presence of adverse environments where a speaker is under stress, emotion, or Lombard (1911) effect. This study evaluates the effectiveness of traditional features in recognition of speech under stress and formulates new features which are shown to improve stressed speech recognition. The focus is on formulating ...

  • Vector quantization based on Gaussian mixture models

    Publication Year: 2000, Page(s):385 - 401
    Cited by:  Papers (89)  |  Patents (2)

    We model the underlying probability density function of vectors in a database as a Gaussian mixture (GM) model. The model is employed for high rate vector quantization analysis and for design of vector quantizers. It is shown that the high rate formulas accurately predict the performance of model-based quantizers. We propose a novel method for optimizing GM model parameters for high rate performan...

  • Cluster adaptive training of hidden Markov models

    Publication Year: 2000, Page(s):417 - 428
    Cited by:  Papers (114)  |  Patents (3)

    When performing speaker adaptation, there are two conflicting requirements. First, the speaker transform must be powerful enough to represent the speaker. Second, the transform must be quickly and easily estimated for any particular speaker. The most popular adaptation schemes have used many parameters to adapt the models to be representative of an individual speaker. This limits how rapidly the m...

  • Channel optimized predictive vector quantization

    Publication Year: 2000, Page(s):370 - 384
    Cited by:  Papers (7)  |  Patents (1)

    The paper investigates how channel optimization techniques can be applied to predictive vector quantizers. In particular, an efficient encoder search procedure and two design methods are derived. The design methods proposed here, one sample iterative and one block iterative, simultaneously optimize the predictor and the codebook. Extensive simulations show the advantage of this quantization method...

  • Synthesis of sinusoids via non-overlapping inverse Fourier transform

    Publication Year: 2000, Page(s):471 - 477
    Cited by:  Papers (5)  |  Patents (5)

    Additive synthesis is a powerful tool for the analysis/modification/synthesis of complex audio or speech signals. However, the cost of wavetable sinusoidal synthesis can become prohibitive for large numbers of sinusoids (more than a few hundred). In that case, techniques based on the inverse Fourier transform offer an attractive alternative, being 200-300% more efficient than wavetable synthesis d...

  • Subband-based adaptive decorrelation filtering for co-channel speech separation

    Publication Year: 2000, Page(s):402 - 406
    Cited by:  Papers (5)  |  Patents (3)

    A subband-based adaptive decorrelation filtering algorithm (SBADF) is proposed for co-channel speech separation. The SBADF decomposes the input signals into several frequency subbands, and uses the adaptive decorrelation filtering algorithm (ADF) to process the signals in each subband independently. The processed subband signals are then combined for each channel to form the separated speech. Expe...

  • Voice activity detection in nonstationary noise

    Publication Year: 2000, Page(s):478 - 482
    Cited by:  Papers (85)  |  Patents (5)

    A new fusion method for voice activity detection in additive nonstationary noise is suggested. A performance study of the methods (fusion, geometrically adaptive energy level, periodicity measure, and zero-crossing rate) is presented. The new method is shown to operate reliably down to -5 dB SNR.

  • On the temporal decorrelation of feature parameters for noise-robust speech recognition

    Publication Year: 2000, Page(s):407 - 416
    Cited by:  Papers (4)

    We propose a new frame decorrelation method for robust speech recognition in noisy environments. In most cases, signal perturbation is caused by channel distortion and additive background noise, and can be modeled as a slowly varying term in either the log spectral or the linear-spectral domains. Thus, it is effective to deemphasize slowly varying stationary components in the spectral feature doma...

  • Design of vocabulary-independent Mandarin keyword spotters

    Publication Year: 2000, Page(s):483 - 487
    Cited by:  Patents (1)

    This paper considers the design of a vocabulary-independent keyword spotter for Mandarin speech according to the framework by Huang et al. (1994). This paper considers three varieties of filler model structures for the framework based on subsyllabic grammar of Mandarin speech. On the basis of the three structures, we infer the problems of this framework through three arguments and present two met...

  • On the use of backward adaptation in a perceptual audio coder

    Publication Year: 2000, Page(s):488 - 490
    Cited by:  Papers (3)

    Typically, perceptual audio coders have followed a subband or transform coding scheme with forward-adaptive quantization. In this letter we present an alternative scheme which uses backward-adaptive quantization. We discuss the effects of this strategy on perceptual coding and show that it can be successfully applied.

  • On time-frequency masking in voiced speech

    Publication Year: 2000, Page(s):361 - 369
    Cited by:  Papers (9)  |  Patents (4)

    This paper addresses the issue of masking of noise in voiced speech. First, we examine the audibility of cyclostationary narrow-band noise bursts added to voiced speech generated by synthetic excitation. Varying the temporal location of noise within a pitch cycle corresponds to varying its phase spectrum. Using this fact, we found that a change of phase of the noise in the high frequency region is...


Aims & Scope

Covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language.

 

This Transactions ceased publication in 2005. The current retitled publication is IEEE/ACM Transactions on Audio, Speech, and Language Processing.
