2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)

17-20 Sept. 2000

Displaying Results 1 - 25 of 56
  • 2000 IEEE Workshop on Speech Coding. Proceedings. Meeting the Challenges of the New Millennium (Cat. No.00EX421)

    Publication Year: 2000
    Freely Available from IEEE
  • An overview of text-to-speech synthesis

    Publication Year: 2000
    Cited by:  Papers (1)

    Summary form only given. The article gives an overview of text-to-speech (TTS) technology and a description of some issues of potential interest to speech coding experts. After motivation for the use of TTS technology, it describes the general architecture of a text-to-speech system with particular emphasis on the speech synthesis component. Both formant synthesis and concatenative synthesis are p...
  • Signal processing for cochlear implants and low-rate speech coding

    Publication Year: 2000

    Summary form only given. Cochlear implants are now established as a new option for individuals with profound (sensorineural) hearing impairment. Many of the cochlear implant patients are able to understand speech without lip-reading, and some can communicate over the phone. The success of cochlear implants can be attributed to the combined efforts of scientists from various disciplines including b...
  • Acoustic front-end processing for communication systems

    Publication Year: 2000

    Summary form only given. As communication systems have become more mobile and portable, we now have situations where audio communication in difficult acoustic environments is common. Speech coders at low bit rates tend to have problems with non-speech signals that are typically found in noisy acoustic environments. As a result, there can be degradation in the perceived audio quality for low-bit ra...
  • Trellis-based optimization of MPEG-4 advanced audio coding

    Publication Year: 2000, Page(s):142 - 144
    Cited by:  Papers (11)  |  Patents (7)

    We outline a method to perform efficient low rate quantization for MPEG-4 advanced audio coding (AAC). The AAC bit stream consists of indices for quantized spectral coefficients as well as side information about quantizer step sizes and Huffman codebooks. The MPEG-4 Verification Model does not explicitly account for side information bits in its optimization and suffers from poor compression effici...
  • Author index

    Publication Year: 2000, Page(s): 157
    Freely Available from IEEE
  • On the perceptual weighting function for phase quantization of speech

    Publication Year: 2000, Page(s):62 - 64
    Cited by:  Papers (1)  |  Patents (2)

    This paper addresses the use of the perceptual characteristics of the human auditory system for the phase quantization of speech signals. Taking into account the phase quantization noise, we propose a perceptual weighting function that keeps the quantization noise below the threshold of human perception. The weighting function is derived from psychoacoustic experiments in which...
  • Enhancing MPEG-4 CELP by jointly optimized inter/intra-frame LSP predictors

    Publication Year: 2000, Page(s):90 - 92
    Cited by:  Papers (2)  |  Patents (20)

    This paper presents an LSP quantization design method for bandwidth scalable coders such as the MPEG-4 CELP coder. In the enhancement layer of these coders, the LSP parameters are quantized using both interframe and intraframe predictors. The proposed design algorithm enables us to jointly optimize these predictors. Objective and subjective test results show that the quantizer obtained with the pr...
  • New objective measures for characterisation of noise suppression algorithms

    Publication Year: 2000, Page(s):23 - 25
    Cited by:  Papers (5)

    We present two new objective quality measures for the assessment of the performance of noise suppression (NS) algorithms. The signal-to-noise ratio improvement (SNRI) measure attempts to characterise the capability of an NS method to enhance the speech component of a noisy speech signal from an additive background noise. The SNRI measure includes a segmentation of the input speech signal into thre...
  • Exploring the characteristics of analytic decomposition of speech signals

    Publication Year: 2000, Page(s):59 - 61

    This paper investigates the properties of analytic transformation of speech into envelope and phase functions. The envelope is shown to evolve slowly with the pitch of the input speech, whilst the phase consists of two components; one evolving slowly with pitch and another that exhibits a more rapid evolution. We investigate decomposing the phase component further using two distinct methods: (a) f...
  • PDF optimized parametric vector quantization of speech line spectral frequencies

    Publication Year: 2000, Page(s):87 - 89
    Cited by:  Papers (5)

    A computationally efficient, high quality, vector quantization scheme based on a parametric probability density function (PDF) is developed for encoding speech line spectral frequencies (LSF). For this purpose, speech LSFs are modeled as i.i.d. realizations of a multivariate normal mixture density. The mixture model parameters are efficiently estimated from the training data using the expectation m...
  • Application of multidimensional scaling to subjective evaluation of coded speech

    Publication Year: 2000, Page(s):20 - 22
    Cited by:  Papers (2)

    We propose a new procedure for subjective evaluation of coded speech. This procedure has the potential of providing an anchorable measure of quality that contains more information than the single number provided by MOS testing. A stimulus space and the relationship between this space and speech quality are established with multidimensional scaling techniques in a large-scale listening test. In the...
  • Improved signal analysis and time-synchronous reconstruction in waveform interpolation coding

    Publication Year: 2000, Page(s):56 - 58
    Cited by:  Papers (1)

    This paper presents a waveform-matched waveform interpolation (WMWI) technique which enables improved speech analysis over existing WI coders. In WMWI, an accurate representation of speech evolution is produced by extracting critically-sampled pitch periods of a time-warped, constant pitch residual. The technique also offers waveform-matching capabilities by using an inverse warping process to nea...
  • Diversity control among multiple coders: a simple approach to multiple descriptions

    Publication Year: 2000, Page(s):69 - 71
    Cited by:  Papers (5)  |  Patents (8)

    This paper presents a voice communication arrangement in which multiple coder-decoder (codec) pairs are coordinated to provide diversified descriptions of the source signal. This arrangement allows the system to robustly mitigate channel erasures that may occur while providing the best possible quality in the absence of channel impairments. Since single descriptive systems have been largely deploy...
  • Very low rate speech coding using temporal decomposition and waveform interpolation

    Publication Year: 2000, Page(s):29 - 31

    In very low rate coding the aim is to accurately represent speech characteristics as efficiently as possible. High coding gains for the spectral features can be achieved through the use of temporal decomposition. Waveform interpolation coders accurately represent the excitation using characteristic waveforms (CWs) extracted at a constant rate. In this paper, the two approaches are combined into a ...
  • 4 kb/s improved multi-pulse based CELP speech coding with multiple location codebook and post-processing

    Publication Year: 2000, Page(s):17 - 19
    Cited by:  Patents (1)

    This paper proposes an improved MP-CELP (Multi-Pulse-based CELP) speech coding at 4 kb/s. In MP-CELP, amplitudes or signs of multi-pulse excitation are simultaneously vector quantized (VQ). In order to improve speech quality for voiced speech, a multiple pulse location codebook is stored to enhance the coverage of the location. The optimum combination among the pulse location codebook, pulse ampli...
  • Design and performance of a 4.0 kbit/s speech coder based on frequency-domain interpolation

    Publication Year: 2000, Page(s):8 - 10
    Cited by:  Patents (3)

    The 4.0 kbit/s speech codec described is based on a frequency domain interpolative (FDI) coding technique, which belongs to the class of prototype waveform interpolation (PWI) coding techniques. The codec also has an integrated voice activity detector (VAD) and a noise reduction capability. The input signal is subjected to LPC analysis and the prediction residual is separated into a slowly evolvin...
  • A novel algorithm for low bit rate speech compression using a hybrid LP-harmonics model

    Publication Year: 2000, Page(s):41 - 43

    We present a new LP-harmonic speech codec. At the coder, the speech signal is pre-processed, and an LP analysis is performed, together with pitch estimation and voicing decision. At the decoder, when the frame is voiced, the encoded parameters are used to estimate the spectrum envelope, extract and classify the harmonics as either strong or weak depending on their relative distance from multiples of...
  • Error concealment by near optimum MMSE-estimation of source codec parameters

    Publication Year: 2000, Page(s):84 - 86
    Cited by:  Papers (13)

    In digital communications, source coding is indispensable to achieve a high bandwidth efficiency in applications where bandwidth is a limited resource. Usually these source coding algorithms determine speech or audio parameters which are highly sensitive to transmission errors. This paper deals with an error concealment technique that benefits from residual redundancy remaining after source coding....
  • Regularized linear prediction all-pole models

    Publication Year: 2000, Page(s):96 - 98
    Cited by:  Papers (8)

    For many cases of voiced speech, linear prediction (LP) based all-pole spectral envelopes exhibit unnatural vocal tract transfer functions that underestimate the formant bandwidths. To obtain smoother contoured all-pole spectral envelopes, we employ a regularization measure which discourages nonsmooth behavior of the transfer function. In particular, we demonstrate how a simple regularization sche...
  • A novel approach to excitation coding in low-bit-rate high-quality CELP coders

    Publication Year: 2000, Page(s):14 - 16
    Cited by:  Patents (2)

    A significant improvement in the efficiency of excitation coding with CELP at low bit rates is achieved by a new paradigm for encoding the fixed excitation. In the proposed scheme, the non-zero fixed-codebook excitation elements are substantially localized in a set of windows, with positions adaptive to the pitch peaks. Highly efficient coding is thus achieved by allocating most of the available e...
  • Predictive quantization of spectral amplitudes for harmonic coders

    Publication Year: 2000, Page(s):47 - 49

    We present a novel predictive vector quantization scheme for coding the variable-dimension spectral amplitude vectors produced by harmonic coders. The scheme has a safety-net prediction structure, but it uses analysis-by-synthesis codebook search. A “closed-loop” codebook design algorithm is devised. Significant improvement over conventional predictive VQ design methods is obtained.
  • Model based spectrum prediction

    Publication Year: 2000, Page(s):117 - 119
    Cited by:  Papers (15)

    This paper presents methods for speech spectrum prediction based on Gaussian mixture models. Spectrum prediction may be useful in a packet transmission system where the sensitivity to packet losses is a major problem. Models of speech are trained by the expectation maximization algorithm using pairs, triples etc. of consecutive cepstral vectors. The models are used to design first, second etc. ord...
  • A preprocessing method for perfect reconstruction WI coding

    Publication Year: 2000, Page(s):53 - 55
    Cited by:  Papers (2)

    Waveform interpolation (WI) coding with perfect reconstruction is a promising speech coding method with high-level speech quality. However, for efficient coding several pitch values have to be transmitted per frame to guarantee the required accuracy of the pitch contour. In this paper we propose a preprocessing algorithm which slightly modifies the residual signal such that its pitch period evolve...
  • Coding of spectral magnitudes using optimized linear transformations

    Publication Year: 2000, Page(s):5 - 7

    This paper introduces a novel vector quantization (VQ) technique, wherein the quantized vector is obtained by applying a linear transformation selected from a first codebook to a codevector selected from a second codebook. The transformation is selected from a family of linear transformations, represented by a matrix codebook. Vectors in the second codebook are called residual codevectors. In orde...
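
The two-codebook structure described in the last abstract above ("Coding of spectral magnitudes using optimized linear transformations") can be illustrated with a small sketch. The abstract is truncated in this listing, so the paper's actual search and codebook design procedures are not known here; the exhaustive joint search, the squared-error criterion, the toy codebook values, and all function names below are assumptions made only for illustration.

```python
from itertools import product


def apply_matrix(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def transform_vq_encode(target, matrix_codebook, residual_codebook):
    """Try every (matrix, residual codevector) pair and keep the best one.

    Returns (matrix_index, residual_index, error). The exhaustive search
    is an assumption; the source abstract does not state how the two
    indices are chosen.
    """
    best = None
    pairs = product(enumerate(matrix_codebook), enumerate(residual_codebook))
    for (i, matrix), (j, codevector) in pairs:
        candidate = apply_matrix(matrix, codevector)
        err = squared_error(target, candidate)
        if best is None or err < best[2]:
            best = (i, j, err)
    return best


def transform_vq_decode(i, j, matrix_codebook, residual_codebook):
    """Rebuild the quantized vector from the two transmitted indices."""
    return apply_matrix(matrix_codebook[i], residual_codebook[j])


# Hypothetical toy codebooks, not taken from the paper.
MATRIX_CODEBOOK = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity
    [[2.0, 0.0], [0.0, 0.5]],  # per-dimension scaling
]
RESIDUAL_CODEBOOK = [
    [1.0, 1.0],
    [1.0, -1.0],
]

i, j, err = transform_vq_encode([2.1, 0.4], MATRIX_CODEBOOK, RESIDUAL_CODEBOOK)
quantized = transform_vq_decode(i, j, MATRIX_CODEBOOK, RESIDUAL_CODEBOOK)
```

In this sketch only the two indices would be transmitted, and the decoder needs copies of both codebooks. With M matrices and N residual codevectors the pair addresses up to M × N reconstruction points while storing only M + N entries, which is one plausible motivation for such a structure.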