Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575)

24-24 Oct. 2001

  • 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics [front matter]

    Publication Year: 2001, Page(s):0_1 - viii
  • Robust matching of audio signals using spectral flatness features

    Publication Year: 2001, Page(s):127 - 130
    Cited by:  Papers (18)  |  Patents (6)

    Stimulated by the ever-increasing amount of available multimedia data, content-related techniques for the management of audio material have received much interest recently. This paper discusses the problem of robust identification of audio signals by matching them to a known reference. In order to perform well under real-world conditions, the matching process needs to rely on features which are rob...
  • Author index

    Publication Year: 2001, Page(s):231 - 232
  • A robust technique for sound source localization in consideration of room capacity

    Publication Year: 2001, Page(s):63 - 66
    Cited by:  Papers (4)

    This paper proposes a robust technique for sound source localization based on the time difference of arrival in noisy or reverberant environments. A nonlinear minimization problem for estimating source positions is formulated as a constrained optimization problem in consideration of room capacity. Then, a penalty function for a feasible region of this problem is used as the objective function. A...
  • Sweet spot widening for stereophonic sound reproduction

    Publication Year: 2001, Page(s):191 - 194

    The correction of the degradation of the stereophonic illusion due to off-centre listening is investigated. The main idea here is that the directivity pattern of a loudspeaker array should have a well defined shape such that a good stereo sound reproduction is achieved in a large listening area. Optimal digital filters are designed and applied to individual drivers of linear loudspeaker arrays in ...
  • Robust time-delay estimation in highly adverse acoustic environments

    Publication Year: 2001, Page(s):59 - 62
    Cited by:  Papers (3)

    This paper describes an algorithm for robust time-delay estimation (TDE) in situations where a large amount of additive noise and reverberation is present. J. Benesty (see Journal Acoust. Soc. of America, vol.107, no.1, p.384-91, 2000) developed an adaptive eigenvalue decomposition algorithm for TDE between two microphones in highly reverberant acoustic environments. We extend that algorithm to hi...
  • Performance bounds on sound field reproduction using a loudspeaker array

    Publication Year: 2001, Page(s):187 - 190

    A fundamental problem in acoustic signal processing is how best to use an array of loudspeakers to reproduce a sound field. We derive performance bounds on how well a given loudspeaker array can reproduce a plane-wave sound field within a spherical region of space. The development is based on spherical harmonics analysis.
  • Separation of harmonic sounds using multipitch analysis and iterative parameter estimation

    Publication Year: 2001, Page(s):83 - 86
    Cited by:  Papers (7)

    A signal processing method for the separation of concurrent harmonic sounds is described. The method is based on a two-stage approach. First, a multipitch estimator is applied to find initial sound parameters which are reliable, but inaccurate and static. In a second stage, more accurate and time-varying sinusoidal parameters are estimated in an iterative procedure, which imposes certain constrain...
  • Structural composition and decomposition of HRTFs

    Publication Year: 2001, Page(s):103 - 106
    Cited by:  Papers (12)  |  Patents (2)

    The analysis and modeling of the response of parts of the body provides valuable insight into many features of the head-related transfer function (HRTF). In spatial sound simulations, partial models, such as the spherical head model, can also generate simple and effective approximate localization cues. We consider the composition of an approximate HRTF from the responses of structural components b...
  • A cluster centroid method for room response equalization at multiple locations

    Publication Year: 2001, Page(s):55 - 58
    Cited by:  Papers (22)  |  Patents (4)

    We address the problem of simultaneous room response equalization for multiple listeners. Traditional approaches to this problem have used a single microphone at the listening position to measure impulse responses from a loudspeaker and then use an inverse filter to correct the frequency response. The problem with that approach is that it only works well for that one point and in most cases is not...
  • A psychoacoustic model for audio coding based on a cochlear filter bank

    Publication Year: 2001, Page(s):139 - 142
    Cited by:  Papers (1)

    Perceptual audio coders use an estimated masked threshold for the determination of the maximum permissible just-inaudible noise level introduced by quantization. This estimate is derived from a psychoacoustic model mimicking the psychoacoustics of masking. Current applications use a uniform spectral decomposition as first stage of that model to approximate the frequency selectivity of the human au...
  • Empirical and modeled acoustic transfer functions in a simple room: effects of distance and direction

    Publication Year: 2001, Page(s):183 - 186
    Cited by:  Papers (3)  |  Patents (1)

    Empirical transfer functions were measured for a manikin head as a function of source position (relative to the listener) and listener position (relative to the room) for sources within a meter of the listener. Empirical results are compared to room simulations using a standard image-method model combined with anechoic, distance-dependent head-related transfer functions (HRTFs). Results suggest th...
  • Intraframe time-scaling of nonstationary sinusoids within the phase vocoder

    Publication Year: 2001, Page(s):215 - 218

    An observation regarding the nature of the spectrum of a windowed swept-frequency sinusoid is exploited to time-scale (stretch or compress) time-variant sinusoids within the window or frame of an otherwise basic phase vocoder process. Nonstationary sinusoids are more closely represented as a series of windowed linearly swept sinusoids than as a series of windowed constant frequency sinusoids. Both...
  • Multi-Gabor dictionaries for audio time-frequency analysis

    Publication Year: 2001, Page(s):43 - 46
    Cited by:  Papers (9)

    We consider the construction of multiresolution Gabor dictionaries appropriate for audio signal analysis. Motivated by a desire for parsimony and efficiency, we propose and formalise the idea of reduced multi-Gabor systems, showing that they constitute a frame for L²(R) and other Hilbert spaces of interest. In order to demonstrate the practicality of such a scheme, we apply it...
  • Speech segregation based on pitch tracking and amplitude modulation

    Publication Year: 2001, Page(s):79 - 82
    Cited by:  Papers (5)

    Speech segregation is an important task of auditory scene analysis (ASA), in which the speech of a certain speaker is separated from other interfering signals. D.L. Wang and G.J. Brown (see IEEE Trans. Neural Network, vol.10, p.684-97, 1999) proposed a multistage neural model for speech segregation, the core of which is a two-layer oscillator network. We extend their model by adding further proces...
  • The CIPIC HRTF database

    Publication Year: 2001, Page(s):99 - 102
    Cited by:  Papers (250)  |  Patents (7)

    This paper describes a public-domain database of high-spatial-resolution head-related transfer functions measured at the UC Davis CIPIC Interface Laboratory and the methods used to collect the data. Release 1.0 (see http://interface.cipic.ucdavis.edu) includes head-related impulse responses for 45 subjects at 25 different azimuths and 50 different elevations (1250 directions) at approximately 5&d...
  • Effects of thresholding on a small-scale matched filter array

    Publication Year: 2001, Page(s):171 - 174

    Impulse responses (IRs) from eight microphones were measured in a conference room for evaluation with matched filter array (MFA) processing. Since truncation of the filters can be applied to the MFA with little loss in signal-to-noise ratio (SNR), the effects of diffusion and specular reflections are studied. An amplitude threshold is applied to divide the IRs into specular reflections and diffus...
  • The relative salience of auditory motion cues

    Publication Year: 2001, Page(s):111 - 114
    Cited by:  Papers (1)

    The relative salience of auditory motion cues was measured in a series of four experiments. In the first three experiments, all combinations of three different auditory motion cues (intensity changes, Doppler frequency shifts and interaural time delays) were presented at various source trajectories, parallel to the listener's frontal plane. In the first experiment, the velocity of the source was v...
  • Combined spectral envelope normalization and subtraction of sinusoidal components in the ODFT and MDCT frequency domains

    Publication Year: 2001, Page(s):51 - 54
    Cited by:  Papers (3)  |  Patents (17)

    Recent research in high-quality audio coding seeks not only improved coding gains but also new functionalities such as easy semantic access to compressed audio material and audio modification in the compressed domain. These objectives imply the decomposition of the audio signal into several components of specific semantic value, such as sinusoidal components, that take advantage of selective codin...
  • Estimating tempo, swing and beat locations in audio recordings

    Publication Year: 2001, Page(s):135 - 138
    Cited by:  Papers (25)  |  Patents (2)

    The problem of estimating the tempo of audio recordings (the number of beats per minute, or BPM) has received an increasing amount of attention in the past few years. Applications include the synchronization of multiple audio tracks for simultaneous playback, "tempo-synchronous" audio effects, automatic looping of audio tracks, etc. This article presents techniques for estimating the tempo and the ...
  • Algorithm design of a stereophonic acoustic echo canceler system

    Publication Year: 2001, Page(s):179 - 182

    A software program has been designed that successfully runs a stereophonic acoustic echo canceler natively on a personal computer. This is a major achievement since an echo canceler requires that the soundcard's input and output signals are time-synchronous. Synchronizing the audio streams is a great challenge in such an "asynchronous" environment as the operating system of a PC. Furthermore, ster...
  • Predicting time-varying audio spectra from pitch and loudness: related synthesis techniques

    Publication Year: 2001, Page(s):211 - 214

    A technique is described for estimating the time-varying spectrum of an audio signal based on a conditional probability density function (PDF) of spectral coding vectors conditioned on pitch and loudness values. Using this PDF a time-varying output spectrum is generated as a function of time-varying pitch and loudness sequences arriving from an electronic music instrument controller or derived fro...
  • Discrete representation of signals on a logarithmic frequency scale

    Publication Year: 2001, Page(s):39 - 42
    Cited by:  Papers (1)

    Logarithmic frequency representation plays an important role in many audio and acoustic signal processing applications. This article presents a methodology for frequency-warped signal processing where the frequency representation is logarithmic above a certain limit frequency. It is demonstrated how this approach can be used with FFT or linear prediction to perform non-parametric, or parametric co...
  • Approximate Kalman filtering for the harmonic plus noise model

    Publication Year: 2001, Page(s):75 - 78
    Cited by:  Papers (5)

    We present a probabilistic description of the harmonic plus noise model (HNM) for speech signals. This probabilistic formulation permits maximum likelihood (ML) parameter estimation and speech synthesis becomes a straightforward sampling from a distribution. It also permits the development of a Kalman filter that tracks model parameters such as pitch, harmonic amplitudes, and autoregressive coeffi...
  • Efficient evaluation of reverberant sound fields

    Publication Year: 2001, Page(s):203 - 206
    Cited by:  Papers (4)

    An image method due to Allen and Berkley (1979) is often used to simulate the effect of reverberation in rooms. This method is relatively expensive computationally. We present a fast method for conducting such simulations using multipole expansions. For M real and image sources and N evaluation points, while the image method requires O(MN) operations, our method achieves the calculations in O(M + ...