Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No.01TH8575)

24-24 Oct. 2001

  • 2001 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics [front matter]

    Publication Year: 2001, Page(s):0_1 - viii
    Freely Available from IEEE
  • Robust matching of audio signals using spectral flatness features

    Publication Year: 2001, Page(s):127 - 130
    Cited by:  Papers (21)  |  Patents (8)

    Stimulated by the ever-increasing amount of available multimedia data, content-related techniques for the management of audio material have received much interest recently. This paper discusses the problem of robust identification of audio signals by matching them to a known reference. In order to perform well under real-world conditions, the matching process needs to rely on features which are rob...
  • Author index

    Publication Year: 2001, Page(s):231 - 232
    Freely Available from IEEE
  • A psychoacoustic model for audio coding based on a cochlear filter bank

    Publication Year: 2001, Page(s):139 - 142
    Cited by:  Papers (1)

    Perceptual audio coders use an estimated masked threshold for the determination of the maximum permissible just-inaudible noise level introduced by quantization. This estimate is derived from a psychoacoustic model mimicking the psychoacoustics of masking. Current applications use a uniform spectral decomposition as first stage of that model to approximate the frequency selectivity of the human au...
  • Estimating tempo, swing and beat locations in audio recordings

    Publication Year: 2001, Page(s):135 - 138
    Cited by:  Papers (25)  |  Patents (2)

    The problem of estimating the tempo of audio recordings (the number of beats per minute, or BPM) has received an increasing amount of attention in the past few years. Applications include the synchronization of multiple audio tracks for simultaneous playback, "tempo-synchronous" audio effects, automatic looping of audio tracks etc. This article presents techniques for estimating the tempo and the ...
  • Tatum grid analysis of musical signals

    Publication Year: 2001, Page(s):131 - 134
    Cited by:  Papers (12)  |  Patents (2)

    An algorithm for analyzing the rhythmic content of acoustic signals of polyphonic and multitimbral Western music is presented. The analysis consists of detecting sound onsets, computing an inter-onset interval (IOI) histogram, and estimating the duration of the shortest notes, i.e. the tatum period, from the histogram. Robustness against tempo changes has explicitly been built into the system by u...
  • Robust time-delay estimation in highly adverse acoustic environments

    Publication Year: 2001, Page(s):59 - 62
    Cited by:  Papers (3)

    This paper describes an algorithm for robust time-delay estimation (TDE) in situations where a large amount of additive noise and reverberation is present. J. Benesty (see Journal Acoust. Soc. of America, vol.107, no.1, p.384-91, 2000) developed an adaptive eigenvalue decomposition algorithm for TDE between two microphones in highly reverberant acoustic environments. We extend that algorithm to hi...
  • A cluster centroid method for room response equalization at multiple locations

    Publication Year: 2001, Page(s):55 - 58
    Cited by:  Papers (23)  |  Patents (4)

    We address the problem of simultaneous room response equalization for multiple listeners. Traditional approaches to this problem have used a single microphone at the listening position to measure impulse responses from a loudspeaker and then use an inverse filter to correct the frequency response. The problem with that approach is that it only works well for that one point and in most cases is not...
  • MACS: music audio characteristic sequence indexing for similarity retrieval

    Publication Year: 2001, Page(s):123 - 126
    Cited by:  Papers (7)  |  Patents (19)

    We present a prototype method of indexing raw-audio music files in a way that facilitates content-based similarity retrieval. The algorithm tries to capture the intuitive notion of similarity perceived by humans: two pieces are similar if they are fully or partially based on the same score, even if they are performed by different people or at different speed. Local peaks in signal power are identi...
  • Acoustic echo suppression in the STFT domain

    Publication Year: 2001, Page(s):175 - 178
    Cited by:  Papers (28)  |  Patents (6)

    We describe and evaluate an acoustic echo suppression method that operates in the short-time Fourier transform (STFT) domain. The system estimates the short-time spectrum of the acoustic interference component that causes the echo at the far end and subtracts it from the short-time spectrum of the microphone input using a nonlinear spectral subtraction rule, allowing for a trade-off between speech...
  • A fixed point solution for convolved audio source separation

    Publication Year: 2001, Page(s):87 - 90
    Cited by:  Papers (3)

    We examine the problem of blind audio source separation using independent component analysis (ICA). In order to separate audio sources recorded in a real recording environment, we need to model the mixing process as convolutional. Many methods have been introduced for separating convolved mixtures, the most successful of which require working in the frequency domain. This paper proposes a fixed-po...
  • Intraframe time-scaling of nonstationary sinusoids within the phase vocoder

    Publication Year: 2001, Page(s):215 - 218

    An observation regarding the nature of the spectrum of a windowed swept-frequency sinusoid is exploited to time-scale (stretch or compress) time-variant sinusoids within the window or frame of an otherwise basic phase vocoder process. Nonstationary sinusoids are more closely represented as a series of windowed linearly swept sinusoids than as a series of windowed constant frequency sinusoids. Both...
  • Efficient evaluation of reverberant sound fields

    Publication Year: 2001, Page(s):203 - 206
    Cited by:  Papers (4)

    An image method due to Allen and Berkley (1979) is often used to simulate the effect of reverberation in rooms. This method is relatively expensive computationally. We present a fast method for conducting such simulations using multipole expansions. For M real and image sources and N evaluation points, while the image method requires O(MN) operations, our method achieves the calculations in O(M + ...
  • Combined spectral envelope normalization and subtraction of sinusoidal components in the ODFT and MDCT frequency domains

    Publication Year: 2001, Page(s):51 - 54
    Cited by:  Papers (3)  |  Patents (18)

    Recent research in high-quality audio coding seeks not only improved coding gains but also new functionalities such as easy semantic access to compressed audio material and audio modification in the compressed domain. These objectives imply the decomposition of the audio signal into several components of specific semantic value, such as sinusoidal components, that take advantage of selective codin...
  • Locating singing voice segments within music signals

    Publication Year: 2001, Page(s):119 - 122
    Cited by:  Papers (38)  |  Patents (1)

    A sung vocal line is the prominent feature of much popular music. It would be useful to locate the portions of a musical track during which the vocals are present reliably, both as a 'signature' of the piece and as a precursor to automatic recognition of lyrics. We approach this problem by using the acoustic classifier of a speech recognizer as a detector for speech-like sounds. Although singing (...
  • Sweet spot widening for stereophonic sound reproduction

    Publication Year: 2001, Page(s):191 - 194

    The correction of the degradation of the stereophonic illusion due to off-centre listening is investigated. The main idea here is that the directivity pattern of a loudspeaker array should have a well defined shape such that a good stereo sound reproduction is achieved in a large listening area. Optimal digital filters are designed and applied to individual drivers of linear loudspeaker arrays in ...
  • Effects of thresholding on a small-scale matched filter array

    Publication Year: 2001, Page(s):171 - 174

    Impulse responses (IRs) from eight microphones were measured in a conference room for evaluation with matched filter array (MFA) processing. Since truncation of the filters can be applied to the MFA with little loss in signal-to-noise ratio (SNR), the effects of diffusion and specular reflections are studied. An amplitude threshold is applied to divide the IR's into specular reflections and diffus...
  • Separation of harmonic sounds using multipitch analysis and iterative parameter estimation

    Publication Year: 2001, Page(s):83 - 86
    Cited by:  Papers (7)

    A signal processing method for the separation of concurrent harmonic sounds is described. The method is based on a two-stage approach. First, a multipitch estimator is applied to find initial sound parameters which are reliable, but inaccurate and static. In a second stage, more accurate and time-varying sinusoidal parameters are estimated in an iterative procedure, which imposes certain constrain...
  • Predicting time-varying audio spectra from pitch and loudness: related synthesis techniques

    Publication Year: 2001, Page(s):211 - 214

    A technique is described for estimating the time-varying spectrum of an audio signal based on a conditional probability density function (PDF) of spectral coding vectors conditioned on pitch and loudness values. Using this PDF a time-varying output spectrum is generated as a function of time-varying pitch and loudness sequences arriving from an electronic music instrument controller or derived fro...
  • Efficient representation of spatial audio using perceptual parametrization

    Publication Year: 2001, Page(s):199 - 202
    Cited by:  Papers (11)  |  Patents (111)

    We introduce a new scheme for simultaneous placement of a number of sources in auditory space. The scheme is based on an assumption about the relevance of localization cues in different critical bands. Given the sum signal of a number of sources, i.e. a monophonic signal, and a set of parameters (side-information) the scheme is capable of generating a binaural signal by spatially placing the sourc...
  • Accurate estimation in the ODFT domain of the frequency, phase and magnitude of stationary sinusoids

    Publication Year: 2001, Page(s):47 - 50
    Cited by:  Papers (10)  |  Patents (6)

    This paper addresses the extraction of parametric information in an audio coder that uses the MDCT filter bank. The computation of the filter bank is reformulated as a function of the odd-DFT, in order to allow the estimation of the frequency, the phase and the magnitude of stationary sinusoids. Closed expressions delivering accurate estimates are derived and explained, and their implementation an...
  • Discrete representation of signals on a logarithmic frequency scale

    Publication Year: 2001, Page(s):39 - 42
    Cited by:  Papers (1)

    Logarithmic frequency representation plays an important role in many audio and acoustic signal processing applications. This article presents a methodology for frequency-warped signal processing where the frequency representation is logarithmic above a certain limit frequency. It is demonstrated how this approach can be used with FFT or linear prediction to perform non-parametric, or parametric co...
  • Can one hear the volume of a shape?

    Publication Year: 2001, Page(s):115 - 118
    Cited by:  Papers (1)

    The shape of three-dimensional cavities affects the timbral quality of sound sources located within them. Moreover, the resonances of the cavities may impress a sort of pitch to noise-like excitation sounds, and the pitch height is somehow related to the size of the cavity. It is interesting to investigate how differently-shaped enclosures give rise to different perceived pitches. From a first exp...
  • Robustness analysis of GSVD based optimal filtering and generalized sidelobe canceller for hearing aid applications

    Publication Year: 2001, Page(s):31 - 34
    Cited by:  Papers (6)

    Multi-microphone noise reduction techniques for hearing aid applications go together with the use of small-sized arrays. Considerable noise reduction can be achieved with such arrays, but at the expense of an increased sensitivity to model errors or a priori assumptions. We evaluate the robustness of the generalized sidelobe canceller (GSC) and a generalized singular value decomposition (GSVD) bas...
  • Performance bounds on sound field reproduction using a loudspeaker array

    Publication Year: 2001, Page(s):187 - 190

    A fundamental problem in acoustic signal processing is how best to use an array of loudspeakers to reproduce a sound field. We derive performance bounds on how well a given loudspeaker array can reproduce a plane-wave sound field within a spherical region of space. The development is based on spherical harmonics analysis.