Applications of Signal Processing to Audio and Acoustics, 1997. 1997 IEEE ASSP Workshop on

Date 19-22 Oct. 1997

  • Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics

    Freely Available from IEEE
  • Voice source localization for automatic camera pointing system in videoconferencing

    This paper describes the voice source localization algorithm used in the PictureTel automatic camera pointing system (LimeLight™, dynamic speech locating technology). The system uses a microphone array 46 cm wide and 30 cm high, which contains 4 microphones and is mounted on top of the monitor. The three-dimensional position of a sound source is calculated from the time delays of 4 pairs of microphones. In time delay estimation, the averaging of signal onsets of each frequency band is combined with phase correlation to reduce the influence of noise and reverberation. With this approach, a small microphone array can provide reliable three-dimensional voice source localization. Post-processing based on a priori knowledge is also introduced to eliminate the influence of reflections from furniture such as tables. Results of speech source localization under real conference room conditions are given. Some system-related issues are also discussed.
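
As a rough illustration of how a time delay maps to a direction, the sketch below converts the delay at a single microphone pair into a far-field arrival angle. It is a simplified stand-in, not the paper's three-dimensional method: the far-field assumption, the speed of sound, and the 0.46 m spacing (borrowed from the array width; the actual microphone spacing is not stated) are all assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def bearing_from_delay(delay_s, mic_spacing_m, c=SPEED_OF_SOUND):
    """Far-field arrival angle in degrees from broadside for one microphone pair."""
    ratio = c * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp small numerical overshoot
    return math.degrees(math.asin(ratio))


# Hypothetical example: a 0.46 m pair and a source 30 degrees off broadside.
true_delay = 0.46 * math.sin(math.radians(30.0)) / SPEED_OF_SOUND
print(round(bearing_from_delay(true_delay, 0.46), 1))
```

A full 3-D position would need several such pairs and a near-field geometric solve, which this sketch does not attempt.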

  • Efficient blind separation of convolved sound mixtures

    We present an extension to recent approaches to blind source separation. Bell and Sejnowski (see Neural Computation 7, MIT Press, Cambridge, MA, 1996) proposed a robust algorithm for separating instantaneous mixtures. Extensions were proposed by Torkkola (see IEEE Workshop on Neural Networks for Signal Processing, Kyoto, Japan, 1996) and Lee et al. (see Advances in Neural Information Processing Systems 9, MIT Press, Cambridge, MA, 1997) for separating convolved mixtures, but the computational overhead and the convergence behavior of these algorithms were not ideal. A frequency domain extension is presented which improves the stability and the performance of these algorithms.

  • Sound localization of concurrent and continuous speech sources in reverberant environment

    This paper presents a model-based method for sound localization of concurrent and continuous speech sources in a reverberant environment. A new algorithm adopted from the echo-avoidance model of the precedence effect was used to detect the echo-free onsets by specifying a generalized pattern of impulse response. Fine structure time differences were calculated from the zero-crossing points in different microphones. They were integrated into an azimuth histogram by the restrictions between them. Two sound sources were localized in both an anechoic chamber and a normal room which has walls, floor and ceiling made of concrete. The time segment needed for localization was 0.5 to 2 seconds and the accuracy was a few degrees in both environments.

  • A microelectronic core for a programmable digital hearing aid

    We introduce a core for a digital hearing aid that compensates the speech signal for listeners with sensorineural impairment, with the aim of improving intelligibility. The technique is based on a digital analysis/synthesis of speech: we divide the input signal into short time blocks, then perform a multiband analysis, non-linear amplification and synthesis based on a sinusoidal model of the voice, according to the subject's dynamic range in each band. The system works in real time and has been implemented with only one ASIC in 1 μm ES2 technology, including 3 RAM memories with a capacity of 2432 bits and one 16×16 multiplier. The size of the die is 30.59 mm².

  • Modeling of reflections and air absorption in acoustical spaces: a digital filter design approach

    A method is presented for modeling sound propagation in rooms using a signal processing approach. Low-order digital filters are designed to match sound propagation transfer functions calculated from boundary material and air absorption data. The technique is applied to low-frequency finite difference time domain (FDTD) simulation of room acoustics and to real-time image-source based virtual acoustics.

  • Some properties of tail-canceling IIR filters

    Infinite impulse response (IIR) recursive linear digital filters are widely used because of their low computational cost and low storage overhead requirements. Finite impulse response (FIR) filters, on the other hand, allow the possibility of implementing linear-phase linear digital filters which have constant group delay across all frequencies. The tradeoff is that to achieve similar magnitude transfer functions, FIR filters usually require much larger filter orders than their IIR counterparts. We describe an algorithm for the efficient implementation of certain classes of FIR filters. We introduce an extension of the truncated IIR (TIIR) algorithm which allows the truncation of arbitrary IIR filter tails. Our algorithm allows the possibility of implementing polynomial impulse responses. Additionally, we present an analysis of the effects of limited numerical precision and provide design guidelines for designing systems with acceptable noise tolerance.
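
The basic tail-canceling idea can be sketched for the simplest case, a truncated exponential impulse response h[k] = a^k for 0 <= k < N. The one-pole recursion below subtracts a delayed, scaled copy of the input so that the response ends after N samples. This illustrates only the first-order textbook case with a stable pole; the function names are hypothetical, and the paper's extension to arbitrary tails and polynomial responses is not reproduced.

```python
def truncated_exponential_fir(x, a, n_taps):
    """Direct FIR: h[k] = a**k for 0 <= k < n_taps (cost grows with n_taps)."""
    return [
        sum(a ** k * x[n - k] for k in range(n_taps) if n - k >= 0)
        for n in range(len(x))
    ]


def truncated_exponential_tiir(x, a, n_taps):
    """Same response via a one-pole recursion plus a delayed tail-canceling term."""
    tail_gain = a ** n_taps
    y, prev = [], 0.0
    for n, sample in enumerate(x):
        delayed = x[n - n_taps] if n >= n_taps else 0.0
        # y[n] = a*y[n-1] + x[n] - a**N * x[n-N]
        prev = a * prev + sample - tail_gain * delayed
        y.append(prev)
    return y
```

The recursive form needs a fixed number of operations per sample regardless of N. With a pole on or outside the unit circle, rounding errors in the recursion would accumulate, which is where a precision analysis becomes important.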

  • A pitch-based approach to time-delay estimation of reverberant speech

    Generalized cross-correlation (GCC) has been the traditional method for estimating the relative time-delay associated with speech signals received by a pair of microphones in a reverberant, noisy environment. The filtering criterion employed is either focussed on the signal degradations due to additive noise or those due exclusively to multipath channel effects. There has been relatively little success at applying GCC weighting schemes which are robust to both of these conditions. This paper details an alternative approach which attempts to employ a signal dependent criterion, namely the estimated periodicity of harmonic spectral intervals, to design a GCC filter appropriate for the combination of noise and multipath signal distortions. Simulations are performed across a range of room conditions to illustrate the utility of the proposed time-delay estimation method relative to conventional GCC filtering approaches.
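
For readers unfamiliar with GCC, the following minimal sketch shows one conventional weighting, the phase transform (PHAT), using a naive DFT so that it needs only the standard library. It is an illustration of the baseline technique, not the pitch-based weighting proposed in the paper. The function names and the short test signal are hypothetical, and the delay is treated as circular for simplicity.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def gcc_phat_delay(x1, x2):
    """Estimate the circular delay (in samples) of x2 relative to x1."""
    n = len(x1)
    X1, X2 = dft(x1), dft(x2)
    cross = [b * a.conjugate() for a, b in zip(X1, X2)]
    # PHAT weighting: keep only the phase of the cross-spectrum.
    weighted = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    # Inverse DFT (unnormalized) gives the correlation over lags 0..n-1.
    gcc = [
        sum(weighted[k] * cmath.exp(2j * cmath.pi * k * lag / n) for k in range(n)).real
        for lag in range(n)
    ]
    peak = max(range(n), key=lambda lag: gcc[lag])
    return peak if peak <= n // 2 else peak - n  # map to a signed lag


# Hypothetical test signal: a few pulses, then the same frame delayed by 3 samples.
frame = [0.0] * 32
frame[5], frame[9], frame[20] = 1.0, -0.5, 0.25
delayed = [frame[(t - 3) % 32] for t in range(32)]
print(gcc_phat_delay(frame, delayed))
```

A practical implementation would use an FFT, zero-padding to avoid circular wrap-around, and sub-sample interpolation of the peak.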

  • Adaptive beamforming with partitioned frequency-domain filters

    In this paper an adaptive broadband beamformer is presented which is based on a partitioned frequency-domain least-mean-square algorithm (PFDLMS). This block algorithm is known for its efficient computation and fast convergence even when the input signals are correlated. In applications where long filters are required but only a small processing delay is allowed, a frequency domain adaptive beamformer without partitioning demands a large FFT length despite the small block size. The FFT length can be shortened significantly by filter partitioning, without increasing the number of FFT operations. The weaker requirement on the FFT size makes the algorithm attractive for acoustical applications.

  • Superdirective microphone array for a set-top videoconferencing system

    In set-top videoconferencing, the complete videoconferencing system fits unobtrusively on top of the television. The microphone sound pickup system is one of the most important functional blocks, with constraints of small size, high performance, and low cost. Persons speaking several feet away from the system must be picked up satisfactorily, while noise generated internally by the cooling fan and hard drive, and noise generated externally by air conditioning and nearby computers, must be attenuated. In this paper, a three-microphone superdirective array is described which meets these constraints. An analog highpass and lowpass filter are used to merge two of the microphone signals to form a single channel, so that a single stereo A/D converter is required to process the three microphone signals. The microphone signals are then linearly combined so as to maximize the signal-to-noise ratio, resulting in nulls steered toward nearby objectionable noise sources.

  • Adaptive noise cancellation with directional microphones

    The spatial correlation function between directional microphones is useful in the design and analysis of the performance of these microphones in actual acoustic noise fields. These correlation functions are well known for omnidirectional receivers, but not well known for directional receivers. This paper investigates the spatial correlation functions for Nth-order differential microphones in spherically isotropic noise fields. The results are used to calculate the amount of achievable cancellation from an adaptive noise cancellation application using combinations of differential microphones to remove unwanted noise from a desired signal. The results are also useful in determining signal-to-noise ratio gains from arbitrarily positioned differential microphone elements in microphone array applications.
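
The well-known omnidirectional case mentioned above can be written down directly: for two omnidirectional receivers in a spherically isotropic noise field, the spatial correlation at wavenumber k and spacing d is sin(kd)/(kd). The sketch below evaluates that baseline only; the speed of sound is an assumed value, the function name is hypothetical, and the paper's results for Nth-order differential microphones are not reproduced.

```python
import math


def diffuse_field_coherence(freq_hz, spacing_m, c=343.0):
    """Spatial correlation of two omnidirectional sensors in isotropic noise."""
    kd = 2.0 * math.pi * freq_hz * spacing_m / c
    return 1.0 if kd == 0.0 else math.sin(kd) / kd


# Hypothetical 5 cm spacing: correlation falls from 1 as frequency rises.
for f in (0.0, 500.0, 1000.0, 3430.0):
    print(f, round(diffuse_field_coherence(f, 0.05), 3))
```

The first zero occurs where the spacing equals half a wavelength, which is 3430 Hz for the assumed 5 cm spacing.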

  • Interpolation and extrapolation of room transfer functions based on common acoustical poles and their residues

    We propose a new method of modeling a room transfer function (RTF) that uses common acoustical poles and their residues. The common acoustical poles correspond to the resonance frequencies (eigenvalues) of the room, and their residues are composed of the eigenfunctions of the source and receiver positions in the room. Because the common acoustical poles do not depend on the source and receiver positions, this model expresses the RTF variations due to changes in the source and receiver positions by using residue variations. We also propose methods of interpolating and extrapolating RTFs based on the proposed common-acoustical-pole and residue model. Computer simulation demonstrated that unknown RTFs can be well estimated from known (measured) RTFs by using these methods.

  • Robust time delay estimation for sound source localization in noisy environments

    This paper addresses the problem of robust localization of a sound source in a wide range of operating environments. We use fractional lower order statistics in the frequency domain of two-sensor measurements to accurately locate the source in impulsive noise. We demonstrate a significant improvement in detection via simulation experiments of a sound source in α-stable noise. Applications of this technique include the efficient steering of a microphone array in teleconference applications.

  • Virtual-loudspeakers-based multichannel sound system

    We investigate the 3D virtual-loudspeakers-based multichannel sound system. This system uses head-related transfer functions (HRTFs) as the directional perception cues and makes the transmission paths transparent by using crosstalk cancellers. We propose both forward and feedback types of crosstalk cancellation systems and compare their complexity and their performance in terms of equalization and crosstalk suppression factors.

  • Analysis of nonlinear and nonstationary processes in speech production

    Several techniques used in the analysis of dynamic nonlinear systems are applied to reveal and analyse some of the short-term nonlinear, nonstationary characteristics of speech signal production. A new method of speech signal decomposition is introduced.

  • Continuously signal-adaptive filterbank for high-quality perceptual audio coding

    Historically, the choice of the optimum filterbank has been the subject of much research and discussion in the development of perceptual audio coders. Desirable properties of a good filterbank include both a good extraction of the signal's redundancy and effective utilization of that redundancy while maintaining control over perceptual demands. Often, there is a conflict between the use of perceptual constraints and the redundancy extraction, in that a filterbank with good resolution in both time and frequency is needed. Recently, a method for performing temporal noise shaping (TNS) of the error signal of a perceptual audio coder has been proposed, providing control over both the time and frequency structure of the coding noise. This paper focuses on the core part of the scheme, forming a continuously adaptive filterbank, and discusses its theoretical background, properties and limitations.

  • Head tracked 3-D audio using loudspeakers

    Existing loudspeaker 3-D audio systems suffer from a fixed listening location. This paper proposes using a head tracker to steer the equalization zone to the position of the tracked listener. Sound localization experiments show that this strategy greatly improves localization when the listener is displaced from the ideal listening location, and also enables dynamic localization cues.

  • Establishing the tonal context for musical pattern recognition

    We develop a method for establishing tonal contexts of musical patterns in a musical composition. This is subsequently incorporated into a system for recognition of musical patterns. Krumhansl's (1990) key-finding algorithm is used as a basis. The sequence of maximum correlations that it outputs is smoothed with a cubic spline and is used to determine weights for perceptual and absolute pitch errors. Statistically significant maximum correlations are used to create the assigned key sequence, which is then median filtered to improve the structure of the output of the key finding algorithm.
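
The correlation step of a Krumhansl-style key finder can be sketched as follows: a 12-bin vector of pitch-class durations is correlated with each rotation of a major and a minor key profile, and the best match is taken as the key. The profile values below are the Krumhansl-Kessler figures as commonly quoted and should be checked against the original source before being relied on. The function names are hypothetical, and the paper's spline smoothing, significance testing, and median filtering are not shown.

```python
import math

# Key profiles as commonly quoted (assumption: verify against Krumhansl 1990).
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(u, v):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den


def find_key(durations):
    """Return (key_name, correlation) for a 12-bin pitch-class duration vector."""
    best = None
    for mode, profile in (("major", MAJOR), ("minor", MINOR)):
        for tonic in range(12):
            # Rotate so that profile[0] lines up with pitch class `tonic`.
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            r = pearson(durations, rotated)
            if best is None or r > best[1]:
                best = (f"{NAMES[tonic]} {mode}", r)
    return best
```

Running `find_key` over successive windows of a piece yields the sequence of maximum correlations and key labels that the later smoothing stages would operate on.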

  • A DSP implementation of a digital hearing aid with recruitment of loudness compensation and acoustic echo cancellation

    This paper describes a DSP implementation of a digital hearing aid realized in the frequency domain that compensates for recruitment of loudness and cancels acoustic echoes. In contrast to conventional systems which are based on a noise-probe signal, our echo canceler is adapted using only the available (e.g. speech) input signal. The main problems caused by a nonlinear feedforward filter are discussed using analytical results of the steady state behavior of the closed-loop hearing-aid system. The implemented DSP system is tested with a dummy behind-the-ear (BTE) hearing-aid device on a KEMAR head and results are presented.

  • Noiseless coding of quantized spectral components in MPEG-2 Advanced Audio Coding

    Advanced Audio Coding (AAC), part of ISO/MPEG-2, was issued as an international standard in April 1997. It supports single or multiple channel audio programs and delivers excellent audio quality at or below 64 kbps/channel by exploiting the compression capabilities of a high-resolution filterbank, backward-adaptive prediction, joint channel coding, nonlinear quantizers and noiseless (Huffman) coding. This paper describes the flexible Huffman coding algorithm used in AAC and discusses the compression provided by this component of the standard.
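
To illustrate what noiseless Huffman coding does to quantized values, the sketch below builds a generic Huffman prefix code from symbol counts. It is a textbook construction only: the standard's own codebooks and the way AAC applies them to spectral coefficients are not reproduced here, and the function name and sample data are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from a sequence of symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + code for s, code in c0.items()}
        merged.update({s: "1" + code for s, code in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantized spectral values, dominated by zeros.
quantized = [0, 0, 0, 0, 1, 1, -1, 2]
code = huffman_code(quantized)
bits = sum(len(code[s]) for s in quantized)
print(code, bits)
```

Frequent values receive short codewords, so a block dominated by small magnitudes costs fewer bits than a fixed-length representation.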

  • Compression circuit of a multiband analog system for hearing aid

    This paper describes the design and evaluation of a circuit which performs the compression of narrow-band signals within a multiband analog system for a hearing aid. The system has twelve narrow-band modules. Each module is formed by four stages. The first stage is a band-pass filter which selects the bandwidth of the module. The second stage, the object of this paper, is a compression circuit which performs a nonlinear operation for removing the attack and release times typical of automatic gain control systems. The third stage is another band-pass filter like the first, the function of which is to reduce the distortion produced by the compression stage. The fourth stage is a controlled linear gain stage. The simulation and experimental results obtained show that the compression circuit has good accuracy within the dynamic range of speech signals. The output narrow-band filter greatly reduces the harmonic and intermodulation distortion inherent in all compression systems when the input signal has some important formants very close together.

  • Alias-free, multiresolution sinusoidal modeling for polyphonic, wideband audio

    We describe an improved method of generating more accurate sinusoidal parameters (amplitude, frequency, phase) from a wideband polyphonic audio source in a multiresolution, non-aliased fashion. This significantly improves upon previous work on sinusoidal modeling that assumes a single-pitched monophonic source, such as speech or an individual musical instrument. In addition to a more general analysis, we can now perform high-quality transformations such as time-stretching and pitch-shifting on polyphonic audio with ease.

  • Design of a broadside array for a binaural hearing aid

    This paper describes the design and implementation of a binaural directional hearing aid. This hearing aid consists of a microphone array of five directional microphones integrated into the front of a pair of spectacles. The signals of the microphones are processed with the aid of double beamforming into a left-ear and a right-ear signal. The directivity pattern of the left-ear signal has its main lobe at a small angle to the left, and the directivity pattern of the right-ear signal at a small angle to the right. These different main lobes cause an interaural level difference (ILD). In natural conditions, an ILD enables the human auditory brain to localize sound sources and to significantly improve speech intelligibility in noise. A computer simulation and an implementation in analogue electronics show that the main lobes for the left-ear and right-ear signals realize sufficient ILD at high frequencies to enable an effective localization of sound sources.
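
As a small worked example of the quantity involved, the interaural level difference of two signals can be expressed as the ratio of their RMS levels in decibels. The sketch below assumes broadband signals given as plain sample lists; the function names and test data are hypothetical, and the band-wise analysis that a study of high-frequency ILD would need is not shown.

```python
import math


def rms(samples):
    """Root-mean-square level of a non-empty sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def ild_db(left, right):
    """Interaural level difference in dB; positive means the left ear is louder."""
    return 20.0 * math.log10(rms(left) / rms(right))


# Hypothetical signals: the left-ear signal has twice the amplitude of the right.
right_ear = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(200)]
left_ear = [2.0 * v for v in right_ear]
print(round(ild_db(left_ear, right_ear), 2))
```

Doubling the amplitude at one ear corresponds to an ILD of about 6 dB.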

  • Filter bank constraints for subband and frequency-domain adaptive filters

    For many years now, subband and frequency-domain adaptive filtering techniques have been proposed for the cancellation of long acoustic echoes. Classical LMS-based algorithms are less attractive, as their computational load is higher and their convergence behaviour for coloured far-end inputs is worse. We specify 3 realization conditions for DFT-modulated subband schemes. Standard subband adaptive filters cannot fulfil all conditions. We show that frequency-domain based algorithms can be considered as a special case of subband adaptive filtering and that the realization conditions can be fulfilled in this case.

  • An efficient HRTF model for 3-D sound

    A simple model is presented for synthesizing binaural sound from a monaural source. The model produces vertical as well as horizontal and externalization effects. The simplicity of the model permits efficient implementation, allowing for real-time multisource operation. Additionally, the parameters in the model can be adjusted to fit a particular individual's characteristics.
