1993 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics: Final Program and Paper Summaries

Date 17-20 Oct. 1993

  • Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

  • Parametric approximation of room impulse responses based on wavelet decomposition

    Page(s): 68 - 71

    A new approach to the approximation and real-time simulation of room impulse responses is presented. Based on wavelet decomposition of measured impulse response data, an energy-time-frequency representation of the system room is obtained. The wavelet coefficients in the frequency subbands are calculated by a multirate analysis filter bank providing aliasing-free subband processing and linear-phase filters. In a second step, a modification of the Prony method is used to obtain the parameters of cascaded moving average comb filter structures. Combining the approximated subband signals by a synthesis filter bank with perfect reconstruction properties gives an approximation of the broadband impulse response.

  • Hearing aids for profoundly deaf people based on a new parametric concept

    Page(s): 89 - 92

    People with severe hearing loss have only a minor part of the frequency range available for reception of information in speech signals. These people do not benefit from normal hearing aids, as the information in the high-frequency parts of the speech is not available to them. To overcome this problem, the authors have developed a new method that presents information from the frequency range of interest in the frequency range available to the hearing disabled. By parametrically modeling the speech production system, transforming the speech production model to match the available frequency range, and finally resynthesizing the speech using this transformed model, one can present the speech information of interest in a frequency range of choice. This concept is believed to reduce wideband background noise, which is a problem for the hearing disabled as well as for people with normal hearing ability.

  • Robust real-time constrained hearing aid arrays

    Page(s): 81 - 84

    The paper addresses the implementation of a real-time, robust, adaptive spatial filter used as a preprocessor for a monaural hearing aid. The goal of the ongoing study is the development of a processor that provides the user with spatial selectivity and an attenuation of undesired interfering sources, while robustly controlling the response to a desired source. A four-microphone, real-time, robust processor has been implemented, and preliminary results are discussed in terms of improvement in SNR and an intelligibility measure.

  • Real-time generation of interactive virtual auditory environments

    Page(s): 106 - 109

    Virtual auditory environments refer to a procedure in which auditory environments are created by means of a computer model (Lehnert & Blauert 1991). These artificial environments are perceived as being natural, and they create the impression of being present in another physical space. The sense of tele-presence can be greatly improved by making these environments interactive, that is, by allowing the subject to move and act naturally in such an environment. To this end, the behaviour of the subject, namely the rotations and translations of the head, has to be monitored and responded to by the computer model. In the course of the European ESPRIT research project SCATIS (Spatially Coordinated Auditory/Tactile Interactive Scenario), a system, the SCAT-LAB, is under development which generates interactive virtual environments for the tactile and the auditory modality. In this paper some of the design aspects for the development of the auditory part of the SCAT-LAB are presented. Although taken from a specific project, most of the results are regarded as being quite general for the task of creating interactive virtual environments with today's technology.

  • A new technique to measure electroacoustic transducer directivity indices in reverberant fields

    Page(s): 64 - 67

    The paper presents a new method for measuring the directivity index of an electroacoustic transducer in a diffuse reverberant environment. The method that is proposed relies on the measurement of the spectral density variance of the transfer function between source and receiver. The method requires a measurement of the source/receiver transfer function, the distance between source and receiver, the directivity of either of the transducers, and an estimate of the room constant. A variant of the method eliminates the room constant variable. The modified method is a comparison technique that requires a known source and receiver directivity and the distance between source and receiver. The methods and their limitations are discussed for computer simulated rooms and actual measurements made in a reverberant room.

  • Analog/digital hybrid VLSI signal processing using single-bit modulators

    Page(s): 136 - 139

    A hybrid analog/digital technique for efficient VLSI implementation of signal processing systems is presented. Single-bit delta-sigma modulators are used to modulate analog inputs into a form which can be considered simultaneously analog and digital, and directly manipulated as such. A cross-correlator is proposed, demonstrating the compactness of VLSI signal processing systems using this approach.

  • Autocorrelation method for high-quality time/pitch-scaling

    Page(s): 131 - 134

    A new method is described for high-quality time or pitch modifications of audio signals. The method is a simple but efficient improvement of the splice method. Thanks to its simplicity, the algorithm can be implemented to run in real-time on standard microprocessors. Informal listening tests have demonstrated the method's capability to modify high-quality audio signals without introducing audible artifacts for moderate modification factors (up to 15%).

  • Robust adaptive processing of microphone array data for hearing aids

    Page(s): 77 - 80

    The problem of adaptively combining the outputs of an array of microphones as a single input for a hearing aid is investigated. A robust processor based on a constrained minimum variance optimization approach is used. One fundamental criterion employed in designing this robust beamformer limits the amount of cancellation of the desired signal. The results presented include the effects of acoustic headshadow, small room reverberation, microphone placement uncertainty, and desired speaker location uncertainty. Performance improvement is measured as a predicted change in the speech reception threshold (SRT) between single microphone and multi-microphone conditions. Performance improvements are demonstrated relative to the “best” single microphone in the array for block optimum and adaptive spatial filters. The performance of the block optimum arrays is shown to be attainable with adaptive implementations. A fast-attack, slow-release input signal power averager allows the adaptive processor to avoid instabilities commonly experienced with nonstationary, impulsive inputs such as speech.

  • Perceptual consequences of interpolating head-related transfer functions during spatial synthesis

    Page(s): 102 - 105

    In implementing a spatial auditory display, many engineering compromises must be made to achieve a practical system. One such compromise involves devising methods for interpolating between the head-related transfer functions (HRTFs) used to synthesize spatial stimuli in order to achieve smooth motion trajectories and locations at finer resolutions than the empirical data. The perceptual consequences of interpolation can only be assessed by psychophysical studies. This paper compares three subjects' localization judgments for stimuli synthesized from non-interpolated HRTFs, simple linear interpolations of the empirical HRTFs, non-interpolated minimum-phase approximations of the HRTFs, and linear interpolations of the minimum-phase HRTFs. The empirical HRTFs used were derived from a different subject (SDO) in a previous study by Wightman and Kistler (1989), whose data are provided with the Convolvotron synthetic 3D audio system. In general, the three subjects showed the same high rates of front-back and up-down confusions that were observed in a recent experiment using non-individualized (non-interpolated) transforms from SDO. However, there were no obvious differences in localization accuracy between the different types of synthesis conditions.

  • Frequency-independent beamforming

    Page(s): 60 - 63

    The beamwidth of a linear array decreases as frequency increases. For broadband beamformers such as microphone arrays for teleconferencing, this frequency dependence implies that signals incident on the outer portions of the main beam are subject to an undesirable lowpass filtering process. In the paper several ways of attaining beamwidth constancy are discussed, including a novel method based on superimposing several marginally steered beams to form a constant beamwidth multi-beam. This method provides an analytically tractable framework for designing realizable constant beamwidth beamformers.
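
    The lowpass effect described in this abstract can be illustrated with a minimal sketch. It is not the paper's constant-beamwidth method; it only evaluates a plain, unsteered delay-and-sum line array. The array geometry (8 microphones, 4 cm spacing), the speed of sound, and the function name array_response are assumptions chosen for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def array_response(freq_hz, angle_deg, num_mics=8, spacing_m=0.04):
    """Normalized magnitude response of an unsteered (broadside)
    delay-and-sum line array to a plane wave arriving angle_deg
    away from broadside. Geometry defaults are hypothetical."""
    # Phase difference between adjacent microphones for this arrival angle.
    phase_step = (2.0 * math.pi * freq_hz * spacing_m
                  * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND)
    # Sum the unit-weight microphone signals and normalize by the count.
    total = sum(cmath.exp(1j * m * phase_step) for m in range(num_mics))
    return abs(total) / num_mics


# A source 10 degrees off axis is attenuated more as frequency rises,
# which is the lowpass behaviour at the edge of the main beam.
responses = {f: array_response(f, angle_deg=10.0)
             for f in (500.0, 2000.0, 4000.0)}
for freq, resp in responses.items():
    print(f"{freq:6.0f} Hz: {resp:.3f}")
```

    On axis the response is unity at every frequency, so only off-axis talkers are affected; a constant-beamwidth design aims to flatten the off-axis curve as well.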

  • Generalized overlap-add sinusoidal modeling applied to quasi-harmonic tone synthesis

    Page(s): 165 - 168

    Analysis-by-synthesis/overlap-add (ABS/OLA) sinusoidal modeling has been successfully demonstrated as an accurate, flexible, and computationally tractable representation for the purposes of speech modification and harmonic tone synthesis; however, the model formulation used to synthesize these signals does not take full advantage of the structure of quasi-harmonic music signals. This paper describes a generalized overlap-add sinusoidal model formulation that accounts for the time-frequency behavior of quasi-harmonic tones and which reduces to the previous formulation as a special case.

  • Directional microphones in computer simulated and real rooms

    Page(s): 56 - 59

    The subjective effects of utilizing highly directional microphones in a teleconferencing setting are not well understood. Computer simulation of both complex microphone systems and room environments offers one opportunity to study the combined effects. A complex microphone system can be modeled as a collection of point microphones distributed in space and summed with appropriate time delays. Established room simulation modeling methods were used. Combining the two models, with appropriate attention to the spatial and time resolution of the microphone model, allows calculation of the impulse response of the complete structure in an arbitrary room environment. The resulting impulse response can be used as a filter for speech. Results of such simulations are presented and compared with real room results.

  • Computation of modulation spectra for the speech transmission index using real speech

    Page(s): 110 - 113

    While it has been suggested that the speech transmission index (STI) for an environment may be calculated using speech rather than test signals, computational artifacts distort the speech analyses, whereas they have minimal impact on analyses with test signals. This report documents some of the difficulties encountered when using speech as the probe stimulus and proposes modifications in STI computations to circumvent some of the problems.

  • The 2-D digital waveguide mesh

    Page(s): 177 - 180

    An extremely efficient method for modeling wave propagation in a membrane is provided by the multidimensional extension of the digital waveguide. The 2-D digital waveguide mesh is constructed out of bi-directional delay units and scattering junctions. We show that it coincides with the standard finite difference scheme in the lossless case. Wave propagation in the mesh is compared with wave propagation in an ideal membrane; the dissipation and dispersion error is derived.
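
    The claimed equivalence with a finite difference scheme can be checked numerically with a small sketch. It assumes one common formulation that the abstract does not spell out: a rectilinear mesh of lossless four-port junctions with equal impedances, joined by unit delays. The grid size, the wrap-around edges (used only to avoid boundary handling), and the names mesh_step and fd_step are all hypothetical.

```python
SIZE = 8  # hypothetical grid size; edges wrap around (torus)


def mesh_step(incoming):
    """One time step of the waveguide mesh.

    incoming[p][i][j] is the wave arriving at junction (i, j) on port p
    (0 = north, 1 = south, 2 = west, 3 = east). Returns the junction
    values at this step and the incoming waves for the next step.
    """
    # Lossless equal-impedance 4-port junction: value = (2/4) * sum of inputs.
    v = [[0.5 * sum(incoming[p][i][j] for p in range(4)) for j in range(SIZE)]
         for i in range(SIZE)]
    nxt = [[[0.0] * SIZE for _ in range(SIZE)] for _ in range(4)]
    for i in range(SIZE):
        for j in range(SIZE):
            # Outgoing wave on each port = junction value minus incoming wave.
            out = [v[i][j] - incoming[p][i][j] for p in range(4)]
            # After one unit delay each wave arrives at the neighbour's
            # opposite port.
            nxt[1][(i - 1) % SIZE][j] = out[0]
            nxt[0][(i + 1) % SIZE][j] = out[1]
            nxt[3][i][(j - 1) % SIZE] = out[2]
            nxt[2][i][(j + 1) % SIZE] = out[3]
    return v, nxt


def fd_step(v_now, v_prev):
    """Finite difference update: half the neighbour sum minus the old value."""
    return [[0.5 * (v_now[(i - 1) % SIZE][j] + v_now[(i + 1) % SIZE][j]
                    + v_now[i][(j - 1) % SIZE] + v_now[i][(j + 1) % SIZE])
             - v_prev[i][j] for j in range(SIZE)] for i in range(SIZE)]


# Excite the centre junction equally on all four ports.
incoming = [[[0.0] * SIZE for _ in range(SIZE)] for _ in range(4)]
for p in range(4):
    incoming[p][4][4] = 0.25

mesh_history = []
for _ in range(12):
    v, incoming = mesh_step(incoming)
    mesh_history.append(v)

# Seed the finite difference scheme with the first two mesh states.
fd_history = mesh_history[:2]
for n in range(2, 12):
    fd_history.append(fd_step(fd_history[n - 1], fd_history[n - 2]))

max_diff = max(abs(a - b)
               for vm, vf in zip(mesh_history, fd_history)
               for rm, rf in zip(vm, vf)
               for a, b in zip(rm, rf))
print("largest mesh/finite-difference mismatch:", max_diff)
```

    Under these assumptions the two recursions produce the same junction values once the finite difference scheme is seeded with two consecutive mesh states; the sketch does not cover boundaries or losses.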

  • Hierarchic models of hearing for sound separation and reconstruction

    Page(s): 157 - 160

    In building a machine to detect and segregate individual components in sound mixtures, the best example to copy is the human auditory system. Several models of auditory organization implement various rules of psychoacoustic grouping. We propose, in addition, to model auditory inference as exhibited in the well-known 'phonemic restoration illusion' of Warren (1970). A hierarchy of abstracted features and source hypotheses similar to that of Nawab (1992) allows reconstruction of obliterated detail, which can then be used to recreate an 'idealized' sound without corruption. A preliminary example of fitting a harmonic model to a noisy recording of a clarinet gives a very convincing resynthesis with the interference totally removed. However, there are many issues, including the design of the representation and the control architecture, still to be addressed in building a more general system.

  • Multidimensional scaling analysis of head-related transfer functions

    Page(s): 98 - 101

    Accurate rendering of auditory objects in a virtual auditory display depends on signal processing that is based on detailed measurements of the human free-field-to-eardrum transfer function (HRTF). The performance of an auditory display can be severely compromised if the HRTF measurements are not made individually, for each potential user. This requirement could sharply limit the practical application of auditory display technology. Thus, we have been working to develop a standard set of HRTFs that could be used to synthesize veridical virtual auditory objects for all users. Our latest effort along those lines has involved a feature analysis of HRTFs from 15 listeners who demonstrated high proficiency localizing virtual sources. The primary objectives were to quantify the differences among HRTFs, to identify listeners with similar and different HRTFs, and to test the localizability of virtual sources synthesized from the HRTFs of an individual with closely and not closely matched HRTFs. We used a multidimensional scaling (MDS) algorithm, a statistical procedure which assesses the similarity of a set of objects and/or individuals, to analyze the HRTFs of the 15 listeners. Listeners with similar HRTFs were identified, and their ability to localize virtual sources synthesized from the HRTFs of a “similar” listener was evaluated. All listeners were able to localize accurately. When these same listeners were tested with virtual sources synthesized from HRTFs that were identified to be “different” by the MDS analysis, both azimuth and elevation of virtual sources were judged less accurately. Although we were able to identify “typical” listeners from the MDS analysis, our preliminary data suggest that several alternative sets of HRTFs may be necessary to produce a usable auditory display system.

  • Current and future standardization of high-quality digital audio coding in MPEG

    Page(s): 43 - 46

    Since 1988, ISO/IEC JTC1/SC29/WG11 (MPEG) has been working on the standardization of video and audio signals. The Audio subgroup of MPEG is working on bit rate reduction systems for high quality digital audio. Since the first phase of this standardization effort has been finished, MPEG/Audio is extending its work to multichannel audio coding systems as well as to medium quality coding at lower sampling frequencies and lower bit rates. Future standardization work aims at a next-generation coder suitable for high quality audio transmission and storage at bit rates of 64 kb/s per channel and well below.

  • Interpolation of forced structural responses from non-uniform sparse measurements

    Page(s): 26 - 29

    This paper presents a method for interpolating a sparse set of nonuniformly spaced velocity measurements on the surface of a vibrating structure. The method utilizes knowledge of the physical nature of the vibrating structure specified in terms of a given bound on the energy of the excitation forces, estimated mobilities of the structure, and a known set of sparse velocity measurements. To minimize the maximum possible error of the estimated surface velocities, the method employs an estimation approach derived from the theory of optimal signal recovery. Results are presented which demonstrate the performance of the method on interpolating surface velocities of a rectangular plate. With only four randomly selected point velocity measurements out of 209 possible locations, the method estimates the structural surface velocity with a normalized error of only -45 dB. The ability to achieve this performance with a small number of sensors makes this method important for many active noise control applications where an accurate measure of structural surface velocity is required to predict the radiated acoustic field.

  • A fast converging, low complexity adaptive filtering algorithm

    Page(s): 4 - 7

    This paper introduces a new adaptive filtering algorithm called fast affine projections (FAP). Its main attributes include RLS-like (recursive least squares) convergence and tracking with NLMS-like (normalized least mean squares) complexity. This mix of complexity and performance is similar to that of the recently introduced fast Newton transversal filter (FNTF) algorithm. While FAP shares some properties with FNTF, it is derived from a different perspective, namely the generalization of the affine projection interpretation of NLMS. FAP relies on a sliding windowed fast RLS (FRLS) algorithm to generate forward and backward prediction vectors and expected prediction error energies. Since sliding windowed FRLS algorithms easily incorporate regularization of the implicit inverse of the covariance matrix, FAP is regularized as well.

  • Adaptive predictive coding with transform domain quantization using block size adaptation and high-resolution spectral modeling

    Page(s): 31 - 34

    The adaptive predictive coding with transform domain quantization (APC-TQ) technique was proposed by Bhaskar (1991) for the compression of audio signals. Since then, significant developments have taken place, leading to a reduction in the coding rate while enhancing the audio quality. These developments include (i) the use of block size adaptation to exploit the variations in the stationarity of the signal, (ii) high resolution spectral modeling using LPC analysis orders up to 64, and (iii) an adaptive bit-allocation procedure to minimize coding noise power as well as minimize the perception of coding noise. The result is a near transparent quality compression of 5 kHz bandwidth audio at a rate of 17 kbit/s. This technology will find applications in the distribution and transmission of AM quality audio programming over low rate channels such as the INMARSAT Standard A, B and aeronautical systems.

  • A comparison of gradient-based algorithms for echo compensation with decorrelating properties

    Page(s): 12 - 15

    Cancelling echoes by using the normalized least mean square (NLMS) algorithm has been state of the art for many years. In acoustical echo compensation, however, it is common to estimate more than 1000 parameters, resulting in convergence that is too slow when driven by speech signals. To overcome this drawback, many modifications have been published in recent years, all having one goal: to decorrelate the driving process. Beginning with a deterministic approach, we show that all these different ideas can be arranged in one scheme, allowing a uniform normalization. The different properties of the several algorithms are then obvious. A comparison of some algorithms with 2N-4N complexity is presented. Surprisingly, none of the algorithms works perfectly for a large compensator filter length and speech as input process.
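
    For orientation, the baseline NLMS update that these modifications start from can be sketched as follows. This is the textbook algorithm, not any of the decorrelating variants compared in the paper. The four-tap echo path, the step size, and the white-noise excitation are assumptions made for illustration; white noise is the easy case, whereas the abstract's concern is slow convergence with correlated speech and more than 1000 taps.

```python
import random


def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Textbook NLMS: adapt FIR weights w so that w * x tracks d.

    x is the far-end (loudspeaker) signal and d the microphone signal.
    Returns the final weights and the error (echo-compensated) signal.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps           # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))    # echo estimate
        e = dn - y                                    # residual echo
        norm = sum(b * b for b in buf) + eps          # input power (regularized)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]    # hypothetical room impulse response
x = [random.gauss(0.0, 1.0) for _ in range(3000)]
# Microphone signal: far-end signal filtered by the echo path, no noise.
d = [sum(echo_path[k] * x[n - k] for k in range(len(echo_path)) if n - k >= 0)
     for n in range(len(x))]

w, errors = nlms(x, d, num_taps=len(echo_path))
print("estimated echo path:", [round(wi, 4) for wi in w])
```

    Dividing the step by the input power is what makes the adaptation insensitive to signal level; it does not remove the dependence on input correlation, which is the issue the abstract addresses.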

  • Objective measures based on neural networks for hearing loss compensation techniques

    Page(s): 93 - 96

    An objective measures system has been developed to predict the results of subject-based tests for sensorineural hearing loss compensation techniques. Parameters related to the loudness level of the compensated speech signal are extracted from its frequency spectrum. These parameters are then used to train a neural-network-based phoneme classifier. Good prediction results have been achieved for two hearing-impaired subjects.

  • Developments in transaural stereo

    Page(s): 114 - 117

    Transaural stereo achieves precision 3-D imaging by compensating for spectral distortions in the loudspeaker-to-ear signal paths. The heart of transaural stereo, signal processing for crosstalk cancellation, is herein generalized to accommodate any number of loudspeakers and listeners in any layout. Transaural equations are written and then solved using standard algebraic methods. Worked-out examples are shown and several applications are proposed.

  • Computationally efficient compression of audio signals by means of RIQ-DPCM

    Page(s): 35 - 38

    The need to transmit large amounts of data over limited bandwidth channels has resulted in many methods for digital data compression. The common approach is to identify and remove redundancy from the input data stream using knowledge of the source characteristics. In the case of signals intended for human observers (speech, music, pictures, etc.) it is also useful to consider the strengths and weaknesses of the human sensory systems in order to achieve a greater degree of data compression. Unfortunately, achieving perceptually transparent compression requires considerable computational resources. For situations requiring extremely low computational complexity without strictly transparent coding, such as multimedia applications on personal computer platforms, a new adaptive differential pulse code modulation (DPCM) data compression scheme is proposed. Although standard DPCM structures are widely used in single-talker speech coding systems, the models and statistical assumptions well-known for speech signals are not applicable to arbitrary audio signals such as music. The new DPCM formulation presented includes a recursively indexed quantizer (RIQ) to eliminate the problem of overload distortion, a simple predictor structure to take advantage of the short-term correlation present in wideband audio signals, and an adaptation strategy to optimize the system to the local statistics of the input signal. Thus, the new RIQ-DPCM formulation is presented as a computationally efficient means of wideband audio compression.
