A Robust Method to Extract Talker Azimuth Orientation Using a Large-Aperture Microphone Array


Author(s):
Levi, A. (Lab. for Eng. Man/Machine Syst. (LEMS), Brown Univ., Providence, RI, USA); Silverman, H.

Knowing the orientation of a talker in the focal area of a large-aperture microphone array enables the development of better beamforming algorithms (to obtain higher-quality speech output), improves source-location/tracking algorithms, and allows better selection and control of cameras in a video-conference setting. Measurements in an anechoic room (e.g., Chu and Warnock, 2002) have quantified the average frequency-dependent magnitude (source radiation pattern) of the human speech source, showing a front-to-back difference in magnitude that increases with frequency by about 8 dB/decade, reaching about 18 dB at 8000 Hz. These amplitude differences, while severely masked by both coherent and noncoherent noise in a real environment, are the most readily extractable cue to a talker's orientation when compared with other phenomena such as phase differences due to the source or effects due to diffraction at the mouth. In this paper, we propose a robust, source-radiation-pattern-based method for extracting the azimuth angle of a single talker for whom an accurate point-source location estimate is known. The method requires no a priori training and has been tested in more than 100 situations with real human talkers at various locations and orientations in a room equipped with a large-aperture microphone array. We compare these results against earlier published algorithms and find that the method proposed herein is the most robust and is sufficient to be considered for a real-time system.
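
As a rough illustration of the figures quoted in the abstract, the sketch below extrapolates the front-to-back magnitude difference to other frequencies. It assumes the difference is linear in log-frequency, with the stated slope of about 8 dB/decade anchored at about 18 dB at 8000 Hz. That straight-line model and the function name `front_to_back_diff_db` are assumptions made here for illustration; they are not taken from the paper, and the measured radiation pattern need not follow this line closely.

```python
import math

# Figures quoted in the abstract (approximate values).
REF_FREQ_HZ = 8000.0        # frequency at which the difference is quoted
REF_DIFF_DB = 18.0          # front-to-back difference at REF_FREQ_HZ
SLOPE_DB_PER_DECADE = 8.0   # growth of the difference per decade of frequency


def front_to_back_diff_db(freq_hz: float) -> float:
    """Estimate the front-to-back magnitude difference in dB at freq_hz.

    Assumption (not from the paper): the difference is a straight line
    in log10(frequency) passing through the quoted reference point.
    """
    if freq_hz <= 0:
        raise ValueError("freq_hz must be positive")
    decades_from_ref = math.log10(freq_hz / REF_FREQ_HZ)
    return REF_DIFF_DB + SLOPE_DB_PER_DECADE * decades_from_ref


if __name__ == "__main__":
    for f in (800.0, 2000.0, 8000.0):
        print(f"{f:7.0f} Hz: {front_to_back_diff_db(f):5.1f} dB")
```

Under this assumed model, one decade below the reference (800 Hz) gives about 10 dB, which suggests why the higher speech frequencies would carry most of the orientation cue.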

Published in:
IEEE Transactions on Audio, Speech, and Language Processing (Volume: 18, Issue: 2)

Date of Publication: Feb. 2010
