
Time–Frequency Matrix Feature Extraction and Classification of Environmental Audio Signals

2 Author(s)
Ghoraani, B. ; Dept. of Electr. & Comput. Eng., Ryerson Univ., Toronto, ON, Canada ; Krishnan, S.

Audio feature extraction and classification are important tools for audio signal analysis in many applications, such as multimedia indexing and retrieval, and auditory scene analysis. However, due to the nonstationarities and discontinuities that exist in these signals, their quantification and classification remain a formidable challenge. In this paper, we develop a new approach for audio feature extraction to effectively quantify these nonstationarities in an attempt to achieve high classification accuracy for environmental audio signals. Our approach consists of three stages. First, we construct the time-frequency matrix (TFM) of audio signals using the matching-pursuit time-frequency distribution (MP-TFD) technique. Second, we apply the non-negative matrix factorization (NMF) technique to decompose the TFM into its significant components. Finally, we propose seven novel features derived from the spectral and temporal structures of the decomposed vectors, chosen so that they represent the joint TF structure of the audio signal, and combine them with the Mel-frequency cepstral coefficient (MFCC) features. These features are examined using a database of 192 environmental audio signals, which includes 20 aircraft, 17 helicopter, 20 drum, 15 flute, 20 piano, 20 animal, 20 bird, and 20 insect sounds, and the speech of 20 males and 20 females. The results of the numerical simulation support the effectiveness of the proposed approach for environmental audio classification, with over 10% accuracy-rate improvement compared to the MFCC features.
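
To make the decomposition stage concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain STFT power spectrogram as a stand-in for the MP-TFD (which the abstract names but does not specify), and it uses the standard Lee-Seung multiplicative updates for the factorization. The function names `spectrogram` and `nmf`, the frame sizes, the rank, and the toy two-tone signal are all illustrative assumptions.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Power spectrogram via a Hann-windowed STFT.

    Assumption: this is only a stand-in for the MP-TFD named in the abstract.
    Returns a non-negative matrix with rows = frequency bins, columns = frames.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

def nmf(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factor V ~= W @ H with W, H >= 0 using multiplicative updates
    that reduce the Frobenius reconstruction error."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral basis vectors (columns)
    H = rng.random((rank, n_time)) + eps   # temporal activations (rows)
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy nonstationary signal (hypothetical): 440 Hz tone, then a 1200 Hz tone.
fs = 8000
t = np.arange(fs) / fs
signal = np.where(t < 0.5,
                  np.sin(2 * np.pi * 440 * t),
                  np.sin(2 * np.pi * 1200 * t))

V = spectrogram(signal)          # time-frequency matrix (stand-in for the TFM)
W, H = nmf(V, rank=2)            # decomposed spectral and temporal vectors
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(V.shape, W.shape, H.shape)
```

In this sketch the columns of `W` carry spectral structure and the rows of `H` carry temporal structure, which is the kind of decomposed vector the abstract says the features are drawn from. The seven features themselves are not described in the abstract, so they are not reproduced here.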

Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume: 19, Issue: 7)

Date of Publication: Sept. 2011
