Applications of positive time-frequency distributions to speech processing

3 Author(s)
Pitton, J.W. (AT&T Bell Labs., Murray Hill, NJ, USA); Atlas, L.E.; Loughlin, P.J.

Much of our current knowledge and intuition of speech is derived from analyses involving assumptions of short-time stationarity (e.g., the speech spectrogram). Such methods are, by their very nature, incapable of revealing the true nonstationary nature of speech. A careful consideration of the theory of time-frequency distributions (TFDs), however, allows the construction of methods that reveal far more of the nonstationarities of speech, thereby highlighting just what it is that conventional approaches miss. We apply two iterative methods for generating positive TFDs to speech analysis. Both methods make use of multiple sources of information (e.g., multiple spectrograms) to yield a high-resolution estimate of the joint time-frequency energy density of speech. Plosive events and formant harmonic structure are simultaneously preserved in these TFDs. Rapidly time-varying formants are also resolved by these TFDs, and harmonic structure is revealed, independent of sweep rate; this result is quite different from that seen with conventional speech spectrograms. The speech features observed in these distributions demonstrate that conventional sliding window techniques lose or distort much of the rich nonstationary structure of speech. Examples for synthetic formants and real speech are provided. The differences between joint distributions and conditional distributions are also illustrated.
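
The limitation of a single sliding window that the abstract describes can be illustrated with a small sketch. It does not implement either of the paper's iterative methods; it only compares two ordinary spectrograms of a toy signal, a steady tone standing in for harmonic structure plus one impulse standing in for a plosive. The sampling rate, window lengths, signal, and helper names (`spectrogram`, `tone_bandwidth_bins`, `click_duration_frames`) are all assumptions chosen for illustration.

```python
import numpy as np

FS = 8000      # sampling rate in Hz (assumed for this toy example)
N_FFT = 1024   # common FFT size so both spectrograms share a frequency grid
HOP = 16       # hop between frames, in samples


def spectrogram(x, win_len, hop=HOP, n_fft=N_FFT):
    """Squared-magnitude STFT with a Hann window, shape (frames, bins)."""
    window = np.hanning(win_len)
    starts = range(0, len(x) - win_len + 1, hop)
    frames = np.array([x[s:s + win_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2


def tone_bandwidth_bins(spec):
    """Bins within half the peak power in the first frame (no click there)."""
    frame = spec[0]
    return int(np.sum(frame >= 0.5 * frame.max()))


def click_duration_frames(spec):
    """Frames whose energy above 3 kHz is within half of its peak value."""
    first_bin = int(3000 * N_FFT / FS)
    band = spec[:, first_bin:].sum(axis=1)
    return int(np.sum(band >= 0.5 * band.max()))


# Toy signal: a steady 1 kHz tone plus a single impulse in the middle.
n = np.arange(4096)
x = np.sin(2 * np.pi * 1000 * n / FS)
x[2048] += 50.0

spec_short = spectrogram(x, win_len=32)   # 4 ms window
spec_long = spectrogram(x, win_len=512)   # 64 ms window

tone_bw_short = tone_bandwidth_bins(spec_short)
tone_bw_long = tone_bandwidth_bins(spec_long)
click_frames_short = click_duration_frames(spec_short)
click_frames_long = click_duration_frames(spec_long)

print("tone width (bins):   short =", tone_bw_short, " long =", tone_bw_long)
print("click length (frames): short =", click_frames_short,
      " long =", click_frames_long)
```

In this toy case the short window confines the impulse to fewer frames but spreads the tone over more frequency bins, while the long window does the opposite. Neither spectrogram alone localizes both events, which is the motivation the abstract gives for combining multiple spectrograms.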

Published in:

IEEE Transactions on Speech and Audio Processing (Volume 2, Issue 4)