Suppression by Selecting Wavelets for Feature Compression in Distributed Speech Recognition

Abstract:

Distributed speech recognition (DSR) splits the processing of data between a mobile device and a network server. In the front-end, features are extracted and compressed to transmit over a wireless channel to a back-end server, where the incoming stream is received and reconstructed for recognition tasks. In this paper, we propose a feature compression algorithm termed suppression by selecting wavelets (SSW) to achieve the two main goals of DSR: Minimizing memory and device requirements while also maintaining or even improving the recognition performance. The SSW approach first applies the discrete wavelet transform (DWT) to filter the incoming speech feature sequence into two temporal subsequences at the client terminal. Feature compression is achieved by keeping the low (modulation) frequency subsequence while discarding the high frequency counterpart. The low-frequency subsequence is then transmitted across the remote network for specific feature statistics normalization. Wavelets are favorable for resolving the temporal properties of the feature sequence, and the down-sampling process in DWT achieves data compression by reducing the amount of data at the terminal prior to transmission across the network. Once the compressed features have arrived at the server, the feature sequence can be enhanced by statistics normalization, reconstructed with inverse DWT, and compensated with a simple post filter to alleviate any over-smoothing effects from the compression stage. Results on a standard robustness task (Aurora-4) and on a Mandarin Chinese news corpus showed SSW outperforms conventional noise-robustness techniques while also providing nearly a 50% compression rate during the transmission stage of DSR systems.
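The client/server split described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a single-level Haar wavelet, per-dimension mean and variance normalization as the "statistics normalization", and a feature matrix of shape (frames, dimensions). The function names are hypothetical. The post filter is omitted because the abstract does not specify its form.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_analysis(features):
    """Single-level Haar DWT along time (axis 0).

    Returns the low- and high-(modulation)-frequency subsequences,
    each with half as many frames as the input.
    """
    x = np.asarray(features, dtype=float)
    if x.shape[0] % 2:  # odd length: repeat the last frame so frames pair up
        x = np.concatenate([x, x[-1:]], axis=0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / SQRT2
    high = (even - odd) / SQRT2
    return low, high

def haar_synthesis(low, high):
    """Inverse of haar_analysis: interleave the reconstructed frames."""
    out = np.empty((2 * low.shape[0],) + low.shape[1:], dtype=float)
    out[0::2] = (low + high) / SQRT2
    out[1::2] = (low - high) / SQRT2
    return out

def client_compress(features):
    """Client side: keep the low-frequency subsequence, discard the rest."""
    low, _ = haar_analysis(features)
    return low

def server_reconstruct(low, eps=1e-8):
    """Server side: normalize the received subsequence, then invert the DWT.

    The discarded high-frequency band is replaced by zeros (an assumption).
    """
    normed = (low - low.mean(axis=0)) / (low.std(axis=0) + eps)
    return haar_synthesis(normed, np.zeros_like(normed))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((100, 13))  # stand-in for 100 frames of 13-dim features
    sent = client_compress(feats)           # half the frames are transmitted
    restored = server_reconstruct(sent)     # back at the original frame rate
    print(sent.shape, restored.shape)
```

In this sketch the down-sampling in the analysis step halves the number of transmitted frames, which is consistent with the roughly 50% compression rate the abstract reports. Zeroing the high-frequency band also makes each reconstructed pair of frames identical, the kind of over-smoothing the paper's post filter is meant to alleviate.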

Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing ( Volume: 26, Issue: 3, March 2018)

Page(s): 564 - 579

Date of Publication: 04 December 2017

DOI: 10.1109/TASLP.2017.2779787
