An Audio Frequency Unfolding Framework for Ultra-Low Sampling Rate Sensors

Abstract:

Recent audio super-resolution works have achieved significant success in improving audio quality by raising a sensor's effective sampling rate, e.g., from 8 kHz to 48 kHz. However, these works fail to maintain their performance when the sampling rate at the sensor is ultra-low and the audio suffers from serious frequency aliasing. In this paper, we propose an audio frequency unfolding framework that efficiently reconstructs aliased audio so that it is perceptually recognizable. The intuition is that audio produced by humans follows a regular pattern in its spectrum; by learning this pattern, our framework can reconstruct audio that sounds similar to real human voices. We evaluate our framework perceptually: an automatic speech recognition (ASR) system is used to judge whether the words in the reconstructed audio can be correctly recognized. In an implementation based on AudioMNIST, when reconstructing 16 kHz audio from audio sampled at 2 kHz, the recognition accuracy of the reconstructed audio reaches 77.1%.

Published in: 2022 23rd International Symposium on Quality Electronic Design (ISQED)

Date of Conference: 06-07 April 2022

Date Added to IEEE Xplore: 29 June 2022

DOI: 10.1109/ISQED54688.2022.9806149

Conference Location: Santa Clara, CA, USA

I. Introduction

Human voice audio has become an essential data source in many real-world applications, such as speech recognition [7], [2], user identification [5] and human localization [15], [19]. Acquiring this audio typically requires a microphone with a high sampling rate. In general, audio sampled above 8 kHz is considered speech-recognizable, and audio sampled at 48 kHz is considered good quality [10]. Such a high sampling rate usually incurs high power consumption, which limits the deployment of microphones on low-power devices. Conversely, a low-power microphone implies a low sampling rate, and by the Nyquist sampling theorem any signal content above half the sampling rate is aliased. Beyond the power consumption of microphones, recent works [12], [18] focus on extracting audio from inertial measurement units (IMUs). Compared with microphone audio, audio extracted from an IMU mainly captures sound conducted through solid media and is less affected by distant noise sources. However, the sampling rate of an IMU is much lower than that of a microphone [18], so IMU audio also suffers from frequency aliasing. Given the benefits of low-power microphones and IMUs over traditional microphones, is it possible to address the frequency aliasing problem, i.e., to reconstruct high-sampling-rate audio from their low-sampling-rate output?
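To make the aliasing problem concrete, the following minimal Python sketch shows how a pure tone folds into the band below half the sampling rate. It illustrates the Nyquist sampling theorem only and is not part of the paper's framework; the function names `alias_frequency` and `sample_tone` and the test frequencies are hypothetical choices made for this example, with the 2 kHz sensor rate taken from the abstract.

```python
import math


def alias_frequency(f_hz, fs_hz):
    """Apparent frequency of a real tone at f_hz after sampling at fs_hz.

    Hypothetical helper for illustration: frequencies repeat every fs_hz
    and mirror around fs_hz / 2, so the result lies in [0, fs_hz / 2].
    """
    r = f_hz % fs_hz
    return min(r, fs_hz - r)


def sample_tone(f_hz, fs_hz, n_samples):
    """Samples of cos(2*pi*f*t) taken at t = n / fs_hz."""
    return [math.cos(2 * math.pi * f_hz * n / fs_hz) for n in range(n_samples)]


fs = 2000  # ultra-low sensor sampling rate in Hz (2 kHz, as in the abstract)

# Where several tones land after sampling at 2 kHz.
for f in (300, 1500, 2500, 7000):
    print(f, "Hz is observed at", alias_frequency(f, fs), "Hz")

# A 1500 Hz tone and a 500 Hz tone produce the same samples at 2 kHz,
# so the sensor output alone cannot tell them apart.
high = sample_tone(1500, fs, 16)
low = sample_tone(500, fs, 16)
max_diff = max(abs(a - b) for a, b in zip(high, low))
print("largest sample difference:", max_diff)
```

Because distinct input frequencies collapse onto the same samples, recovering the original spectrum requires prior knowledge about the signal, which is the role the learned regular pattern of human voices plays in the proposed framework.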
