Pitch Estimation by Multiple Octave Decoders


Abstract:

Pitch estimation is an essential task in audio processing because of its key role in many speech and music applications. Still, accurately predicting a continuous value over a wide range of pitch frequencies is challenging. Inspired by the success of signal processing filterbank methods, we propose a novel deep architecture for accurate pitch estimation. The proposed method is composed of an encoder and multiple decoders. The encoder is a convolutional neural network that provides a representation of the raw audio signal, and its output is fed into a set of decoders. Each decoder is a fully connected neural network that predicts the pitch value within a specific frequency band. This construction allows each decoder to specialize in a particular frequency regime, which yields more accurate estimation of pitch values for music and speech signals.
Published in: IEEE Signal Processing Letters ( Volume: 28)
Page(s): 1610 - 1614
Date of Publication: 29 July 2021
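
The encoder and per-band decoder construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the band edges (F_MIN, N_OCTAVES), the layer sizes, the single convolutional layer, the log-scale mapping of each decoder output into its band, and the random untrained weights are all assumptions made for illustration. The abstract does not say how the decoder outputs are combined into one pitch value, so the sketch simply returns one estimate per band.

```python
import math
import random

F_MIN = 32.7     # assumed lowest band edge in Hz (hypothetical, not from the paper)
N_OCTAVES = 6    # assumed number of octave bands, one decoder each (hypothetical)


def octave_bands(f_min=F_MIN, n=N_OCTAVES):
    """Contiguous bands, each spanning one octave from f to 2f."""
    return [(f_min * 2 ** k, f_min * 2 ** (k + 1)) for k in range(n)]


def conv1d_relu(x, kernel, stride):
    """Valid 1-D convolution followed by ReLU."""
    out = []
    for start in range(0, len(x) - len(kernel) + 1, stride):
        s = sum(w * x[start + i] for i, w in enumerate(kernel))
        out.append(max(0.0, s))
    return out


def encode(frame, kernels, stride=4):
    """Toy encoder: one conv layer, then mean-pool each channel to a scalar."""
    feats = []
    for kernel in kernels:
        h = conv1d_relu(frame, kernel, stride)
        feats.append(sum(h) / len(h))
    return feats


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def decode_band(feats, params, band):
    """Fully connected decoder whose output is confined to its own band.

    The sigmoid output t in (0, 1) is mapped log-linearly onto [lo, hi];
    this mapping is an assumption of the sketch.
    """
    w1, b1, w2, b2 = params
    hidden = [max(0.0, v) for v in dense(feats, w1, b1)]
    t = 1.0 / (1.0 + math.exp(-dense(hidden, w2, b2)[0]))
    lo, hi = band
    return lo * (hi / lo) ** t


def init_model(rng, n_channels=8, kernel_len=16, hidden=16, n_bands=N_OCTAVES):
    """Random, untrained weights: one shared encoder, one decoder per band."""
    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    kernels = rand_matrix(n_channels, kernel_len)
    decoders = [
        (rand_matrix(hidden, n_channels), [0.0] * hidden,
         rand_matrix(1, hidden), [0.0])
        for _ in range(n_bands)
    ]
    return kernels, decoders


def estimate(frame, model, bands):
    """Run the shared encoder once, then every band-specific decoder."""
    kernels, decoders = model
    feats = encode(frame, kernels)
    return [decode_band(feats, p, b) for p, b in zip(decoders, bands)]


rng = random.Random(0)
bands = octave_bands()
model = init_model(rng)
# 1024-sample frame of a 220 Hz sine at an assumed 16 kHz sampling rate
frame = [math.sin(2 * math.pi * 220.0 * n / 16000.0) for n in range(1024)]
estimates = estimate(frame, model, bands)
for (lo, hi), f in zip(bands, estimates):
    print(f"{lo:7.1f}-{hi:7.1f} Hz -> {f:7.1f} Hz")
```

Because the weights are untrained, the printed values carry no meaning beyond showing that each decoder's estimate stays inside its own band. Restricting the output range in this way is one plausible reading of how a decoder could specialize in a particular frequency regime.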

I. Introduction

Pitch is the perceived frequency at which the vocal cords vibrate in voiced sounds. Pitch plays a key role in human voice perception, speaker identity, prosodic information, emotion, and speech itself. It is used in speech applications such as speech enhancement [1], speech coding [2], and speech recognition [3], [4], and in many music applications such as bass tracking [5], automatic transcription [6], and melody extraction [7], [8]. Moreover, accurate pitch estimates are desired in analyses of speech for various psychological and pathological conditions [9], [10].
