This paper presents a phoneme recognition method based on distinctive phonetic features (DPFs). The method comprises three stages. The first stage uses three multilayer neural networks (MLNs) to extract three DPF vectors of 15 dimensions each from local features (LFs) of the input speech signal. The second stage incorporates an Inhibition/Enhancement (In/En) network to obtain more categorical DPF movement and decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure. The third stage embeds acoustic models (AMs) and language models (LMs) of syllable-based subwords to output more precise phoneme strings. The proposed method achieves a higher phoneme correct rate and higher phoneme accuracy with fewer mixture components in the hidden Markov models (HMMs).
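As an illustration of the decorrelation step, the following is a minimal sketch of Gram-Schmidt orthogonalization in plain Python. It is not the authors' implementation: the function name `gram_schmidt`, the tolerance `eps`, and the toy three-dimensional input are assumptions made for the example, and the abstract does not state exactly how the procedure is applied to the DPF vectors.

```python
import math

def gram_schmidt(vectors, eps=1e-12):
    """Orthonormalize a list of vectors (modified Gram-Schmidt).

    Hypothetical helper for illustration only. Vectors that are
    (numerically) linear combinations of earlier ones are dropped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        # Subtract the component of w along each basis vector found so far.
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis

# Toy stand-in for correlated feature vectors (not real DPF data).
toy = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ortho = gram_schmidt(toy)
```

After the call, the vectors in `ortho` are mutually orthogonal and have unit length, which is the sense in which the procedure removes correlation between the input directions.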