This article introduces automatic speech recognition that uses no audio information. Movements of the tongue, lips, and jaw are tracked by an Electro-Magnetic Articulography (EMA) device and used as features to build hidden Markov models (HMMs), with which automatic speech recognition is then carried out in the conventional way. The results are promising and confirm that phonetic features characterizing articulation are as discriminating as those characterizing acoustics (except for voicing). The results also show that tongue parameters yield higher accuracy than lip parameters.
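As a rough illustration of the recognition step, the sketch below scores a short sequence of articulatory feature frames against two small Gaussian HMMs using the forward algorithm and picks the more likely model. It is a minimal sketch under stated assumptions, not the article's implementation: the two-dimensional features (imagined here as tongue-tip x and y positions), the unit labels "ta" and "ka", and every parameter value are hypothetical, and the models are written by hand rather than trained on EMA data.

```python
import numpy as np

def log_gauss_diag(x, means, variances):
    """Log density of one frame x (D,) under each state's diagonal Gaussian (S, D)."""
    diff = x - means
    return -0.5 * np.sum(np.log(2.0 * np.pi * variances) + diff**2 / variances, axis=1)

def forward_loglik(frames, log_pi, log_A, means, variances):
    """Log-likelihood of a frame sequence under one HMM (log-domain forward algorithm)."""
    alpha = log_pi + log_gauss_diag(frames[0], means, variances)
    for x in frames[1:]:
        # Sum over previous states (in the log domain) for each next state,
        # then add the emission log density of the current frame.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
        alpha = alpha + log_gauss_diag(x, means, variances)
    return np.logaddexp.reduce(alpha)

def classify(frames, models):
    """Return the best-scoring model name and all scores."""
    scores = {name: forward_loglik(frames, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical two-state models; all numbers are made up for illustration.
log_pi = np.log(np.array([0.9, 0.1]))
log_A = np.log(np.array([[0.9, 0.1],
                         [0.05, 0.95]]))
variances = np.full((2, 2), 0.1)
models = {
    "ta": (log_pi, log_A, np.array([[0.0, 0.0], [1.0, 1.0]]), variances),
    "ka": (log_pi, log_A, np.array([[0.0, 0.0], [-1.0, 1.0]]), variances),
}

# Hypothetical articulatory trajectory: four frames of (x, y) positions.
test_frames = np.array([[0.1, 0.0], [0.0, 0.1], [0.9, 1.0], [1.1, 0.9]])
label, scores = classify(test_frames, models)
print(label, scores)
```

In a real system the model parameters would be estimated from recorded EMA trajectories, and the same scoring could be repeated with only tongue coordinates or only lip coordinates as features to compare the two.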