I. Introduction
With advances in speech recognition technology, Computer-Assisted Language Learning (CALL) [1]–[2] has become an active research area, and objective automatic evaluation of pronunciation quality is the core technology of a CALL system [3]. This technology can change the existing language learning environment and teaching mode and greatly improve the efficiency of language learning. In pronunciation learning, timely, accurate, and objective evaluation and feedback help learners identify the gap between their own pronunciation and the standard pronunciation and correct their pronunciation errors.