New design procedures for time-frequency alignment in automatic speech morphing are proposed. The frequency alignment function at a given frame is represented as a weighted average of vowel alignment functions, with weights based on the frame's similarity to each vowel. Julian, an open-source speech recognition system, was used to design the time alignment function. Objective and subjective tests were conducted to evaluate the proposed method. The results indicate that, in terms of time alignment, the proposed method yields naturalness comparable to that of manually morphed samples. They also show that the proposed frequency alignment yields significantly better naturalness than morphing without frequency alignment.
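The weighted-average idea for the frequency alignment function can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the vowel alignment functions are represented or how similarity is measured. The sketch assumes each vowel alignment function is sampled on a shared frequency grid (source frequency in Hz mapped to target frequency in Hz), that the per-frame similarity scores are already available as non-negative numbers, and that the weights are simply those scores normalized to sum to one. The function name `blend_frequency_alignment` and all numeric values are hypothetical.

```python
from typing import Dict, List


def blend_frequency_alignment(
    vowel_alignments: Dict[str, List[float]],
    similarities: Dict[str, float],
) -> List[float]:
    """Similarity-weighted average of per-vowel frequency alignment functions.

    vowel_alignments: vowel label -> target frequency (Hz) at each grid point.
    similarities:     vowel label -> non-negative similarity of the current
                      frame to that vowel (assumed to be given).
    """
    total = sum(similarities[v] for v in vowel_alignments)
    if total <= 0:
        raise ValueError("at least one similarity must be positive")

    n_bins = len(next(iter(vowel_alignments.values())))
    blended = [0.0] * n_bins
    for vowel, warp in vowel_alignments.items():
        if len(warp) != n_bins:
            raise ValueError("all alignment functions must share one grid")
        # Assumption: weights are the similarities normalized to sum to 1.
        weight = similarities[vowel] / total
        for k, target_hz in enumerate(warp):
            blended[k] += weight * target_hz
    return blended


# Hypothetical alignment functions sampled at 0, 1000, 2000, 3000, 4000 Hz.
alignments = {
    "a": [0.0, 1100.0, 2200.0, 3100.0, 4000.0],
    "i": [0.0, 900.0, 1800.0, 2900.0, 4000.0],
}
# Hypothetical frame that resembles /a/ three times as strongly as /i/.
frame_similarity = {"a": 3.0, "i": 1.0}

print(blend_frequency_alignment(alignments, frame_similarity))
# prints [0.0, 1050.0, 2100.0, 3050.0, 4000.0]
```

With these assumed values the weights are 0.75 for /a/ and 0.25 for /i/, so the blended function lies closer to the /a/ alignment at every grid point while the endpoints, shared by both vowels, are unchanged.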