Exemplar-based techniques for pitch generation in text-to-speech systems have shown a high degree of success, with results comparable to those of other techniques. These techniques, however, require that all units occur in the corpus. One limitation of this requirement is that the corpus data prosodically correlated with the input does not always contain suitable units, and sometimes no units can be found at all. These non-existent units can be seen as missing parts of the pitch signal. The work presented in this paper overcomes the missing-units problem by using sparse representations to recover the missing pitch data. The proposed framework works in two stages: the first uses a unit selection approach to generate the initial pitch contour, and the second adopts a sparse representation to generate the pitch contour for the missing units identified in the first stage. The approach achieved results comparable to those of other pitch generation methods.
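As an illustration of the second stage only, the following is a minimal sketch of filling gaps in a pitch contour from a sparse representation. The abstract does not specify the dictionary or the solver, so everything here is an assumption: the dictionary is a set of low-frequency DCT cosine atoms, the sparse code is found with a simple orthogonal matching pursuit, and the names `dct_dictionary`, `omp`, and `fill_missing_pitch` are hypothetical. The first stage is assumed to have already produced a contour together with a boolean mask marking which samples were found in the corpus.

```python
import numpy as np

def dct_dictionary(n, n_atoms):
    """n x n_atoms dictionary of DCT-II cosine atoms (assumed, not from the paper)."""
    t = (np.arange(n) + 0.5) / n
    atoms = np.cos(np.pi * np.outer(t, np.arange(n_atoms)))
    return atoms / np.linalg.norm(atoms, axis=0)

def omp(A, y, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse x such that A @ x approximates y."""
    norms = np.maximum(np.linalg.norm(A, axis=0), 1e-12)
    residual = y.copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_nonzero):
        corr = np.abs(A.T @ residual) / norms
        if support:
            corr[support] = 0.0          # never reselect an atom
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly on the observed samples.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < 1e-8:
            break
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

def fill_missing_pitch(contour, known, n_atoms=20, n_nonzero=5):
    """Fill samples where known is False; known samples are left untouched."""
    contour = np.array(contour, dtype=float)
    D = dct_dictionary(len(contour), n_atoms)
    # Fit the sparse code using only the rows that were actually observed.
    x = omp(D[known], contour[known], n_nonzero)
    filled = contour.copy()
    filled[~known] = (D @ x)[~known]
    return filled

# Usage: a smooth synthetic contour (in Hz) with a 20-sample gap.
n = 100
t = (np.arange(n) + 0.5) / n
true_contour = 120 + 20 * np.cos(2 * np.pi * t)
known = np.ones(n, dtype=bool)
known[40:60] = False                      # units not found in the corpus
damaged = np.where(known, true_contour, 0.0)
filled = fill_missing_pitch(damaged, known)
```

The sparse code is estimated from the observed samples alone and then used to synthesise the missing ones, which is the general idea behind sparse recovery of missing data. A learned dictionary or a different solver could be substituted without changing the structure of the sketch.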
Date of Conference: 2-5 July 2012