In a speech recognition system based on dynamic time warping (DTW), the DTW operation time-aligns two words while calculating the distance between them. The operation must be repeated once for every reference template in the system, so processing becomes time-consuming. We propose generating a single pattern to be used as a time-aligning pattern (TAP) for all the words in the system, both references and unknowns. The unknown word is first time-aligned with the TAP only. The distance between the time-aligned version of the unknown word and each reference template (all of which have already been time-aligned with the TAP in the training mode) is then calculated directly, with no further DTW operations. This greatly reduces the processing time. Two methods for generating TAPs are proposed, and the approach is extended to the use of more than one TAP. In both cases, the saving in processing time compared with conventional DTW systems is in the range of 95% for comparable recognition accuracy.
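The single-TAP recognition flow described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the TAP is taken as given (the abstract does not describe the two generation methods), frames are plain lists of floats compared with Euclidean distance, and a word is "time-aligned" by averaging the frames that the DTW path maps to each TAP frame. All function names are hypothetical.

```python
import math

def frame_dist(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_path(seq, tap):
    """Plain DTW between seq and tap; returns the warping path as (i, j) pairs."""
    n, m = len(seq), len(tap)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(seq[i - 1], tap[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to their first frames.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return path

def align_to_tap(seq, tap):
    """Warp seq onto the TAP time axis (assumption: average the frames per TAP frame)."""
    dim = len(tap[0])
    sums = [[0.0] * dim for _ in tap]
    counts = [0] * len(tap)
    for i, j in dtw_path(seq, tap):
        counts[j] += 1
        for k in range(dim):
            sums[j][k] += seq[i][k]
    return [[s / c for s in row] for row, c in zip(sums, counts)]

def direct_distance(a, b):
    """Frame-by-frame distance between two sequences already aligned to the TAP."""
    return sum(frame_dist(x, y) for x, y in zip(a, b))

def train(references, tap):
    """Training mode: align every reference template to the TAP once."""
    return {label: align_to_tap(seq, tap) for label, seq in references.items()}

def recognize(unknown, tap, aligned_refs):
    """Recognition: one DTW against the TAP, then direct distances only."""
    aligned = align_to_tap(unknown, tap)
    return min(aligned_refs, key=lambda label: direct_distance(aligned, aligned_refs[label]))

# Toy one-dimensional "words" (made-up data for illustration only).
tap = [[0.0], [1.0], [2.0], [3.0]]
references = {
    "up": [[0.0], [0.0], [1.0], [2.0], [3.0]],
    "down": [[3.0], [2.0], [1.0], [0.0]],
}
aligned_refs = train(references, tap)
unknown = [[0.0], [1.0], [1.0], [2.0], [3.0], [3.0]]
result = recognize(unknown, tap, aligned_refs)
```

With this structure, recognition runs one DTW per unknown word regardless of vocabulary size; the per-reference cost is a linear pass over sequences of TAP length.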