This paper studies speaker-independent isolated-word speech recognition based on mean-shift framing and a hybrid HMM/SVM classifier. The proposed framework consists of two main units: a preprocessing unit and a classification unit. The first unit segments the speech signal into suitable frames using the mean-shift gradient clustering algorithm and extracts relevant time-frequency features in a way that maximizes the relative entropy of the time-frequency energy distribution among segments. The second unit then assigns each word to its class: a self-adaptive HMM computes the likelihood of the word under each existing class, and a support vector machine (SVM) classifies the word using the likelihoods of all classes as its input vector. To validate the accuracy and stability of the method, it was evaluated on the TULIPS1 dataset in the presence of different kinds of additive noise provided by SPIB. Comparing the results with those of the previous paper shows a 3.2% improvement.
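The classification stage described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes discrete observation symbols and plain (not self-adaptive) HMMs, and it covers only the step that turns one utterance into a vector of per-class log-likelihoods; that vector is what an SVM would take as input. The function names, the two toy word models, and the observation sequence are all hypothetical.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, using the forward algorithm."""
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    # Recursion: sum over predecessor states, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)


def likelihood_vector(obs, models):
    """One log-likelihood per class model, in sorted class-name order."""
    return [forward_log_likelihood(obs, *models[name]) for name in sorted(models)]


# Hypothetical toy models: (start, transition, emission) for two word classes,
# each with two left-to-right states and a two-symbol alphabet.
models = {
    "word_a": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "word_b": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}

observation = [0, 0, 1, 1]  # hypothetical quantised feature sequence
features = likelihood_vector(observation, models)
print(features)  # this vector would be the SVM input for the utterance
```

In a full system, one such vector would be computed for every training utterance, and an SVM would be trained on those vectors with the word labels as targets. Working in the log domain avoids underflow on long frame sequences.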