Definition of VAD reference using different HMM topologies and frame dropping strategy

Abstract:

This paper presents the segmentation of the Aurora 2 database using three different types of models. The segmentation is based on speech recognition results obtained in tests on the Aurora 2 database. Three types of tests are performed. In the first test the speech units are words (16-state HMMs), and in the second test the speech units are monophones (3-state HMMs). In both of these tests the silence unit is a 3-state hidden Markov model. In the third test the speech and silence units each consist of a single state, where one state represents a duration of 10 ms. The best procedure for creating the VAD reference is selected by comparing speech recognition accuracy, the number of correctly recognized words, and the number of inserted words under a frame dropping strategy. The best speech recognition accuracy is achieved with monophone speech units, because this configuration yields the smallest number of inserted words.
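The evaluation criteria named in the abstract can be illustrated with a minimal sketch. It assumes that frame dropping means discarding frames the VAD reference labels as non-speech, and that the scores follow the common HTK-style definitions (percent correct ignores insertions, accuracy penalizes them). The abstract does not state these formulas, and the function names `frame_drop` and `recognition_scores` are hypothetical.

```python
def frame_drop(frames, vad_labels):
    """Keep only the frames whose VAD reference label marks speech.

    Assumption: one label per frame, truthy for speech, falsy for non-speech.
    """
    if len(frames) != len(vad_labels):
        raise ValueError("frames and vad_labels must have the same length")
    return [frame for frame, is_speech in zip(frames, vad_labels) if is_speech]


def recognition_scores(n_ref, deletions, substitutions, insertions):
    """Return (percent correct, percent accuracy) for a recognition test.

    Assumption: HTK-style scoring over n_ref reference words.
    Correct ignores insertions; accuracy subtracts them.
    """
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    hits = n_ref - deletions - substitutions
    correct = 100.0 * hits / n_ref
    accuracy = 100.0 * (hits - insertions) / n_ref
    return correct, accuracy


# Illustrative values only, not results from the paper.
kept = frame_drop([10, 11, 12, 13], [0, 1, 1, 0])
correct, accuracy = recognition_scores(100, deletions=2, substitutions=3, insertions=5)
```

Under these definitions, two segmentations with the same number of correctly recognized words can still differ in accuracy when one of them produces more inserted words, which is the effect the abstract attributes to the choice of speech units.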
Date of Conference: 16-18 June 2011
Date Added to IEEE Xplore: 08 August 2011
Conference Location: Sarajevo, Bosnia and Herzegovina