Abstract:
An open-vocabulary keyword spotting (KWS) system allows users to customize wake words, but its application is limited by model size. In this paper, we design a dynamic acoustic model with input-dependent parameters. We find that acoustic frames with similar pronunciation generate similar subnetworks, and that different parameters contribute to recognizing different phonemes. Based on this observation, we further constrain the structural similarity among subnetworks that share the same phoneme pseudo-label, so that independent subnetworks, each recognizing a different phoneme, can be pruned out. When used in an end-to-end KWS system, the subnetworks that recognize the phonemes in the keyword are combined into a keyword-specific acoustic model, and the parameters that do not contribute to recognizing the keyword are pruned off. Experiments demonstrate that the proposed method can prune more than 80% of the parameters without performance loss.
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 04-10 June 2023
Date Added to IEEE Xplore: 05 May 2023