
Structured discriminative models using deep neural-network features

Abstract:

State-of-the-art speech recognisers employ neural networks in various configurations. A standard (hybrid) speech recogniser computes the likelihood for one time frame and state, using only one out of thousands of possible neural-network outputs. However, the whole output vector carries information. In this paper, features from state-of-the-art speech recognisers are collected per phone given a particular context, and input to a discriminative log-linear model. The log-linear model is trained with conditional maximum likelihood or a large-margin criterion. A key element is the prior on the parameters of the log-linear model. The mean of the prior is set to the point where the performance of the original systems is attained. The log-linear model then provides an additional increase over the state-of-the-art performance of the individual systems.
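The role of the prior can be illustrated with a small sketch. This is not the paper's implementation: it assumes a flat, frame-level multiclass log-linear model rather than the structured per-phone model the abstract describes, trains only the conditional maximum likelihood criterion with plain gradient descent, and uses synthetic data. The names `objective`, `gradient`, `train`, `prior_mean` and `prior_weight` are hypothetical. The features are taken to be the full vector of log-posteriors from a network, so an identity weight matrix scores class k by output k alone, which reproduces the baseline decision. Centring a Gaussian prior there means training starts from, and is pulled back towards, the original system.

```python
import numpy as np

def log_softmax(scores):
    """Row-wise log-softmax, shifted for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def objective(W, X, y, prior_mean, prior_weight):
    """Mean negative conditional log-likelihood plus a Gaussian prior term
    centred on prior_mean (assumed form, not taken from the paper)."""
    log_post = log_softmax(X @ W.T)
    nll = -log_post[np.arange(len(y)), y].mean()
    penalty = 0.5 * prior_weight * np.sum((W - prior_mean) ** 2)
    return nll + penalty

def gradient(W, X, y, prior_mean, prior_weight):
    """Gradient of objective() with respect to W."""
    post = np.exp(log_softmax(X @ W.T))
    post[np.arange(len(y)), y] -= 1.0  # posterior minus one-hot target
    return post.T @ X / len(y) + prior_weight * (W - prior_mean)

def train(X, y, prior_mean, prior_weight=0.1, step=0.05, iters=200):
    """Plain gradient descent, initialised at the prior mean."""
    W = prior_mean.copy()
    for _ in range(iters):
        W -= step * gradient(W, X, y, prior_mean, prior_weight)
    return W

# Synthetic stand-in for network outputs: 200 frames, 4 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 4))
X = log_softmax(logits)  # full log-posterior vector per frame
y = np.argmax(logits + rng.normal(scale=0.5, size=logits.shape), axis=1)

# Identity weights use only output k for class k, i.e. the baseline decision.
prior_mean = np.eye(4)
W = train(X, y, prior_mean)

print("objective at prior mean:", objective(prior_mean, X, y, prior_mean, 0.1))
print("objective after training:", objective(W, X, y, prior_mean, 0.1))
```

Because the weights start at the prior mean and the step size is small, the objective cannot end up worse than at the baseline point in this toy setting; the off-diagonal weights are what let the model draw on the rest of the output vector.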

Published in: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)

Date of Conference: 13-17 December 2015

Date Added to IEEE Xplore: 11 February 2016

DOI: 10.1109/ASRU.2015.7404789

Conference Location: Scottsdale, AZ, USA
