How to train a discriminative front end with stochastic gradient descent and maximum mutual information | IEEE Conference Publication | IEEE Xplore