1. INTRODUCTION
Deep Neural Networks (DNNs) have become popular in the speech community over the last few years, showing significant gains over state-of-the-art GMM/HMM systems on a wide variety of small- and large-vocabulary tasks [1], [2], [3], [4]. However, one drawback of DNNs is that training is very slow, in part because DNNs can have far more parameters than GMMs [3], [4].