Maximum entropy language modeling and the smoothing problem

Authors:
Martin, S.C. (Lehrstuhl für Informatik VI, Technische Hochschule Aachen); Ney, H.; Hamacher, C.

This paper discusses various aspects of smoothing techniques in maximum entropy language modeling, a topic that is typically not addressed in the literature. The results can be summarized in four statements: 1) Straightforward maximum entropy models with nested features, e.g., tri-, bi-, and unigrams, result in unsmoothed relative-frequency models. 2) Maximum entropy models with nested features and discounted feature counts approximate backing-off smoothed relative-frequency models with Kneser's advanced marginal back-off distribution. This explains some of the reported success of maximum entropy models in the past. 3) We give perplexity results for nested and nonnested features, e.g., trigrams and distance-trigrams, on a 4-million-word subset of the Wall Street Journal Corpus. From these results we conclude that the smoothing method has more effect on the perplexity than the method used to combine the different types of features. 4) We show perplexity results for nonnested features using log-linear interpolation of conventionally smoothed language models, giving evidence that this approach may be a first step toward overcoming the smoothing problem in the context of maximum entropy.
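To make the fourth point more concrete, here is a minimal sketch of log-linear interpolation in its commonly used form, where the combined probability is proportional to the product of the component probabilities, each raised to a weight, and is then renormalized over the vocabulary. The abstract does not give the paper's exact formulation, so this form is an assumption. The function name `log_linear_interpolate`, the toy models `toy_trigram` and `toy_distance`, the vocabulary, and the weights are all hypothetical illustrations, not taken from the paper.

```python
import math

# Hypothetical toy vocabulary; a real model would use the full word list.
VOCAB = ["stocks", "bonds", "rose"]


def toy_trigram(word, history):
    """Stand-in for a conventionally smoothed trigram model p1(w | h).

    The numbers are invented; smoothing is assumed to make every value > 0.
    """
    return {"stocks": 0.5, "bonds": 0.3, "rose": 0.2}[word]


def toy_distance(word, history):
    """Stand-in for a smoothed distance-trigram model p2(w | h) (invented numbers)."""
    return {"stocks": 0.2, "bonds": 0.2, "rose": 0.6}[word]


def log_linear_interpolate(models, weights, vocab, history):
    """Return p(w | h) proportional to prod_i p_i(w | h) ** weights[i].

    Each model must assign a strictly positive probability to every word
    in vocab, which is why the components need to be smoothed beforehand.
    """
    # Weighted sum of log-probabilities for each candidate word.
    scores = {
        w: sum(lam * math.log(m(w, history)) for m, lam in zip(models, weights))
        for w in vocab
    }
    # Normalize with log-sum-exp for numerical stability.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {w: math.exp(s - log_z) for w, s in scores.items()}


if __name__ == "__main__":
    combined = log_linear_interpolate(
        [toy_trigram, toy_distance], [0.5, 0.5], VOCAB, ("the", "market")
    )
    for word, prob in combined.items():
        print(f"{word}: {prob:.4f}")
```

Because the normalization runs over the whole vocabulary for every history, this combination costs more than linear interpolation. The sketch also leaves open how the weights are chosen; in practice they would be tuned on held-out data.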

Published in:

IEEE Transactions on Speech and Audio Processing (Volume: 8, Issue: 5)