Efficient sampling and feature selection in whole sentence maximum entropy language models

2 Author(s)
Chen, S.F.; Rosenfeld, R. (Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA)

Conditional maximum entropy models have been successfully applied to estimating language model probabilities of the form P(w|h), but are often too demanding computationally. Furthermore, the conditional framework does not lend itself to expressing global sentential phenomena. We have previously introduced a non-conditional maximum entropy language model which directly models the probability of an entire sentence or utterance. The model treats each utterance as a “bag of features”, where features are arbitrary computable properties of the sentence. Using the model is computationally straightforward since it does not require normalization. Training the model requires efficient sampling of sentences from an exponential distribution. In this paper, we further develop the model and demonstrate its feasibility and power. We compare the efficiency of several sampling techniques, implement smoothing to accommodate rare features, and suggest an efficient algorithm for improving the convergence rate. We then present a novel procedure for feature selection, which exploits discrepancies between the existing model and the training corpus. We demonstrate our ideas by constructing and analyzing competitive models in the Switchboard domain.
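
For readers who want the model's shape in concrete terms: the whole-sentence maximum entropy model described above scores a sentence s as P(s) proportional to p0(s) * exp(sum_i lambda_i * f_i(s)), where p0 is a base model (e.g. an n-gram) and the f_i are arbitrary sentence-level features. The Python sketch below illustrates this form only; the base model, features, weights, and function names are hypothetical placeholders, not those used in the paper. It also shows why an independence Metropolis-style sampler that proposes whole sentences from p0 needs only the feature-weight sums, since p0 cancels in the acceptance ratio; this is just one of several possible sampling schemes, not necessarily the one the authors favor.

import math

def unnormalized_log_score(sentence, base_logprob, features, weights):
    """Whole-sentence MaxEnt log-score up to the normalization constant Z:
    log p0(s) + sum_i lambda_i * f_i(s)."""
    return base_logprob(sentence) + sum(
        w * f(sentence) for f, w in zip(features, weights)
    )

def independence_mh_accept_prob(current, proposal, features, weights):
    """Acceptance probability for an independence Metropolis-Hastings step that
    proposes whole sentences from the base model p0: p0 cancels in the ratio,
    so only the feature-weight sums of the two sentences are needed."""
    def weighted_sum(s):
        return sum(w * f(s) for f, w in zip(features, weights))
    return min(1.0, math.exp(weighted_sum(proposal) - weighted_sum(current)))

# Toy stand-ins (hypothetical, not the paper's base model or feature set):
def toy_base_logprob(sentence):
    return -2.0 * len(sentence)            # flat per-word cost in place of an n-gram model

toy_features = [
    lambda s: float(len(s) > 8),           # sentence-length indicator
    lambda s: float("uh" in s),            # disfluency indicator
]
toy_weights = [0.5, -1.2]

for text in ("i think uh we should go now", "we should go now"):
    tokens = text.split()
    score = unnormalized_log_score(tokens, toy_base_logprob, toy_features, toy_weights)
    print(f"{text!r}: {score:.2f}")

print(independence_mh_accept_prob(
    "we should go now".split(), "i think uh we should go now".split(),
    toy_features, toy_weights))

Ranking or rescoring hypotheses needs only these unnormalized scores, which is why the abstract notes that using the model requires no normalization; it is parameter estimation that calls for sampling sentences from the exponential distribution.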

Published in:

Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (Volume 1)

Date of Conference:

15-19 Mar 1999