We present a technique that improves the Kneser-Ney smoothing algorithm for bigrams on small data sets, and we develop a numerical algorithm that computes the parameters of the heuristic formula with a correction. We motivate the corrected formula with a simple example and, using the same example, show the difficulties one may run into with the numerical algorithm. Applying the algorithm to test data, we show how the new formula improves cross-entropy.
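For orientation, the baseline being improved can be sketched as follows. This is a minimal illustration of standard interpolated Kneser-Ney smoothing for bigrams, together with the cross-entropy measure used for evaluation; it is not the corrected formula or the numerical algorithm of the paper, which the abstract does not spell out. The function names `train_kn_bigram` and `cross_entropy`, the toy corpus, and the fixed discount of 0.75 are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_kn_bigram(tokens, discount=0.75):
    """Return prob(w, v) = P(w | v) under standard interpolated Kneser-Ney.

    Assumption: a single fixed discount (0.75 is a common default),
    not a parameter fitted by the paper's numerical algorithm.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_count = Counter()      # c(v): total count of bigrams starting with v
    followers = defaultdict(set)   # distinct words seen after v
    preceders = defaultdict(set)   # distinct words seen before w
    for (v, w), c in bigrams.items():
        context_count[v] += c
        followers[v].add(w)
        preceders[w].add(v)
    n_bigram_types = len(bigrams)

    def prob(w, v):
        # Continuation probability: share of bigram types that end in w.
        p_cont = len(preceders.get(w, ())) / n_bigram_types
        c_v = context_count[v]
        if c_v == 0:
            return p_cont  # unseen context: fall back to the continuation model
        discounted = max(bigrams[(v, w)] - discount, 0.0) / c_v
        backoff_weight = discount * len(followers.get(v, ())) / c_v
        return discounted + backoff_weight * p_cont

    return prob


def cross_entropy(prob, tokens):
    """Average negative log2 probability per bigram, in bits.

    Raises ValueError if a test bigram receives probability zero,
    which happens for words never seen as the second word of a
    training bigram.
    """
    pairs = list(zip(tokens, tokens[1:]))
    return -sum(math.log2(prob(w, v)) for v, w in pairs) / len(pairs)


train = "the cat sat on the mat the cat ate".split()
held_out = "the cat sat on the mat".split()
p = train_kn_bigram(train)
print(cross_entropy(p, held_out))
```

On a corpus this small, most bigram counts are 1, so a large share of each context's probability mass is moved to the continuation distribution. That is the small-data regime the abstract refers to.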
IEEE Transactions on Audio, Speech, and Language Processing (Volume: 15, Issue: 6)
Date of Publication: Aug. 2007