In the last decade, stochastic grammars have become an important probabilistic tool for modeling biological sequences. The inside-outside (IO) algorithm is the most popular estimator for learning the probability parameters associated with the grammar rules from training sequences. The IO algorithm requires O(L^3 M^3) operations, where L is the length of the training sequence and M is the number of non-terminals. In each training iteration, it also computes a number of useless inside variables (α-variables) and useless outside variables (β-variables). In this paper we give a method that avoids computing these useless variables by means of a prediction function. We give an example based on RNA sequences (taken from [Y. Sakakibara, et al., 1994]) to illustrate the percentage of useless variables that our method avoids.
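To make the O(L^3 M^3) cost concrete, the following is a minimal sketch of the inside pass only, assuming a grammar in Chomsky normal form. It is not the method of this paper: the prediction function is not reproduced, and the toy grammar, the names `inside`, `unary` and `binary`, and the reading of "useless" as "an α-variable whose value is zero" are all assumptions made for illustration.

```python
def inside(seq, nonterminals, unary, binary):
    """Inside (alpha) variables for a stochastic CFG in Chomsky normal form.

    unary[(A, a)]     = P(A -> a)
    binary[(A, B, C)] = P(A -> B C)
    Returns alpha[(i, j, A)] = P(A derives seq[i..j]), 0-based, inclusive.
    """
    n = len(seq)
    # Allocate every variable up front so that zero-valued ones can be counted.
    alpha = {(i, j, A): 0.0
             for i in range(n) for j in range(i, n) for A in nonterminals}

    # Spans of length 1 come from terminal rules.
    for i, a in enumerate(seq):
        for A in nonterminals:
            alpha[(i, i, A)] = unary.get((A, a), 0.0)

    # Longer spans: O(L^2) spans x O(L) split points x up to M^3 binary rules.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for (A, B, C), p in binary.items():
                total = 0.0
                for k in range(i, j):
                    total += alpha[(i, k, B)] * alpha[(k + 1, j, C)]
                alpha[(i, j, A)] += p * total
    return alpha


# Hypothetical toy grammar generating a^n u^n (not taken from the paper).
NONTERMINALS = ["S", "X", "A", "U"]
UNARY = {("A", "a"): 1.0, ("U", "u"): 1.0}
BINARY = {("S", "A", "X"): 0.5, ("S", "A", "U"): 0.5, ("X", "S", "U"): 1.0}

alpha = inside("aauu", NONTERMINALS, UNARY, BINARY)
nonzero = sum(1 for v in alpha.values() if v > 0.0)
print(alpha[(0, 3, "S")], nonzero, len(alpha))
```

On this toy input only 7 of the 40 α-variables are nonzero, which gives a rough sense of how much of the table a plain implementation fills for nothing; the outside (β) pass has the same structure and the same cost.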
Published in:
Machine Learning and Cybernetics, 2003 International Conference on
(Volume: 4)
Date of Conference: 2-5 Nov. 2003