This paper introduces the first spoken dialogue system developed for the Persian language: a ticket reservation system with Persian ASR and NLU modules. The focus of the paper is on learning the dialogue management module. Real on-line training data are used during the learning process. As the second contribution of the research, the effect of varying the discount factor (γ) on the speed of on-line learning is investigated. The optimal values for γ were found, and the variation pattern of the action-value function (Q) during the learning process was obtained. In addition, a probabilistic policy for selecting actions is used in this work for the first time, in place of the greedy policies employed in previous works.
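To make the roles of γ, Q, and a probabilistic action-selection policy concrete, the following is a minimal sketch, not the system described in the paper. It assumes tabular one-step Q-learning and a softmax (Boltzmann) policy, since the abstract does not name the specific learning rule or the form of the probabilistic policy. The toy slot-filling task, the reward values, the 0.8 recognition success rate, and all function and parameter names (`softmax_choice`, `q_update`, `train`, `temperature`) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax_choice(q_values, temperature, rng):
    """Sample an action index with probability proportional to exp(Q / temperature)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


def q_update(q, state, action, reward, next_state, done, alpha, gamma):
    """One-step Q-learning backup: target = r + gamma * max_a' Q(s', a')."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def train(gamma, episodes=2000, n_slots=3, alpha=0.1, temperature=1.0, seed=0):
    """Learn Q on a toy ticket-booking task (hypothetical, for illustration only).

    States 0..n_slots count the slots filled so far.
    Action 0 asks for the next slot; action 1 attempts to book the ticket.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_slots + 1)]
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on dialogue length
            action = softmax_choice(q[state], temperature, rng)
            if action == 0:
                # Assumed stand-in for ASR/NLU errors: the slot is filled 80% of the time.
                filled = rng.random() < 0.8
                next_state = min(state + 1, n_slots) if filled else state
                reward, done = -1.0, False  # per-turn cost
            else:
                # Booking succeeds only when every slot is filled.
                reward = 20.0 if state == n_slots else -10.0
                next_state, done = state, True
            q_update(q, state, action, reward, next_state, done, alpha, gamma)
            if done:
                break
            state = next_state
    return q
```

Calling `train` with several values of `gamma` (for example 0.5, 0.9, and 0.99) and recording the Q-table at intervals is one way to observe how the discount factor changes the speed at which Q settles. The softmax policy keeps a nonzero probability on every action, unlike a greedy policy that always picks the action with the highest current Q.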