The generalization performance of a learned Bayesian network largely depends on the quality of the prior provided to the learning machine. The prior distribution is designed to encode a domain expert's knowledge of the network parameters as initial counts, while tolerating some variance around them; the learning task then combines these initial counts with the statistics of the data. The prior becomes even more critical when data is scarce. One essential problem in specifying a Dirichlet prior (commonly used in maximum a posteriori estimation) is that its parameters are unintuitive, so it is often impossible for a domain expert to specify them accurately. Consequently, in practice, the parameters of a Dirichlet prior are either assigned randomly or set equal, yielding a prior that is either non-informative or statistically inaccurate. When data is scarce, such an inaccurate prior introduces additional bias into the selection of the single best model. On the other hand, qualitative information is usually available from many resources in a domain, such as domain experts, the literature, and databases. This knowledge typically contains validated information describing qualitative relations, e.g., inequality or logical relations, among entities in the domain. This type of qualitative knowledge has largely been ignored in Bayesian network learning because it lacks the quantitative form that a learning machine requires. In this paper, we propose a novel framework for learning the parameters of a Bayesian network by integrating generic qualitative domain knowledge with quantitative data. We first translate qualitative information into mathematical formulations, e.g., constraints.
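As an illustration of such a translation (a minimal sketch, not the paper's notation): a qualitative statement like "B = 1 makes A = 1 more likely" becomes an inequality over conditional probability table (CPT) entries. The variable names and the predicate below are hypothetical.

```python
# A qualitative influence "B raises the probability of A" translated into
# an inequality constraint over CPT entries.
# theta[b] holds P(A=1 | B=b); the names here are illustrative.

def satisfies_constraint(theta):
    """True iff the qualitative relation P(A=1|B=1) > P(A=1|B=0) holds."""
    return theta[1] > theta[0]

print(satisfies_constraint({0: 0.2, 1: 0.7}))  # constraint holds -> True
print(satisfies_constraint({0: 0.6, 1: 0.4}))  # constraint violated -> False
```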
We then employ Monte Carlo sampling to reconstruct a quantitative prior distribution from the qualitative domain knowledge, and design a novel score that combines this prior distribution with the data statistics. We test our algorithm (QMAP) both on a genetic regulatory network from computational biology and on a facial Action Unit recognition network from computer vision. The learning results show that i) even with very generic qualitative domain knowledge, QMAP drastically outperforms ML and MAP estimation, and ii) QMAP achieves surprisingly good estimates even with very scarce data, dramatically reducing its dependence on the amount of training data.
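The two-stage idea can be sketched as follows: sample candidate parameters, keep those satisfying the qualitative constraint to reconstruct a prior, then fold the resulting pseudo-counts into a Dirichlet-style MAP estimate. This is a simplified stand-in under assumed conventions (uniform rejection sampling, a single binary variable A with binary parent B, and an illustrative equivalent sample size), not the authors' actual QMAP score.

```python
import random

def sample_constrained_prior(constraint, n_samples=10000, seed=0):
    """Rejection-sample pairs theta = {b: P(A=1|B=b)} uniformly and keep
    those satisfying the qualitative constraint; the mean of the accepted
    samples serves as a reconstructed prior."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_samples:
        theta = {0: rng.random(), 1: rng.random()}
        if constraint(theta):
            kept.append(theta)
    return {b: sum(t[b] for t in kept) / len(kept) for b in (0, 1)}

def map_estimate(counts, prior_mean, equiv_size=10):
    """Combine data counts with prior pseudo-counts, Dirichlet-style.
    counts[b] = (n_A1, n_A0): observations of A with parent state B=b.
    equiv_size is an illustrative equivalent sample size for the prior."""
    est = {}
    for b, (n1, n0) in counts.items():
        a1 = equiv_size * prior_mean[b]  # pseudo-count for A=1
        est[b] = (n1 + a1) / (n1 + n0 + equiv_size)
    return est

# Prior reconstructed from the constraint P(A=1|B=1) > P(A=1|B=0),
# then combined with (hypothetical) observed counts.
prior = sample_constrained_prior(lambda t: t[1] > t[0])
est = map_estimate({0: (1, 4), 1: (3, 2)}, prior)
```

With scarce counts as above, the estimate is pulled toward the constraint-respecting prior, which is the mechanism the abstract credits for QMAP's robustness in the low-data regime.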