I. Introduction
Several important classes of probability distributions studied in statistical physics, coding theory, and machine learning can be succinctly represented as factor graphs [32], [48]. Informally, they provide a succinct way to describe multivariate functions by specifying variables and relations between them in a form of a hypergraph [27]. In this context, of interest are the inference problem of estimating marginal probabilities of certain variables and the problem of estimating the partition function of such a factor graph. In computer vision one applies such inference primitives to learn about objects in a stage being captured by several cameras [20]. They are also essential components for decoding algorithms for low-density parity check codes [21], [42]. In statistical physics, factor graphs are used to model physical systems, for instance a set of particles – in such a setting the energy of a configuration is inversely proportional to the probability at which it occurs, thus intuitively, inference problems on such factor graphs correspond to learning “typical configurations” of a given system (see the book [32]).