In this paper, we propose a fast mean-field method, called LHMF, for probabilistic models of large-scale data in high-dimensional space. Using the diffusion map locally linear embedding method, a non-linear dimensionality reduction method, we first embed the high-dimensional data into a low-dimensional space. We then construct, by clustering, a coarse-grained graph that preserves the spectral properties of the original weighted graph in the high-dimensional space. A new spin model is defined in the diffusion space, with the geometric centroids of the clusters serving as its variables. Running mean-field methods on this coarse-grained spin model greatly reduces the computational cost. The final marginal moments of the original variables are derived from the states of the geometric centroids by using geometric harmonics. We first tested the proposed method on the MNIST handwritten digits dataset. Experimental results show that the LHMF method is competitive with the consistency approach, a state-of-the-art semi-supervised learning method. We then applied the proposed method to a large-scale colonic polyp dataset from computed tomography (CT) scans. Free-response receiver operating characteristic analysis shows that our method achieves higher sensitivity at a lower false-positive rate than support vector machines.
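To illustrate why coarse-graining lowers the cost of the mean-field step, the following is a minimal sketch of a naive mean-field iteration on a small Ising-type spin model. It is not the LHMF implementation: the abstract does not specify the form of the spin model, so the coupling matrix `J`, the external fields `h`, the damping scheme, and the function name `naive_mean_field` are all assumptions made for illustration. Each sweep costs on the order of n² operations for n spins, so running the same iteration on a small number of cluster centroids rather than on every data point would shrink the work accordingly.

```python
import math


def naive_mean_field(J, h, n_iter=500, tol=1e-12, damping=0.5):
    """Damped naive mean-field iteration for an Ising-type model (illustrative only).

    Iterates m_i <- (1 - damping) * m_i + damping * tanh(h_i + sum_j J[i][j] * m_j)
    until the largest change in any magnetization falls below `tol`.

    J: symmetric coupling matrix as a list of lists (diagonal ignored).
    h: list of external fields, e.g. nonzero only on labeled points.
    """
    n = len(h)
    m = [0.0] * n
    for _ in range(n_iter):
        new_m = []
        for i in range(n):
            # Local field on spin i from its neighbours' current magnetizations.
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            new_m.append((1.0 - damping) * m[i] + damping * math.tanh(field))
        max_change = max(abs(a - b) for a, b in zip(new_m, m))
        m = new_m
        if max_change < tol:
            break
    return m


# Hypothetical toy graph: spins 0 and 3 act as labeled points (fields +1 and -1),
# spins 1 and 2 are unlabeled and pick up a sign from their stronger neighbour.
J = [
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
]
h = [1.0, 0.0, 0.0, -1.0]

m = naive_mean_field(J, h)
print([round(v, 3) for v in m])
```

In this toy setting the signs of the magnetizations on the unlabeled spins play the role of predicted labels, which is how a mean-field solution can be read in a semi-supervised context.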