<![CDATA[ IEEE Transactions on Knowledge and Data Engineering - new TOC ]]>
http://ieeexplore.ieee.org
TOC Alert for Publication# 69 2018 April 26
<![CDATA[A Unified View of Social and Temporal Modeling for B2B Marketing Campaign Recommendation]]>30 5 810 823 1020
<![CDATA[Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs]]>30 5 824 837 2237
<![CDATA[Density-Based Place Clustering Using Geo-Social Network Data]]>30 5 838 851 2483
<![CDATA[Diagnosing and Minimizing Semantic Drift in Iterative Bootstrapping Extraction]]>30 5 852 865 772
<![CDATA[Efficient and Scalable Integrity Verification of Data and Query Results for Graph Databases]]>30 5 866 879 1091
<![CDATA[Efficient Information Flow Maximization in Probabilistic Graphs]]>F-tree as a specialized data structure, that identifies independent components of the graph for which the information flow can either be computed analytically and efficiently, or for which traditional Monte-Carlo sampling can be applied independently of the remaining network. For the problem of finding the optimal edges, we propose a series of heuristics that exploit properties of this data structure. Our evaluation shows that these heuristics lead to high quality solutions, thus yielding high information flow, while maintaining low running time.]]>30 5 880 894 1390
<![CDATA[FBSGraph: Accelerating Asynchronous Graph Processing via Forward and Backward Sweeping]]>Asynchronous Graph Processing (AGP) is becoming a promising model to support graph algorithm on large-scale distributed computing platforms because it enables faster convergence speed and lower synchronization cost than the synchronous model for no barrier between iterations. However, existing AGP methods still suffer from poor performance for inefficient vertex state propagation. 
In this paper, we propose an effective and low-cost forward and backward sweeping execution method to accelerate state propagation for AGP, based on a key observation that states in AGP can be propagated between vertices much faster when the vertices are processed sequentially along the graph path within each round. Through dividing graph into paths and asynchronously processing vertices on each path in an alternative forward and backward way according to their order on this path, vertex states in our approach can be quickly propagated to other vertices and converge in a faster way with only little additional overhead. In order to efficiently support it over distributed platforms, we also propose a scheme to reduce the communication overhead along with a static priority ordering scheme to further improve the convergence speed. Experimental results on a cluster with 1,024 cores show that our approach achieves excellent scalability for large-scale graph algorithms and the overall execution time is reduced by at least 39.8 percent, in comparison with the most cutting-edge methods.]]>30 5 895 907 2554
<![CDATA[Game-Theoretic Cross Social Media Analytic: How Yelp Ratings Affect Deal Selection on Groupon?]]>30 5 908 921 860
<![CDATA[Index-Based Densest Clique Percolation Community Search in Networks]]>$k$-clique percolation community model was proposed and has been proven effective in many applications. Motivated by this, in this paper, we adopt the $k$-clique percolation community model and study the densest clique percolation community search problem which aims to find the $k$-clique percolation community with the maximum $k$ value that contains a given set of query nodes. We adopt an index-based approach to solve this problem. Based on the observation that a $k$-clique percolation community is a union of maximal cliques, we devise a novel compact index, $\mathsf{DCPC}$-$\mathsf{Index}$, to preserve the maximal cliques and their connectivity information of the input graph. With $\mathsf{DCPC}$-$\mathsf{Index}$, we can answer the densest clique percolation community query efficiently. Besides, we also propose an index construction algorithm based on the definition of $\mathsf{DCPC}$-$\mathsf{Index}$ and further improve the algorithm in terms of efficiency and memory consumption. We conduct extensive performance studies on real graphs and the experimental results demonstrate the efficiency of our index-based query processing algorithm and index construction algorithm.]]>30 5 922 935 1054
<![CDATA[$K$-Ary Tree Hashing for Fast Graph Classification]]>$K$-ary trees. Based on the traversal table, KATH employs a recursive indexing process that performs only $r$ times of matrix indexing to generate all $(r-1)$-depth $K$-ary trees, where the leaf node labels of a tree can uniquely specify the pattern. After that, the MinHash scheme is used to fingerprint the acquired subtree patterns for a graph. Our experimental results on both real world and synthetic data sets show that KATH runs significantly faster than state-of-the-art methods while achieving competitive or better accuracy.]]>30 5 936 949 1316
<![CDATA[Minority Oversampling in Kernel Adaptive Subspaces for Class Imbalanced Datasets]]>30 5 950 962 983
<![CDATA[Range-Based Nearest Neighbor Queries with Complex-Shaped Obstacles]]>range-based obstructed nearest neighbor (RONN) search. As a natural generalization of continuous obstructed nearest-neighbor (CONN), an RONN query retrieves a set of obstructed nearest neighbors corresponding to every point in a specified range. We propose a new index, namely binary obstructed tree (called OB-tree), for indexing complex objects in the obstructed space. The novelty of OB-tree lies in the idea of dividing the obstructed space into non-obstructed subspaces, aiming to efficiently retrieve highly qualified candidates for RONN processing. 
We develop an algorithm for construction of the OB-tree and propose a space division scheme, called optimal obstacle balance (OOB2) scheme, to address the tree balance problem. Accordingly, we propose an efficient algorithm, called RONN by OB-tree Acceleration (RONN-OBA), which exploits the OB-tree and a binary traversal order of data objects to accelerate query processing of RONN. In addition, we extend our work in several aspects regarding the shape of obstacles, and range-based $k$NN queries in obstructed space. At last, we conduct a comprehensive performance evaluation using both real and synthetic datasets to validate our ideas and the proposed algorithms. The experimental result shows that the RONN-OBA algorithm outperforms the two R-tree based algorithms and RONN-OA significantly.]]>30 5 963 977 1815
<![CDATA[Robust Prototype-Based Learning on Data Streams]]>30 5 978 991 1634
<![CDATA[UniWalk: Unidirectional Random Walk Based Scalable SimRank Computation over Large Graph]]>$k$ SimRank computation over large undirected graphs. UniWalk directly locates the top-$k$ similar vertices for any single source vertex $u$ via $R$ sampling paths originating from $u$, which avoids selecting candidate vertex set $\mathcal{C}$ and the following $O(|\mathcal{C}|R)$ bidirectional sampling paths. We also devise a path enumeration strategy to improve the SimRank precision by using path probabilities instead of path frequencies when sampling, a space-efficient method to reduce intermediate results, and a path-sharing strategy to lower the redundant path sampling cost for multiple source vertices. Furthermore, we extend UniWalk to existing distributed graph processing frameworks to improve its scalability. We conduct extensive experiments to illustrate that UniWalk has high scalability, and outperforms the state-of-the-art methods by orders of magnitude.]]>30 5 992 1006 1180