<![CDATA[ IEEE Transactions on Knowledge and Data Engineering - new TOC ]]>
http://ieeexplore.ieee.org
TOC Alert for Publication# 69 2018 March 19
<![CDATA[A Novel Representation and Compression for Queries on Trajectories in Road Networks]]>30 4 613 629 1720
<![CDATA[Authenticating Aggregate Queries over Set-Valued Data with Confidentiality]]>30 4 630 644 1223
<![CDATA[Bidding Machine: Learning to Bid for Directly Optimizing Profits in Display Advertising]]>Bidding Machine, a comprehensive learning-to-bid framework, which consists of three optimizers dealing with each challenge above, and as a whole, jointly optimizes these three parts. We show that such a joint optimization would substantially increase the campaign effectiveness and the profit. From the learning perspective, we show that the bidding machine can be updated smoothly with either offline periodic batch or online sequential training schemes. Our extensive offline empirical study and online A/B testing verify the high effectiveness of the proposed bidding machine.]]>30 4 645 659 1146
<![CDATA[Community Deception or: How to Stop Fearing Community Detection Algorithms]]>$\mathcal{C}$) from community detection algorithms. This need emerges whenever a group (e.g., activists, law enforcement, or network participants in general) wants to observe and cooperate in a social network while avoiding detection. We introduce and formalize the community deception problem and devise an efficient algorithm that achieves deception by identifying a certain number ($\beta$) of connections of $\mathcal{C}$’s members to be rewired. Deception can be practically achieved in social networks like Facebook by friending or unfriending network members as indicated by our algorithm. We compare our approach with another technique based on modularity. By considering a variety of (large) real networks, we provide a systematic evaluation of the robustness of community detection algorithms to deception techniques.
Finally, we open some challenging research questions about the design of detection algorithms robust to deception techniques.]]>30 4 660 673 1922
<![CDATA[Efficient Computation of G-Skyline Groups]]>minimum dominance search (MDS), to solve the g-skyline problem, a recent group-based skyline problem. MDS consists of two steps: In the first step, we construct a novel g-skyline support structure, i.e., the minimum dominance graph (MDG), which proves to be a minimum g-skyline support structure. In the second step, we search for g-skyline groups based on the MDG through two searching algorithms, and a skyline-combination based optimization strategy is employed to improve these two algorithms. We conduct comprehensive experiments on both synthetic and real-world data sets, and show that our algorithms are orders of magnitude faster than the state of the art in most cases.]]>30 4 674 688 1136
<![CDATA[Janus: A Hybrid Scalable Multi-Representation Cloud Datastore]]>30 4 689 702 935
<![CDATA[Learning Dynamic Conditional Gaussian Graphical Models]]>$n^{4/5}$ for estimating a local graphical model when the bandwidth parameter $h$ of the kernel smoother is chosen as $h \asymp n^{-1/5}$ for describing the dynamics. Finally, extensive numerical experiments on both synthetic and real datasets are provided to support the effectiveness of the proposed method.]]>30 4 703 716 502
<![CDATA[Linguistic Petri Nets Based on Cloud Model Theory for Knowledge Representation and Reasoning]]>30 4 717 728 614
<![CDATA[Propagation-Based Temporal Network Summarization]]>NetCondense, a scalable and effective algorithm which solves this problem using careful transformations in sub-quadratic running time and linear space. Our extensive experiments show that we can reduce the size of large real temporal networks (from multiple domains such as social, co-authorship, and email) significantly without much loss of information.
We also show the wide applicability of NetCondense by leveraging it for several tasks: for example, we use it to understand, explore, and visualize the original datasets and also to speed up algorithms for the influence-maximization and event detection problems on temporal networks.]]>30 4 729 742 845
<![CDATA[Realizing Memory-Optimized Distributed Graph Processing]]>Pregel, Apache Giraph, and GraphX. However, the unprecedented scale now reached by real-world graphs makes graph processing harder due to excessive memory demands, even in distributed environments. By and large, such contemporary graph processing systems employ ineffective in-memory representations of adjacency lists. Therefore, memory usage patterns emerge as a primary concern in distributed graph processing. We seek to address this challenge by exploiting empirically observed properties demonstrated by graphs generated by human activity. In this paper, we propose 1) three compressed adjacency list representations that can be applied to any distributed graph processing system, 2) a variable-byte encoded representation of out-edge weights for space-efficient support of weighted graphs, and 3) a tree-based compact out-edge representation that allows for efficient mutations on the graph elements. We experiment with publicly available graphs whose size reaches two billion edges and report our findings in terms of both space efficiency and execution time. Our suggested compact representations reduce the memory requirements for accommodating the graph elements by up to 5 times compared with state-of-the-art methods.
At the same time, our memory-optimized methods retain the efficiency of uncompressed structures and enable the execution of algorithms for large-scale graphs in settings where contemporary alternative structures fail due to memory errors.]]>30 4 743 756 1033
<![CDATA[Reverse $k$ Nearest Neighbor Search over Trajectories]]>Reverse $k$ Nearest Neighbor Search over Trajectories ($\mathbf{R}{k}\mathbf{NNT}$), which can be used for route planning and capacity estimation. Given a set of existing routes $\mathcal{D}_{\mathcal{R}}$, a set of passenger transitions $\mathcal{D}_{\mathcal{T}}$, and a query route $Q$, an $\mathbf{R}{k}\mathbf{NNT}$ query returns all transitions that take $Q$ as one of their $k$ nearest travel routes. To solve the problem, we first develop an index to handle dynamic trajectory updates, so that the most up-to-date transition data are available for answering an $\mathbf{R}{k}\mathbf{NNT}$ query. Then we introduce a filter-refinement framework for processing $\mathbf{R}{k}\mathbf{NNT}$ queries using the proposed indexes. Next, we show how to use $\mathbf{R}{k}\mathbf{NNT}$ to solve the optimal route planning problem $\mathbf{MaxR}{k}\mathbf{NNT}$ ($\mathbf{MinR}{k}\mathbf{NNT}$), which is to search for the optimal route from a start location to an end location that could attract the maximum (or minimum) number of passengers based on a predefined travel distance threshold. Experiments on real datasets demonstrate the efficiency and scalability of our approaches. To the best of our knowledge, this is the first work to study the $\mathbf{R}{k}\mathbf{NNT}$ problem for route planning.]]>30 4 757 771 1802
<![CDATA[To Meet or Not to Meet: Finding the Shortest Paths in Road Networks]]> MPP) query, which consists of two pairs of source and destination and a user-specified weight $\alpha$ to balance the two different needs. The result is a pair of paths connecting the two sources and destinations respectively, with minimal overall cost of the two paths and the shortest route between them.
To solve MPP queries, we devise algorithms by enumerating node pairs. We adopt a location-based pruning strategy to reduce the number of node pairs for enumeration. An efficient algorithm based on point-to-point shortest path calculation is proposed to further improve query efficiency. We also give two fast approximate algorithms with approximation bounds. Extensive experiments are conducted to show the effectiveness and efficiency of our methods.]]>30 4 772 785 1002
<![CDATA[Topic Models for Unsupervised Cluster Matching]]>30 4 786 795 805
<![CDATA[Towards Why-Not Spatial Keyword Top-$k$ Queries: A Direction-Aware Approach]]>direction-aware spatial keyword query that aims to retrieve the top-$k$ objects that best match query parameters in terms of spatial distance and textual similarity in a given query direction. In some cases, it can be difficult for users to specify appropriate query parameters. After getting a query result, users may find some desired objects are unexpectedly missing and may therefore question the entire result. Enabling why-not questions in this setting may help users retrieve better results, thus improving the overall utility of the query functionality. This paper studies the direction-aware why-not spatial keyword top-$k$ query problem. We propose efficient query refinement techniques to revive missing objects by minimally modifying users’ direction-aware queries. We prove that the best refined query directions lie in a finite solution space for a special case and reduce the search for the optimal refinement to a linear programming problem for the general case. Extensive experimental studies demonstrate that the proposed techniques outperform a baseline method by two orders of magnitude and are robust in a broad range of settings.]]>30 4 796 809 1030