2016 IEEE 16th International Conference on Data Mining (ICDM)

12-15 Dec. 2016

  • [Front cover]

    Publication Year: 2016, Page(s): c1
    Freely Available from IEEE
  • [Title page i]

    Publication Year: 2016, Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Publication Year: 2016, Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2016, Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2016, Page(s): v - xvi
    Freely Available from IEEE
  • Message from the Conference General Chairs

    Publication Year: 2016, Page(s): xvii - xviii
    Freely Available from IEEE
  • Message from the Program Chairs

    Publication Year: 2016, Page(s): xix - xx
    Freely Available from IEEE
  • Organizing Committee

    Publication Year: 2016, Page(s): xxi - xxii
    Freely Available from IEEE
  • Program Committee

    Publication Year: 2016, Page(s): xxiii - xxx
    Freely Available from IEEE
  • Keynotes

    Publication Year: 2016, Page(s): xxxi - xxxvi

    Provides an abstract for each of the keynote presentations and may include a brief professional biography of each...
  • Auditing Black-Box Models for Indirect Influence

    Publication Year: 2016, Page(s): 1 - 10
    Cited by:  Papers (2)

    Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. It is therefore hard to acquire a deeper understanding of model behavior, and in particular how different features influence the model prediction. This is important when interpreting the behavior of complex models, or asserting that certain problematic attribute...
  • Asynchronous Multi-task Learning

    Publication Year: 2016, Page(s): 11 - 20

    Many real-world machine learning applications involve several learning tasks which are inter-related. For example, in the healthcare domain, we need to learn a predictive model of a certain disease for many hospitals. The models for each hospital may be different because of the inherent differences in the distributions of the patient populations. However, the models are also closely related because of...
  • Unsupervised Exceptional Attributed Sub-Graph Mining in Urban Data

    Publication Year: 2016, Page(s): 21 - 30
    Cited by:  Papers (2)

    Geo-located social media provide a wealth of information that describes urban areas based on user descriptions and comments. Such data make it possible to identify meaningful city neighborhoods on the basis of the footprints left by a large and diverse population that uses this type of media. In this paper, we present some methods to exhibit the predominant activities and their associated urban area...
  • ADAGIO: Fast Data-Aware Near-Isometric Linear Embeddings

    Publication Year: 2016, Page(s): 31 - 40

    Many important applications, including signal reconstruction, parameter estimation, and signal processing in a compressed domain, rely on a low-dimensional representation of the dataset that preserves all pairwise distances between the data points and leverages the inherent geometric structure that is typically present. Recently, Hegde, Sankaranarayanan, Yin and Baraniuk [19] proposed the first dat...
  • Causal Inference by Compression

    Publication Year: 2016, Page(s): 41 - 50

    Causal inference is one of the fundamental problems in science. In recent years, several methods have been proposed for discovering causal structure from observational data. These methods, however, focus specifically on numeric data, and are not applicable to nominal or binary data. In this work, we focus on causal inference for binary data. Simply put, we propose causal inference by compression. ...
  • On Dense Subgraphs in Signed Network Streams

    Publication Year: 2016, Page(s): 51 - 60

    Signed networks remain relatively underexplored despite the fact that many real networks are of this kind. Here, we study the problem of subgraph density in signed networks and show connections to the event detection task. Notions of density have been used in prior studies on anomaly detection, but all existing methods have been developed for unsigned networks. We develop the first algorithms for...
  • Relief of Spatiotemporal Accessibility Overloading with Optimal Resource Placement

    Publication Year: 2016, Page(s): 61 - 70

    With the effects of global warming, some mosquito-borne epidemic diseases, such as dengue fever and Zika virus, have become more serious. It is reported that such epidemics may pose many challenges to hospital management because outbreaks are unexpected and their causes uncertain. Furthermore, the imperfect care during the propagation of epidemic diseases, such as dengue ...
  • Mining Graphlet Counts in Online Social Networks

    Publication Year: 2016, Page(s): 71 - 80
    Cited by:  Papers (1)

    Counting subgraphs is a fundamental analysis task for online social networks (OSNs). Given the sheer size and restricted access of online social network data, efficient computation of subgraph counts is highly challenging. Although a number of algorithms have been proposed to estimate the relative counts of subgraphs in OSNs with restricted access, there are only a few works which try to solve a mor...
  • Differentially Private Regression Diagnostics

    Publication Year: 2016, Page(s): 81 - 90

    Linear and logistic regression are popular statistical techniques for analyzing multi-variate data. Typically, analysts do not simply posit a particular form of the regression model, estimate its parameters, and use the results for inference or prediction. Instead, they first use a variety of diagnostic techniques to assess how well the model fits the relationships in the data and how well it can b...
  • Vote-and-Comment: Modeling the Coevolution of User Interactions in Social Voting Web Sites

    Publication Year: 2016, Page(s): 91 - 100
    Cited by:  Papers (1)

    In social voting Web sites, how do the user actions - up-votes, down-votes and comments - evolve over time? Are there relationships between votes and comments? What is normal and what is suspicious? These are the questions we focus on. We analyzed over 20,000 submissions corresponding to more than 100 million user interactions from three social voting Web sites: Reddit, Imgur and Digg. Our first c...
  • On Efficient External-Memory Triangle Listing

    Publication Year: 2016, Page(s): 101 - 110
    Cited by:  Papers (2)

    Discovering triangles in large graphs is a well-studied area; however, both external-memory performance of existing methods and our understanding of the complexity involved leave much room for improvement. To shed light on this problem, we first generalize the existing in-memory algorithms into a single framework of 18 triangle-search techniques. We then develop a novel external-memory approach, w...
  • Efficient Distributed SGD with Variance Reduction

    Publication Year: 2016, Page(s): 111 - 120

    Stochastic Gradient Descent (SGD) has become one of the most popular optimization methods for training machine learning models on massive datasets. However, SGD suffers from two main drawbacks: (i) The noisy gradient updates have high variance, which slows down convergence as the iterates approach the optimum, and (ii) SGD scales poorly in distributed settings, typically experiencing rapidly decre...
  • Triply Stochastic Variational Inference for Non-linear Beta Process Factor Analysis

    Publication Year: 2016, Page(s): 121 - 130

    We propose a non-linear extension to factor analysis with beta process priors for improved data representation ability. This non-linear Beta Process Factor Analysis (nBPFA) allows data to be represented as a non-linear transformation of a standard sparse factor decomposition. We develop a scalable variational inference framework, which builds upon the ideas of the variational auto-encoder, by allo...
  • Beyond Points and Paths: Counting Private Bodies

    Publication Year: 2016, Page(s): 131 - 140

    Mining of spatial data is an enabling technology for mobile services, Internet-connected cars, and the Internet of Things. But the very distinctiveness of spatial data that drives utility comes at the cost of user privacy. In this work, we continue the tradition of privacy-preserving spatial analytics, focusing not on point or path data, but on planar spatial regions. Such data represents the are...
  • Efficient Rectangular Maximal-Volume Algorithm for Rating Elicitation in Collaborative Filtering

    Publication Year: 2016, Page(s): 141 - 150
    Cited by:  Papers (1)

    The cold start problem in Collaborative Filtering can be solved by asking new users to rate a small seed set of representative items or by asking representative users to rate a new item. The question is how to build a seed set that can give enough preference information for making good recommendations. One of the most successful approaches, called Representative Based Matrix Factorization, is based on...