2017 IEEE International Conference on Data Mining (ICDM)

18-21 Nov. 2017

  • [Front cover]

    Publication Year: 2017, Page(s): c1
  • [Title page i]

    Publication Year: 2017, Page(s): i
  • [Title page iii]

    Publication Year: 2017, Page(s): iii
  • [Copyright notice]

    Publication Year: 2017, Page(s): iv
  • Table of contents

    Publication Year: 2017, Page(s): v - xv
  • Message from the Conference Chairs

    Publication Year: 2017, Page(s): xvi - xvii
  • Message from Program Co-Chairs

    Publication Year: 2017, Page(s): xviii - xix
  • Organizing Committee

    Publication Year: 2017, Page(s): xx - xxi
  • Program Committee

    Publication Year: 2017, Page(s): xxii - xxvii
  • Split Miner: Discovering Accurate and Simple Business Process Models from Event Logs

    Publication Year: 2017, Page(s): 1 - 10

    The problem of automated discovery of process models from event logs has been intensively researched in the past two decades. Despite a rich field of proposals, state-of-the-art automated process discovery methods suffer from two recurrent deficiencies when applied to real-life logs: (i) they produce large and spaghetti-like models; and (ii) they produce models that either poorly fit the event log...
  • A Deep Transfer Learning Approach for Improved Post-Traumatic Stress Disorder Diagnosis

    Publication Year: 2017, Page(s): 11 - 20

    Post-traumatic stress disorder (PTSD) is a traumatic-stressor related disorder developed by exposure to a traumatic or adverse environmental event that caused serious harm or injury. Structured interview is the only widely accepted clinical practice for PTSD diagnosis but suffers from several limitations including the stigma associated with the disease. Diagnosis of PTSD patients by analyzing spee...
  • Many Heads are Better than One: Local Community Detection by the Multi-walker Chain

    Publication Year: 2017, Page(s): 21 - 30

    Local community detection (or local clustering) is of fundamental importance in large network analysis. Random walk based methods have been routinely used in this task. Most existing random walk methods are based on the single-walker model. However, without any guidance, a single-walker may not be adequate to effectively capture the local cluster. In this paper, we study a multi-walker chain (MWC)...
  • Knowledge Guided Short-Text Classification for Healthcare Applications

    Publication Year: 2017, Page(s): 31 - 40

    The need for short-text classification arises in many text mining applications, particularly health care applications. In such applications shorter texts mean linguistic ambiguity limits the semantic expression, which in turn would make typical methods fail to capture the exact semantics of the scarce words. This is particularly true in health care domains when the text contains domain-specific or...
  • A Generic Framework for Interesting Subspace Cluster Detection in Multi-attributed Networks

    Publication Year: 2017, Page(s): 41 - 50

    Detection of interesting (e.g., coherent or anomalous) clusters has been studied extensively on plain or univariate networks, with various applications. Recently, algorithms have been extended to networks with multiple attributes for each node in the real-world. In a multi-attributed network, often, a cluster of nodes is only interesting for a subset (subspace) of attributes, and this type of clust...
  • Revisiting Spectral Graph Clustering with Generative Community Models

    Publication Year: 2017, Page(s): 51 - 60

    The methodology of community detection can be divided into two principles: imposing a network model on a given graph, or optimizing a designed objective function. The former provides guarantees on theoretical detectability but falls short when the graph is inconsistent with the underlying model. The latter is model-free but fails to provide quality assurance for the detected communities. In this p...
  • Improving I/O Complexity of Triangle Enumeration

    Publication Year: 2017, Page(s): 61 - 70

    In the age of big data, many graph algorithms are now required to operate in external memory and deliver performance that does not significantly degrade with the scale of the problem. One particular area that frequently deals with graphs larger than RAM is triangle listing, where the algorithms must carefully piece together edges from multiple partitions to detect cycles. In recent literature, two...
  • TensorCast: Forecasting with Context Using Coupled Tensors (Best Paper Award)

    Publication Year: 2017, Page(s): 71 - 80

    Given a heterogeneous social network, can we forecast its future? Can we predict who will start using a given hashtag on Twitter? Can we leverage side information, such as who retweets or follows whom, to improve our membership forecasts? We present TensorCast, a novel method that forecasts time-evolving networks more accurately than current state-of-the-art methods by incorporating multiple data...
  • Situation Aware Multi-task Learning for Traffic Prediction

    Publication Year: 2017, Page(s): 81 - 90

    Due to the recent vast availability of transportation traffic data, major research efforts have been devoted to traffic prediction, which is useful in many applications such as urban planning, traffic management and navigation systems. Current prediction methods that independently train a model per traffic sensor cannot accurately predict traffic in every situation (e.g., rush hours, construction...
  • Large Scale Kernel Methods for Online AUC Maximization

    Publication Year: 2017, Page(s): 91 - 100

    Learning to optimize AUC performance for classifying label imbalanced data in online scenarios has been extensively studied in recent years. Most of the existing work has attempted to address the problem directly in the original feature space, which may not be suitable for non-linearly separable datasets. To solve this issue, some kernel-based learning methods are proposed for non-linearly separable ...
  • A Hyperplane-Based Algorithm for Semi-Supervised Dimension Reduction

    Publication Year: 2017, Page(s): 101 - 110

    We consider the semi-supervised dimension reduction problem: given a high dimensional dataset with a small number of labeled data and a huge number of unlabeled data, the goal is to find the low-dimensional embedding that yields good classification results. Most of the previous algorithms for this task are linkage-based algorithms. They try to enforce the must-link and cannot-link constraints in dim...
  • IterativE Grammar-Based Framework for Discovering Variable-Length Time Series Motifs

    Publication Year: 2017, Page(s): 111 - 116

    In recent years, finding repetitive similar patterns in time series has become a popular problem. These patterns are called time series motifs. Recent studies show that using grammar compression algorithms to find repeating patterns from the symbolized time series holds promise in discovering approximate motifs with variable length. However, grammar compression algorithms are traditionally designe...
  • Matrix Profile VIII: Domain Agnostic Online Semantic Segmentation at Superhuman Performance Levels

    Publication Year: 2017, Page(s): 117 - 126

    Unsupervised semantic segmentation in the time series domain is a much-studied problem due to its potential to detect unexpected regularities and regimes in poorly understood data. However, the current techniques have several shortcomings, which have limited the adoption of time series semantic segmentation beyond academic settings for three primary reasons. First, most methods require setting/lea...
  • Overlapping Community Detection via Constrained PARAFAC: A Divide and Conquer Approach

    Publication Year: 2017, Page(s): 127 - 136

    The task of community detection over complex networks is of paramount importance in a multitude of applications. The present work puts forward a top-to-bottom community identification approach, termed DC-EgoTen, in which an egonet-tensor (EgoTen) based algorithm is developed in a divide-and-conquer (DC) fashion for breaking the network into smaller subgraphs, out of which the underlying communitie...
  • Scalable Algorithms for Locally Low-Rank Matrix Modeling

    Publication Year: 2017, Page(s): 137 - 146

    We consider the problem of modeling data matrices with locally low rank (LLR) structure, a generalization of the popular low rank structure widely used in a variety of real world application domains ranging from medical imaging to recommendation systems. While LLR modeling has been found to be promising in real world application domains, limited progress has been made on the design of scalable alg...
  • A Self-Adaptive Sliding Window Based Topic Model for Non-uniform Texts

    Publication Year: 2017, Page(s): 147 - 156

    The contents generated from different data sources are usually non-uniform, such as long texts produced by news websites and short texts produced by social media. Uncovering topics over large-scale non-uniform texts becomes an important task for analyzing network data. However, the existing methods may fail to recognize the difference between long texts and short texts. To address this problem, we...