2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW)

12-15 Dec. 2016

  • [Front cover]

    Publication Year: 2016, Page(s): c1
  • [Title page i]

    Publication Year: 2016, Page(s): i
  • [Title page iii]

    Publication Year: 2016, Page(s): iii
  • [Copyright notice]

    Publication Year: 2016, Page(s): iv
  • Table of contents

    Publication Year: 2016, Page(s): v - xviii
  • Message from the Conference Chairs

    Publication Year: 2016, Page(s): xix - xx
  • Message from the Workshop Co-Chairs

    Publication Year: 2016, Page(s): xxi - xxii
  • Organizing Committee

    Publication Year: 2016, Page(s): xxiii - xxiv
  • Database Integrated Analytics Using R: Initial Experiences with SQL-Server + R

    Publication Year: 2016, Page(s): 1 - 7

    Most data scientists nowadays use functional or semi-functional languages like SQL, Scala or R to process data obtained directly from databases. This process requires fetching the data, processing it, and storing it again, and it tends to be done outside the DB, often in complex data-flows. Recently, database service providers have decided to integrate "R-as-a-Service" in their DB solutions. The ana...
  • Parallelized Frequent Item Set Mining Using a Tall and Skinny Matrix

    Publication Year: 2016, Page(s): 8 - 13

    Big data applications consist of very large collections of small records, for example data from a retail website, data from movie streaming services, sensor data applications and many other such applications. Frequent item set mining is one of the common tools used for all these applications to generate recommendations that improve the user experience of the website. Frequent item set mining is also used ...
  • A Probabilistic View of Neighborhood-Based Recommendation Methods

    Publication Year: 2016, Page(s): 14 - 20

    The probabilistic graphical model is an elegant framework to compactly present complex real-world observations by modeling uncertainty and logical flow (conditionally independent factors). In this paper, we present a probabilistic framework of neighborhood-based recommendation methods (PNBM) in which similarity is regarded as an unobserved factor. Thus, PNBM leads the estimation of user preference to ma...
  • Discovering Multi-type Correlated Events with Time Series for Exception Detection of Complex Systems

    Publication Year: 2016, Page(s): 21 - 28

    As systems grow more complex, exception detection becomes both more important and more difficult. For most complex systems, such as cloud platforms, exception detection is mainly conducted by analyzing a large amount of telemetry data collected from systems at runtime. Time series data and event data are two major types of telemetry data. Techniques of correlation analysis are important tools that ...
  • Detecting Performance Degradation of Software-Intensive Systems in the Presence of Trends and Long-Range Dependence

    Publication Year: 2016, Page(s): 29 - 36

    As contemporary software-intensive systems reach increasingly large scale, it is imperative that failure detection schemes be developed to help prevent costly system downtimes. A promising direction towards the construction of such schemes is the exploitation of easily available measurements of system performance characteristics such as average number of processed requests and queue size per unit ...
  • Scalable Online-Offline Stream Clustering in Apache Spark

    Publication Year: 2016, Page(s): 37 - 44

    Two of the most popular approaches for dealing with big data are distributed computing and stream mining. In this paper, we incorporate both approaches in order to bring a competitive stream clustering algorithm, namely CluStream, into a modern framework for distributed computing, namely, Apache Spark. CluStream is one of the most popular clustering approaches for stream clustering and the one tha...
  • Distributed Mining and Modeling of Dynamic Lead-Lag Relations in Evolving Entities

    Publication Year: 2016, Page(s): 45 - 52

    Discovering and modeling lead-lag relations is a critical task in a variety of domains, including energy management, financial markets and environment monitoring. This task becomes more challenging when processing massive and highly dynamic data sources, often produced by sensors and live feeds that collect data about evolving entities in the real world. To cope with this data volume and velocity,...
  • Event Detection for Urban Dynamic Data Streams

    Publication Year: 2016, Page(s): 53 - 60

    This paper presents a framework for processing the data generated by Smart City sensors and IoT data streams in real time. The scope of processing is to detect various event patterns from the raw data. The framework is extensible because at any moment new data sources can be registered or new specific event detection mechanisms can be deployed. The framework offers an HTTP interface which can be use...
  • Segmenting Sequences of Node-Labeled Graphs

    Publication Year: 2016, Page(s): 61 - 68

    Detecting changes in the pattern of disease spread over a population network, meme tracking and opinion spread on the Twitter network, and product-rating cascades over a social network are a few of the many embodiments of the graph sequence segmentation problem with labeled nodes. Most of the previous approaches to network sequence segmentation are on plain graphs without considerations for the dy...
  • Inference of Partial Canonical Correlation Networks with Application to Stock Market Portfolio Selection

    Publication Year: 2016, Page(s): 69 - 76

    In recent years, association networks and their applications have received increasing interest. The relationships in a network should ideally be ascertained without any preconceptions about the existence of a connection a priori. This would allow interpretations to be based on the underlying structure rather than on assumptions. Furthermore, a method that discounts outside influence on the relatio...
  • Overlapping Community Detection by Local Decentralised Vertex-Centred Process

    Publication Year: 2016, Page(s): 77 - 84

    This paper focuses on the identification of overlapping communities, allowing nodes to simultaneously belong to several communities, in a decentralised way. To that end, it proposes LOCNeSs, an algorithm specially designed to run in a decentralised environment and to limit propagation, two characteristics essential for application in mobile networks. It is based on the exploitation of the preferentia...
  • Query-Based Evolutionary Graph Cuboid Outlier Detection

    Publication Year: 2016, Page(s): 85 - 92

    Graph-OLAP is an online analytical framework which allows us to obtain various projections of a graph, each of which helps us view the graph along multiple dimensions and multiple levels. Given a series of snapshots of a temporal heterogeneous graph, we aim to find interesting projections of the graph which have anomalous evolutionary behavior. Detecting anomalous projections in a series of such s...
  • Vertex centric asynchronous belief propagation algorithm for large-scale graphs

    Publication Year: 2016, Page(s): 93 - 98

    Inference problems on networks and their algorithms have always been important subjects, but more so now with so much data available and so little time to make sense of it. Common applications range from product recommendation to social networks and protein interaction. One of the main inferences in these types of networks is the guilt-by-association method, where labeled nodes propagate their informat...
  • Text Network Exploration via Heterogeneous Web of Topics

    Publication Year: 2016, Page(s): 99 - 106

    A text network is a data type in which each vertex is associated with a text document and the relationships between documents are represented by edges. The proliferation of text networks such as hyperlinked webpages and academic citation networks has led to an increasing demand for quickly developing a general sense of a new text network, namely text network exploration. In this paper, we address...
  • Classification of Normal and Pathological Brain Networks Based on Similarity in Graph Partitions

    Publication Year: 2016, Page(s): 107 - 112

    We consider the task of classifying normal and pathological brain networks. These networks (called connectomes) represent macroscale connections between predefined brain regions; hence, the nodes of connectomes are uniquely labeled and the set of labels (brain regions) is the same across different brains. We make use of this property and hypothesize that connectomes obtained from normal and patholog...
  • Finding Heaviest k-Subgraphs and Events in Social Media

    Publication Year: 2016, Page(s): 113 - 120

    In recent years, social media have become a useful tool to stay in contact with friends, to share thoughts, and also to be informed about events. Users can follow news channels, but they can also be the ones reporting updates, which distinguishes social media from traditional media. In this paper, we use a graph mining approach for finding events in a graph constructed from users' posts. We d...
  • Knowledge Graph Constraints for Multi-label Graph Classification

    Publication Year: 2016, Page(s): 121 - 127

    Graph classification methods have gained increasing attention in different domains, such as classifying functions of molecules or detection of bugs in software programs. Similarly, predicting events in manufacturing operations data can be compactly modeled as a graph classification problem. Feature representations of graphs are usually found by mining discriminative sub-graph patterns that are non-u...