2010 IEEE International Conference on Data Mining

Date 13-17 Dec. 2010

  • [Front cover]

    Publication Year: 2010, Page(s): C1
  • [Title page i]

    Publication Year: 2010, Page(s): i
  • [Title page iii]

    Publication Year: 2010, Page(s): iii
  • [Copyright notice]

    Publication Year: 2010, Page(s): iv
  • Table of contents

    Publication Year: 2010, Page(s):v - xv
  • Welcome message from the Conference Chairs

    Publication Year: 2010, Page(s):xv - xvi
  • Message from the Program Committee Co-Chairs

    Publication Year: 2010, Page(s):xvii - xviii
  • Organizing Committee

    Publication Year: 2010, Page(s):xix - xxi
  • Program Committee

    Publication Year: 2010, Page(s):xxii - xxviii
  • Mining Billion-node Graphs: Patterns, Generators and Tools

    Publication Year: 2010, Page(s): 5

    What do graphs look like? How do they evolve over time? How to handle a graph with a billion nodes? We present a comprehensive list of static and temporal laws, and some recent observations on real graphs (e.g., "eigenSpokes"). For generators, we describe some recent ones, which naturally match all of the known properties of real graphs. Finally, for tools, we present "oddball" for discovering ano...
  • Assessing the Significance of Groups in High-Dimensional Data

    Publication Year: 2010, Page(s): 6
  • 10 Years of Data Mining Research: Retrospect and Prospect

    Publication Year: 2010, Page(s): 7
    Cited by:  Papers (1)
  • Detecting Novel Discrepancies in Communication Networks

    Publication Year: 2010, Page(s):8 - 17
    Cited by:  Papers (5)

    We address the problem of detecting characteristic patterns in communication networks. We introduce a scalable approach based on set-system discrepancy. By implicitly labeling each network edge with the sequence of times in which its two endpoints communicate, we view an entire communication network as a set-system. This view allows us to use combinatorial discrepancy as a mechanism to "observe" s...
  • Multi-agent Random Walks for Local Clustering on Graphs

    Publication Year: 2010, Page(s):18 - 27
    Cited by:  Papers (27)

    We consider the problem of local graph clustering where the aim is to discover the local cluster corresponding to a point of interest. The most popular algorithms to solve this problem start a random walk at the point of interest and let it run until some stopping criterion is met. The vertices visited are then considered the local cluster. We suggest a more powerful alternative, the multi-agent r...
  • Spatiotemporal Event Detection in Mobility Network

    Publication Year: 2010, Page(s):28 - 37
    Cited by:  Papers (2)

    Learning and identifying events in network traffic is crucial for service providers to improve their mobility network performance. In fact, large special events attract cell phone users to relatively small areas, which causes a sudden surge in network traffic. To handle such increased load, it is necessary to measure the increased network traffic and quantify the impact of the events, so that relevant...
  • An Unsupervised Approach to Modeling Personalized Contexts of Mobile Users

    Publication Year: 2010, Page(s):38 - 47
    Cited by:  Papers (9)  |  Patents (1)

    Mobile context modeling is a process of recognizing and reasoning about contexts and situations in a mobile environment, which is critical for the success of context-aware mobile services. While there is prior work on mobile context modeling, the use of unsupervised learning techniques for mobile context modeling is still under-explored. Indeed, unsupervised techniques have the ability to learn p...
  • Fast and Flexible Multivariate Time Series Subsequence Search

    Publication Year: 2010, Page(s):48 - 57
    Cited by:  Papers (2)

    Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical monitoring, and financial systems. Domain experts are often interested in searching for interesting multivariate patterns from these MTS databases which can contain up to several gigabytes of data. Surprisingly, research on MTS search is...
  • iSAX 2.0: Indexing and Mining One Billion Time Series

    Publication Year: 2010, Page(s):58 - 67
    Cited by:  Papers (21)

    There is an increasingly pressing need, by several applications in diverse domains, for developing techniques able to index and mine very large collections of time series. Examples of such applications come from astronomy, biology, the web, and other domains. It is not unusual for these applications to involve numbers of time series in the order of hundreds of millions to billions. However, all re...
  • Abstraction Augmented Markov Models

    Publication Year: 2010, Page(s):68 - 77

    High accuracy sequence classification often requires the use of higher order Markov models (MMs). However, the number of MM parameters increases exponentially with the range of direct dependencies between sequence elements, thereby increasing the risk of overfitting when the data set is limited in size. We present abstraction augmented Markov models (AAMMs) that effectively reduce the number of n...
  • A Graph-Based Approach for Multi-folder Email Classification

    Publication Year: 2010, Page(s):78 - 87
    Cited by:  Papers (2)

    This paper presents a novel framework for multi-folder email classification using graph mining as the underlying technique. Although several techniques exist (e.g., SVM, TF-IDF, n-gram) for addressing this problem in a delimited context, they heavily rely on extracting high-frequency keywords, thus ignoring the inherent structural aspects of an email (or document in general) which can play a criti...
  • Scalable Influence Maximization in Social Networks under the Linear Threshold Model

    Publication Year: 2010, Page(s):88 - 97
    Cited by:  Papers (90)

    Influence maximization is the problem of finding a small set of most influential nodes in a social network so that their aggregated influence in the network is maximized. In this paper, we study influence maximization in the linear threshold model, one of the important models formalizing the behavior of influence propagation in social networks. We first show that computing exact influence in gener...
  • CLUSMASTER: A Clustering Approach for Sampling Data Streams in Sensor Networks

    Publication Year: 2010, Page(s):98 - 107

    The growing usage of embedded devices and sensors in our daily lives has been profoundly reshaping the way we interact with our environment and our peers. As more and more sensors pervade our future cities, increasingly efficient infrastructures to collect, process, and store massive amounts of data streams from a wide variety of sources will be required. Despite the different application-spe...
  • Bayesian Maximum Margin Clustering

    Publication Year: 2010, Page(s):108 - 117

    Most well-known discriminative clustering models, such as spectral clustering (SC) and maximum margin clustering (MMC), are non-Bayesian. Moreover, they consider only embedding domain-dependent prior knowledge into data-specific kernels, while other forms of prior knowledge are seldom considered in these models. In this paper, we propose a Bayesian maximum margin clustering model (BMMC) based ...
  • Viral Marketing for Multiple Products

    Publication Year: 2010, Page(s):118 - 127
    Cited by:  Papers (14)

    Viral Marketing, the idea of exploiting social interactions of users to propagate awareness for products, has gained considerable focus in recent years. One of the key issues in this area is to select the best seeds that maximize the influence propagated in the social network. In this paper, we define the seed selection problem (called t-Influence Maximization, or t-IM) for multiple products. Spec...
  • Finding Local Anomalies in Very High Dimensional Space

    Publication Year: 2010, Page(s):128 - 137
    Cited by:  Papers (11)

    Time, cost and energy efficiency are critical factors for many data analysis techniques when the size and dimensionality of data are very large. We investigate the use of Local Outlier Factor (LOF) for data of this type, providing a motivating example from real world data. We propose Projection-Indexed Nearest-Neighbours (PINN), a novel technique that exploits extended nearest neighbour sets in the...