Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on

1-4 Nov. 2004

  • [Cover page]

    Publication Year: 2004, Page(s): c1
  • [Title page]

    Publication Year: 2004, Page(s):i - iv
  • Table of contents

    Publication Year: 2004, Page(s):v - xii
  • Welcome to ICDM 2004

    Publication Year: 2004, Page(s):xiii - xiv
  • Conference organization

    Publication Year: 2004, Page(s): xv
  • Steering Committee

    Publication Year: 2004, Page(s): xvi
  • Program Committee

    Publication Year: 2004, Page(s):xvii - xix
  • Non-PC reviewers

    Publication Year: 2004, Page(s):xx - xxi
  • Invited talks [breaker page]

    Publication Year: 2004, Page(s): 579
  • Tutorials

    Publication Year: 2004, Page(s): 580

    Provides an abstract for each of the tutorial presentations and a brief professional biography of each presenter. The complete presentations were not made available for publication as part of the conference proceedings.
  • Workshops

    Publication Year: 2004, Page(s): 581
  • Detection of significant sets of episodes in event sequences

    Publication Year: 2004, Page(s):3 - 10
    Cited by:  Papers (12)

    We present a method for a reliable detection of "unusual" sets of episodes in the form of many pattern sequences, scanned simultaneously for an occurrence as a subsequence in a large event stream within a window of size w. We also investigate the important special case of all permutations of the same sequence, which models the situation where the order of events in an episode does not matter, e.g....
  • Subspace selection for clustering high-dimensional data

    Publication Year: 2004, Page(s):11 - 18
    Cited by:  Papers (13)  |  Patents (2)

    In high-dimensional feature spaces traditional clustering algorithms tend to break down in terms of efficiency and quality. Nevertheless, the data sets often contain clusters which are hidden in various subspaces of the original feature space. In this paper, we present a feature selection technique called SURFING (subspaces relevant for clustering) that finds all subspaces interesting for clusteri...
  • Multi-view clustering

    Publication Year: 2004, Page(s):19 - 26
    Cited by:  Papers (113)

    We consider clustering problems in which the available attributes can be split into two independent subsets, such that either subset suffices for learning. Example applications of this multi-view setting include clustering of Web pages which have an intrinsic view (the pages themselves) and an extrinsic view (e.g., anchor texts of inbound hyperlinks); multi-view learning has so far been studied in...
  • Density connected clustering with local subspace preferences

    Publication Year: 2004, Page(s):27 - 34
    Cited by:  Papers (30)

    Many clustering algorithms tend to break down in high-dimensional feature spaces, because the clusters often exist only in specific subspaces (attribute subsets) of the original feature space. Therefore, the task of projected clustering (or subspace clustering) has been defined recently. As a solution to tackle this problem, we propose the concept of local subspace preferences, which captures the ...
  • On closed constrained frequent pattern mining

    Publication Year: 2004, Page(s):35 - 42
    Cited by:  Papers (16)

    Constrained frequent patterns and closed frequent patterns are two paradigms aimed at reducing the set of extracted patterns to a smaller, more interesting, subset. Although a lot of work has been done with both these paradigms, there is still confusion around the mining problem obtained by joining closed and constrained frequent patterns in a unique framework. In this paper, we shed light on this...
  • Efficient density-based clustering of complex objects

    Publication Year: 2004, Page(s):43 - 50
    Cited by:  Papers (8)

    Nowadays, data mining in large databases of complex objects from scientific, engineering or multimedia applications is getting more and more important. In many different application domains, complex object representations along with complex distance functions are used for measuring the similarity between objects. Often, not only these complex distance measures are available but also simpler distan...
  • Test-cost sensitive naive Bayes classification

    Publication Year: 2004, Page(s):51 - 58
    Cited by:  Papers (13)

    Inductive learning techniques such as the naive Bayes and decision tree algorithms have been extended in the past to handle different types of costs mainly by distinguishing different costs of classification errors. However, it is an equally important issue to consider how to handle the test costs associated with querying the missing values in a test case. When the value of an attribute is missing...
  • Moment: maintaining closed frequent itemsets over a stream sliding window

    Publication Year: 2004, Page(s):59 - 66
    Cited by:  Papers (10)

    This paper considers the problem of mining closed frequent itemsets over a sliding window using limited memory space. We design a synopsis data structure to monitor transactions in the sliding window so that we can output the current closed frequent itemsets at any time. Due to time and memory constraints, the synopsis data structure cannot monitor all possible itemsets. However, monitoring only f...
  • Communication efficient construction of decision trees over heterogeneously distributed data

    Publication Year: 2004, Page(s):67 - 74
    Cited by:  Papers (11)

    We present an algorithm designed to efficiently construct a decision tree over heterogeneously distributed data without centralizing. We compare our algorithm against a standard centralized decision tree implementation in terms of accuracy as well as the communication complexity. Our experimental results show that by using only 20% of the communication cost necessary to centralize the data we can ...
  • Non-redundant data clustering

    Publication Year: 2004, Page(s):75 - 82
    Cited by:  Papers (19)

    Data clustering is a popular approach for automatically finding classes, concepts, or groups of patterns. In practice, this discovery process should avoid redundancies with existing knowledge about class structures or groupings, and reveal novel, previously unknown aspects of the data. In order to deal with this problem, we present an extension of the information bottleneck framework, called coord...
  • Fast and exact out-of-core k-means clustering

    Publication Year: 2004, Page(s):83 - 90
    Cited by:  Papers (15)

    Clustering has been one of the most widely studied topics in data mining and k-means clustering has been one of the popular clustering algorithms. K-means requires several passes on the entire dataset, which can make it very expensive for large disk-resident datasets. In view of this, a lot of work has been done on various approximate versions of k-means, which require only one or a small number o...
  • Mining frequent itemsets from secondary memory

    Publication Year: 2004, Page(s):91 - 98
    Cited by:  Papers (10)

    Mining frequent itemsets is at the core of mining association rules, and is by now quite well understood algorithmically for main memory databases. In this paper, we investigate approaches to mining frequent itemsets when the database or the data structures used in the mining are too large to fit in main memory. Experimental results show that our techniques reduce the required disk accesses by ord...
  • A Bayesian framework for regularized SVM parameter estimation

    Publication Year: 2004, Page(s):99 - 105
    Cited by:  Papers (1)

    The support vector machine (SVM) is considered here in the context of pattern classification. The emphasis is on the soft margin classifier which uses regularization to handle non-separable learning samples. We present an SVM parameter estimation algorithm that first identifies a subset of the learning samples that we call the support set and then determines not only the weights of the classifier ...
  • Unimodal segmentation of sequences

    Publication Year: 2004, Page(s):106 - 113
    Cited by:  Papers (1)

    We study the problem of segmenting a sequence into k pieces so that the resulting segmentation satisfies monotonicity or unimodality constraints. Unimodal functions can be used to model phenomena in which a measured variable first increases to a certain level and then decreases. We combine a well-known unimodal regression algorithm with a simple dynamic-programming approach to obtain an optimal qu...