Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06)

18-22 Dec. 2006

  • Sixth IEEE International Conference on Data Mining - Workshops - Cover

    Publication Year: 2006, Page(s): c1
    Freely Available from IEEE
  • Sixth IEEE International Conference on Data Mining Workshops - Title

    Publication Year: 2006, Page(s):i - iii
    Freely Available from IEEE
  • Sixth IEEE International Conference on Data Mining Workshops - Copyright

    Publication Year: 2006, Page(s): iv
    Freely Available from IEEE
  • Sixth IEEE International Conference on Data Mining Workshops - Table of contents

    Publication Year: 2006, Page(s):v - xvi
    Freely Available from IEEE
  • Preface

    Publication Year: 2006, Page(s):xvii - xix
    Freely Available from IEEE
  • Workshops Organizations

    Publication Year: 2006, Page(s): xx
    Freely Available from IEEE
  • Mining Frequent Induced Subtree Patterns with Subtree-Constraint

    Publication Year: 2006, Page(s):3 - 7
    Cited by:  Papers (1)

    Mining frequent induced subtree patterns is very useful in domains such as XML databases and Web log analysis. However, because of the combinatorial explosion, mining all frequent subtree patterns becomes infeasible for a large and dense tree database. Too many frequent subtree patterns also confuse users. Usually only a small set of the mining results can arouse users' interest. In this paper,...
  • Razor: mining distance-constrained embedded subtrees

    Publication Year: 2006, Page(s):8 - 13
    Cited by:  Papers (4)

    Our work is focused on the task of mining frequent subtrees from a database of rooted ordered labeled subtrees. Previously we have developed an efficient algorithm, MB3 (Tan et al., 2005), for mining frequent embedded subtrees from a database of rooted labeled and ordered subtrees. The efficiency comes from the utilization of a novel embedding list representation for tree model guided (TMG) candid...
  • Mining Closed and Maximal Frequent Induced Free Subtrees

    Publication Year: 2006, Page(s):14 - 18

    Mining frequent tree patterns is an important problem, since tree structures are used in various fields such as computational biology, XML databases, and so on. However, mining all frequent subtrees is sometimes infeasible because of the combinatorial explosion. In this paper, by combining an efficient algorithm for enumerating free trees and the pruning techniques for mining closed and maximal ro...
  • Automatic Keyword Extraction Using Linguistic Features

    Publication Year: 2006, Page(s):19 - 23
    Cited by:  Papers (2)

    This paper describes a novel keyword extraction algorithm, position weight (PW), that utilizes linguistic features to represent the importance of the word position in a document. Topical terms and their previous-term and next-term co-occurrence collections are extracted. To measure the degree of correlation between a topical term and its co-occurrence terms, three methods are employed including term...
  • A Semi-Structured representation for Knowledge Discovering using Remote Sensing Images

    Publication Year: 2006, Page(s):24 - 28

    In this paper we describe the basic functionalities of a system dedicated to processing high-resolution satellite images and handling them through (semi-) structured descriptors. These descriptors make it possible to manage, in a unified representation, two families of features extracted from the objects identified by image segmentation: the attributes characterizing each object, and the attributes characterizi...
  • NameIt: Extraction of product names

    Publication Year: 2006, Page(s):29 - 33
    Cited by:  Papers (2)

    An important precondition for the semantic Web is to identify and annotate entities, their names, and their descriptions in the Web. In particular, the Web contains numerous Web pages describing various entities. In this paper we present a method for unsupervised generation of identities (i.e. product names) based on a set of concept instance describing Web pages. We exploit the redundancy of desc...
  • The Role of Domain Ontology in Text Mining Applications: The ADDMiner Project

    Publication Year: 2006, Page(s):34 - 38

    Extracting insights from large text collections is an aspiration of any organization aiming to take advantage of its experience, which is generally documented in textual documents. Textual documents, either digital or not, have been the most common form in which to register any organization's transactions. Free text style is a very easy way to input data since it does not require any special training from users. On the ot...
  • Enhancing Text Retrieval Performance using Conceptual Ontological Graph

    Publication Year: 2006, Page(s):39 - 44
    Cited by:  Papers (4)

    Most of the data representation techniques are based on word and/or phrase analysis of the text. The statistical analysis of a term (word or phrase) frequency captures the importance of the term within a document. However, to achieve a more accurate analysis, the underlying data representation should indicate terms that capture the semantics of the text from which the importance of a term in a sen...
  • Unsupervised Learning of Tree Alignment Models for Information Extraction

    Publication Year: 2006, Page(s):45 - 49
    Cited by:  Papers (3)

    We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table - a data structure that better lends itself to high-level data mining and information exploitation. Our algorithm effectively combines tree and string alignment algorithms, as well as domain-specific feature extraction to match semantically related data across search results. The...
  • Clustering Workflow Requirements Using Compression Dissimilarity Measure

    Publication Year: 2006, Page(s):50 - 54
    Cited by:  Papers (2)  |  Patents (1)

    Xerox offers a bewildering array of printers and software configurations to satisfy the needs of production print shops. A configuration tool in the hands of sales analysts elicits requirements from customers and recommends a list of product configurations. This tool generates special question and answer case logs that provide useful historical data. Given the unusual semi-structured question and a...
  • Reducing the Frequent Pattern Set

    Publication Year: 2006, Page(s):55 - 59

    One of the major problems in frequent pattern mining is the explosion of the number of results, making it difficult to identify the interesting frequent patterns. In a recent paper we have shown that an MDL-based approach gives a dramatic reduction of the number of frequent item sets to consider. Here we show that MDL gives similarly good reductions for frequent patterns on other types of data, vi...
  • Improving the Results and Performance of Clustering Bit-encoded XML Documents

    Publication Year: 2006, Page(s):60 - 64

    Clustering XML documents according to their structure is one of the techniques that may improve the effectiveness of XML document storage and retrieval. One of the existing approaches to this problem is to encode XML document structure as a string of bits and cluster such feature vectors. High dimensionality and sparseness of the feature vectors are the weaknesses of this method. The paper presents f...
  • A New Algorithm for Mining Fuzzy Association Rules in the Large Databases Based on Ontology

    Publication Year: 2006, Page(s):65 - 69
    Cited by:  Papers (4)

    Association rule mining is an active data mining research area. Recent years have witnessed many efforts on discovering fuzzy associations. The key strength of fuzzy association rule mining is its completeness. This strength, however, comes with a major drawback in handling large datasets. It often produces a huge number of candidate itemsets. The huge number of candidate itemsets makes it ineffecti...
  • Extracting Variable Knowledge from Multiversioned XML Documents

    Publication Year: 2006, Page(s):70 - 74

    The growing research interest in the XML data warehousing and XML mining areas during the last few years was driven by the wider use of XML to represent semi-structured data and to exchange information between different types of applications. A large number of techniques have been developed to mine interesting knowledge from XML documents, e.g. frequent patterns, association rules, clu...
  • Automatic Construction of N-ary Tree Based Taxonomies

    Publication Year: 2006, Page(s):75 - 79
    Cited by:  Papers (2)

    Hierarchies are an intuitive and effective organization paradigm for data. Of late there has been considerable research on automatically learning hierarchical organizations of data. In this paper, we explore the problem of learning n-ary tree based hierarchies of categories with no user-defined parameters. We propose a framework that characterizes a "good" taxonomy and also provide an algorithm to...
  • Concept-Aware Ranking: Teaching an Old Graph New Moves

    Publication Year: 2006, Page(s):80 - 88
    Cited by:  Papers (1)

    In ranking algorithms for Web graphs, such as PageRank and HITS, the lack of attention to concepts/topics representing Web page content causes problems such as topic drift and mutually reinforcing relationships between hosts. This paper proposes a novel approach to expand the Web graph to incorporate conceptual information encoded by links (anchor text) between Web pages. Using Web graph link stru...
  • Hierarchical Density Shaving: A clustering and visualization framework for large biological datasets

    Publication Year: 2006, Page(s):89 - 93
    Cited by:  Papers (5)

    In many clustering applications for bioinformatics, only part of the data clusters into one or more groups while the rest needs to be pruned. For such situations, we present hierarchical density shaving (HDS), a framework that consists of a fast, hierarchical, density-based clustering algorithm. Our framework also provides a simple yet powerful 2D visualization of the hierarchy of clusters that ca...
  • Discovering Frequent Poly-Regions in DNA Sequences

    Publication Year: 2006, Page(s):94 - 98
    Cited by:  Papers (1)

    The problem of discovering arrangements of regions of high occurrence of one or more items of a given alphabet in a sequence is studied, and two efficient algorithms are proposed. The first one is entropy-based and uses an existing recursive segmentation technique to split the input sequence into a set of homogeneous segments. The key idea of the second approach is to use a set of sliding windows ...
  • Sparse Logistic Classifiers for Interpretable Protein Homology Detection

    Publication Year: 2006, Page(s):99 - 103

    Computational classification of proteins using methods such as string kernels and Fisher-SVM has demonstrated great success. However, the resulting models do not offer an immediate interpretation of the underlying biological mechanisms. In particular, some recent studies have postulated that a small subset of positions and residues in protein sequences may be sufficient to discriminate...