Proceedings. 20th International Conference on Data Engineering

30 March - 2 April 2004

  • Proceedings 20th International Conference on Data Engineering

    Publication Year: 2004, Page(s): 0_1
  • Dedicated to the memory of Professor Yahiko Kambayashi 1943-2004

    Publication Year: 2004, Page(s): 0_2
  • Proceedings. 20th International Conference on Data Engineering

    Publication Year: 2004
  • Copyright page

    Publication Year: 2004, Page(s): 0_4
  • Table of contents

    Publication Year: 2004, Page(s):0_5 - 0_12
  • Message from the general chairs and program chairs

    Publication Year: 2004, Page(s): 0_13
  • Conference officers

    Publication Year: 2004, Page(s): 0_14
  • Program Committee

    Publication Year: 2004, Page(s):0_15 - 0_18
  • External reviewers

    Publication Year: 2004, Page(s):0_19 - 0_20
  • Plenary session: Can a semantic web for life sciences improve drug discovery?

    Publication Year: 2004, Page(s): 2
  • Plenary session: driving forces in database technology

    Publication Year: 2004
    Cited by: Patents (1)

    Several forces, with impacts so fundamental that they are akin to tectonic plate movements, are driving the commercial database marketplace. First is hardware commoditization: arrays of low priced computers with high speed interconnects which yield the new cluster based computing capabilities referred to as 'grid,' 'utility,' and 'on-demand' computing, at price points radically lower than standard...
  • Plenary session: enabling communities of knowledge workers

    Publication Year: 2004
  • LDC: enabling search by partial distance in a hyper-dimensional space

    Publication Year: 2004, Page(s):6 - 17
    Cited by: Papers (15) | Patents (2)

    Recent advances in research fields like multimedia and bioinformatics have brought about a new generation of hyper-dimensional databases which can contain hundreds or even thousands of dimensions. Such hyper-dimensional databases pose significant problems to existing high-dimensional indexing techniques which have been developed for indexing databases with (commonly) less than a hundred dimensions...
  • Simple, robust and highly concurrent B-trees with node deletion

    Publication Year: 2004, Page(s):18 - 27
    Cited by: Papers (2) | Patents (2)

    Why might B-tree concurrency control still be interesting? For two reasons: (i) currently exploited "real world" approaches are complicated; (ii) simpler proposals are not used because they are not sufficiently robust. In the "real world", systems need to deal robustly with node deletion, and this is an important reason why the currently exploited techniques are complicated. In our effort to simpl...
  • Bulk operations for space-partitioning trees

    Publication Year: 2004, Page(s):29 - 40
    Cited by: Papers (9) | Patents (2)

    The emergence of extensible index structures, e.g., GiST (generalized search tree) [J.M. Hellerstein et al. (1995)] and SP-GiST (space-partitioning generalized search tree) [W. G. Aref et al. (2001)], calls for a set of extensible algorithms to support different operations (e.g., insertion, deletion, and search). Extensible bulk operations (e.g., bulk loading and bulk insertion) are of the same im...
  • Recursive XML schemas, recursive XML queries, and relational storage: XML-to-SQL query translation

    Publication Year: 2004, Page(s):42 - 53
    Cited by: Papers (14)

    We consider the problem of translating XML queries into SQL when XML documents have been stored in an RDBMS using a schema-based relational decomposition. Surprisingly, there is no published XML-to-SQL query translation algorithm for this scenario that handles recursive XML schemas. We present a generic algorithm to translate path expression queries into SQL in the presence of recursion in the sch...
  • A succinct physical storage scheme for efficient evaluation of path queries in XML

    Publication Year: 2004, Page(s):54 - 65
    Cited by: Papers (19)

    Path expressions are ubiquitous in XML processing languages. Existing approaches evaluate a path expression by selecting nodes that satisfy the tag-name and value constraints and then joining them according to the structural constraints. We propose a novel approach, next-of-kin (NoK) pattern matching, to speed up the node-selection step, and to reduce the join size significantly in the second st...
  • A prime number labeling scheme for dynamic ordered XML trees

    Publication Year: 2004, Page(s):66 - 78
    Cited by: Papers (56) | Patents (13)

    Efficient evaluation of XML queries requires the determination of whether a relationship exists between two elements. A number of labeling schemes have been designed to label the element nodes such that the relationships between nodes can be easily determined by comparing their labels. With the increased popularity of XML on the Web, finding a labeling scheme that is able to support order-sensitiv...
  • BIDE: efficient mining of frequent closed sequences

    Publication Year: 2004, Page(s):79 - 90
    Cited by: Papers (159) | Patents (3)

    Previous studies have presented convincing arguments that a frequent pattern mining algorithm should not mine all frequent patterns but only the closed ones, because the latter leads not only to a more compact yet complete result set but also to better efficiency. However, most of the previously developed closed pattern mining algorithms work under the candidate maintenance-and-test paradigm which is in...
  • Mining frequent labeled and partially labeled graph patterns

    Publication Year: 2004, Page(s):91 - 102
    Cited by: Papers (5)

    Whereas data mining in structured data focuses on frequent data values, in semistructured and graph data the emphasis is on frequent labels and common topologies. Here, the structure of the data is just as important as its content. When data contains a large number of different labels, both fully labeled and partially labeled data may be useful. More informative patterns can be found in the database...
  • Probe, cluster, and discover: focused extraction of QA-Pagelets from the deep Web

    Publication Year: 2004, Page(s):103 - 114
    Cited by: Papers (6) | Patents (1)

    We introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient mining system for discovering and extracting QA-Pagelets from the deep Web. A unique feature of THOR is its two-phase extraction framework. In the first phase, pages from a deep Web site are grouped into distinct clusters of structurally-s...
  • Improving hash join performance through prefetching

    Publication Year: 2004, Page(s):116 - 127
    Cited by: Papers (20) | Patents (3)

    Hash join algorithms suffer from extensive CPU cache stalls. We show that the standard hash join algorithm for disk-oriented databases (i.e. GRACE) spends over 73% of its user time stalled on CPU cache misses, and explore the use of prefetching to improve its cache performance. Applying prefetching to hash joins is complicated by the data dependencies, multiple code paths, and inherent randomness ...
  • Go green: recycle and reuse frequent patterns

    Publication Year: 2004, Page(s):128 - 139
    Cited by: Papers (4)

    In constrained data mining, users can specify constraints to prune the search space to avoid mining uninteresting knowledge. This is typically done by specifying some initial values of the constraints that are subsequently refined iteratively until satisfactory results are obtained. Existing mining schemes treat each iteration as a distinct mining process, and fail to exploit the information gener...
  • Approximate selection queries over imprecise data

    Publication Year: 2004, Page(s):140 - 151
    Cited by: Papers (7)

    We examine the problem of evaluating selection queries over imprecisely represented objects. Such objects are used either because they are much smaller in size than the precise ones (e.g., compressed versions of time series), or as imprecise replicas of fast-changing objects across the network (e.g., interval approximations for time-varying sensor readings). It may be impossible to determine wheth...
  • Improved file synchronization techniques for maintaining large replicated collections over slow networks

    Publication Year: 2004, Page(s):153 - 164
    Cited by: Papers (16) | Patents (22)

    We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of important applications, such as synchronization of data between accounts or devices, content distribution and Web caching networks, Web site mirroring, storage networks, and large scale Web search and mining. At the core of t...