Database Systems for Advanced Applications, 2003. (DASFAA 2003). Proceedings. Eighth International Conference on

Date 26-28 March 2003

Displaying Results 1 - 25 of 43
  • Proceedings Eighth International Conference on Database Systems for Advanced Applications

    Publication Year: 2003
    Freely Available from IEEE
  • A survey of new directions in database systems

    Publication Year: 2003, Page(s): 3
    Cited by:  Papers (90)

    Summary form only given, as follows. As database system research evolves, there are several enduring themes. One, of course, is how we deal with the largest possible amounts of data. A less obvious theme is optimization: it is an essential ingredient of all modern forms of database system. Because we deal with large volumes of data, we are often forced to process that data in regular ways. But w...
  • Similarity join for low- and high-dimensional data

    Publication Year: 2003, Page(s):7 - 16
    Cited by:  Papers (6)

    The efficient processing of similarity joins is important for a large class of applications. The dimensionality of the data for these applications ranges from low to high. Most existing methods have focussed on the execution of high-dimensional joins over large amounts of disk-based data. The increasing sizes of main memory available on current computers, and the need for efficient processing of s...
  • Spatial query processing for high resolutions

    Publication Year: 2003, Page(s):17 - 26
    Cited by:  Papers (1)

    Modern database applications including computer-aided design (CAD), medical imaging, or molecular biology impose new requirements on spatial query processing. Particular problems arise from the need for high resolutions for very large spatial objects, including cars, space stations, planes and industrial plants, and from the design goal to use general purpose database management systems in order to...
  • Effective similarity search on voxelized CAD objects

    Publication Year: 2003, Page(s):27 - 36
    Cited by:  Papers (6)  |  Patents (1)

    Similarity search in database systems is becoming an increasingly important task in modern application domains such as multimedia, molecular biology, medical imaging and many others. Especially for CAD applications, suitable similarity models and a clear representation of the results can help to reduce the cost of developing and producing new parts by maximizing the reuse of existing parts. In thi...
  • Discovering direct and indirect matches for schema elements

    Publication Year: 2003, Page(s):39 - 46
    Cited by:  Papers (4)  |  Patents (3)

    Automating schema matching is challenging. Previous approaches to automating schema matching focus on computing direct element matches between two schemas. Schemas, however, rarely match directly. Thus, to complete the task of schema matching, we must also compute indirect element matches. In this paper we present a framework for generating direct as well as many indirect element matches between a ...
  • Gangam: a transformation modeling framework

    Publication Year: 2003, Page(s):47 - 54
    Cited by:  Papers (1)

    Integration of multiple heterogeneous data sources continues to be a critical problem for many application domains and a challenge for researchers worldwide. One aspect of integration is the translation of schema and data across data model boundaries. Researchers in the past have looked at both customized algorithmic approaches and generic meta-modeling approaches as viable solutions. We n...
  • Securing your data in agent-based P2P systems

    Publication Year: 2003, Page(s):55 - 62
    Cited by:  Papers (5)

    Peer-to-peer (P2P) technology can be naturally integrated with mobile agent technology in Internet applications, taking advantage of the autonomy, mobility, and efficiency of mobile agents in accessing and processing data. We address the problem of protecting critical information in agent-based P2P Internet applications under two different scenarios. First, we assume the route of a mobile agent in...
  • Ascending frequency ordered prefix-tree: efficient mining of frequent patterns

    Publication Year: 2003, Page(s):65 - 72
    Cited by:  Papers (12)  |  Patents (1)

    Mining frequent patterns is a fundamental and important problem in many data mining applications. Many of the algorithms adopt the pattern growth approach, which has been shown to be significantly superior to the candidate generate-and-test approach. We identify the key factors that influence the performance of the pattern growth approach, and optimize them to further improve the performance. Our algori...
  • An efficient sliding window algorithm for detection of sequential patterns

    Publication Year: 2003, Page(s):73 - 80
    Cited by:  Papers (4)

    Recently, a growing number of applications monitor the physical world by tracking sensor data and detecting values, trends or patterns of interest. We focus on the problem of detecting sequential patterns with complex predicates over sensor data, and present an algorithm that efficiently pre-computes which pattern predicates' checks can be skipped at query compile-time, so that the processing windo...
  • Caucus-based transaction clustering

    Publication Year: 2003, Page(s):81 - 88

    Transaction clustering has received attention in recent developments of data mining. Traditional clustering methods are not well suited to this problem. Transaction data sets differ from traditional data sets in their high dimensionality, sparsity and numerous outliers. We introduce a new efficient algorithm for transaction clustering. The proposed algorithm is based on a caucus, which...
  • TAX-PQ: dynamic taxonomy probing and query modification for topic-focused Web search

    Publication Year: 2003, Page(s):91 - 100
    Cited by:  Papers (1)

    We propose a novel Web search scheme, TAX-PQ. TAX-PQ enables taxonomy-based topic-focused Web search on ordinary Boolean Web search interfaces. TAX-PQ utilizes a taxonomy and the data set maintained in an existing taxonomy-based search facility for this purpose. The search is initiated by designating an initial query and a context category in the taxonomy. The data set in the taxonomy-based search ...
  • Finding a Web community by maximum flow algorithm with HITS score based capacity

    Publication Year: 2003, Page(s):101 - 106
    Cited by:  Papers (2)  |  Patents (1)

    We propose an edge capacity based on hub and authority scores, and examine the effects of using the edge capacity on the method for extracting Web communities using the maximum flow algorithm proposed by G. Flake et al. (2000). A Web community is a collection of Web pages in which a common (or related) topic is taken up. In recent years, various methods for finding Web communities have been proposed. ...
  • Scalable view expansion in a peer mediator system

    Publication Year: 2003, Page(s):107 - 116
    Cited by:  Papers (1)  |  Patents (1)

    To integrate many data sources we use a peer mediator framework where views defined in the peers are logically composed in terms of each other. A common approach to executing queries over mediators is to treat views in data sources as 'black boxes'. The mediators locally decompose queries into query fragments and submit them to the data sources for processing. Another approach, used in distributed DB...
  • Mining emerging substrings

    Publication Year: 2003, Page(s):119 - 126
    Cited by:  Papers (8)

    We introduce a new type of KDD pattern called emerging substrings. In a sequence database, an emerging substring (ES) of a data class is a substring which occurs more frequently in that class than in other classes. ESs are important to sequence classification as they capture significant contrasts between data classes and provide insights for the construction of sequence classifiers. We pro...
  • Fast text classification: a training-corpus pruning based approach

    Publication Year: 2003, Page(s):127 - 136
    Cited by:  Papers (1)

    With the rapid growth of on-line information available, text classification is becoming more and more important. kNN is a widely used text classification method of high performance. However, this method is inefficient because it requires a large amount of computation for evaluating the similarity between a test document and each training document. In this paper, we propose a fast kNN text classifi...
  • Efficient record linkage in large data sets

    Publication Year: 2003, Page(s):137 - 146
    Cited by:  Papers (13)

    This paper describes an efficient approach to record linkage. Given two lists of records, the record-linkage problem consists of determining all pairs that are similar to each other, where the overall similarity between two records is defined based on domain-specific similarities over individual attributes constituting the record. The record-linkage problem arises naturally in the context of data c...
  • Maintenance of partial-sum-based histograms

    Publication Year: 2003, Page(s):149 - 156

    This paper introduces an efficient method for the maintenance of wavelet-based histograms built on partial sums. Wavelet-based histograms can be constructed from either raw data distributions or partial sums. The two construction methods have their own merits. Previous works have only focused on the maintenance of raw-data-based histograms. However, it is highly inefficient to directly apply their ...
  • Selectivity estimation using orthogonal series

    Publication Year: 2003, Page(s):157 - 164

    Selectivity estimation is an integral part of query optimization. In this paper, we propose a novel approach to approximate data density functions of relations and use them to estimate selectivities. A data density function here is approximated by a partial sum of an orthogonal series. Such approximate density functions can be derived easily, stored efficiently, and maintained dynamically. Experim...
  • Error minimization for approximate computation of range aggregates

    Publication Year: 2003, Page(s):165 - 172

    Histogram techniques are widely used in commercial database management systems for estimating query results. Recently, they have also been used in approximately processing database queries, especially aggregation queries. Existing research results in this area have been mainly focused on constructing a histogram to approximately represent, as accurately as possible on an intuitive basis, the o...
  • Q+Rtree: efficient indexing for moving object databases

    Publication Year: 2003, Page(s):175 - 182
    Cited by:  Papers (1)

    Moving object environments contain large numbers of queries and continuously moving objects. Traditional spatial index structures do not work well in this environment because of the need to frequently update the index, which results in very poor performance. In this paper, we present a novel indexing structure, namely the Q+Rtree, based on the observation that: i) most moving objects are in quasi-s...
  • Efficient index update for moving objects with future trajectories

    Publication Year: 2003, Page(s):183 - 191

    Recently, more research has been conducted on moving object databases (MOD). Typically, there are three kinds of data for dynamic attributes in MOD, i.e., historical, current and future. Although many index structures have been developed for the former two types of data, little work deals with future data. In particular, the problem of index update has not been addressed with effi...
  • Prefetching for visual data exploration

    Publication Year: 2003, Page(s):195 - 202
    Cited by:  Papers (3)

    Modern computer applications, from business decision support to scientific data analysis, utilize data visualization tools to support exploratory activities. Visual exploration tools typically do not scale well when applied to huge data sets, partially because being interactive necessitates real-time responses. However, we observe that interactive visual explorations exhibit several properties tha...
  • Freshness-driven adaptive caching for dynamic content

    Publication Year: 2003, Page(s):203 - 212

    With the wide availability of content delivery networks, many e-commerce Web applications utilize edge cache servers to cache and deliver dynamic content at locations much closer to users, avoiding network latency. By caching a large number of dynamic content pages in the edge cache servers, response time can be reduced, benefiting from higher cache hit rates. However, this is achieved at the expe...
  • Time-stratified sampling for approximate answers to aggregate queries

    Publication Year: 2003, Page(s):215 - 222

    In large data warehousing environments, it is often advantageous to provide fast, approximate answers to complex aggregate queries based on samples. However, uniformly extracted samples often do not guarantee acceptable accuracy in grouping interval estimations. This is crucial in most less-aggregated analyses, which are mostly based on recent data (e.g. forecasting, performance analysis). We prop...