
Proceedings of the International Database Engineering and Applications Symposium, 2002

17-19 July 2002

Results 1-25 of 30
  • Proceedings International Database Engineering and Applications Symposium

  • Author index

    Page(s): 295
  • Rule termination analysis investigating the interaction between transactions and triggers

    Page(s): 285 - 294

    We introduce a new method for rule termination analysis in active databases. The method analyzes the interaction between transactions and triggers by means of evolution graphs. Trigger information and transaction updates are both taken into account in order to study rule termination and simulate execution. We first present the algorithm for testing rule termination and then show that several existing termination analysis methods are captured by ours. The proposed approach turns out to be practical and general with respect to various rule languages and thus may be applied to many database systems.

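    A minimal sketch of the classic triggering-graph termination test that this method generalizes (the paper's evolution graphs additionally fold in transaction updates), assuming a hypothetical `rules` map from each rule to the set of rules its action may fire; acyclicity of that graph is a sufficient condition for termination:

      from collections import defaultdict

      def may_not_terminate(rules):
          """rules: dict mapping rule name -> set of rules its action may
          trigger. True means the triggering graph has a cycle, so
          termination is not guaranteed; False means every cascade stops."""
          WHITE, GRAY, BLACK = 0, 1, 2
          color = defaultdict(int)          # all rules start WHITE

          def visit(r):
              color[r] = GRAY
              for s in rules.get(r, ()):
                  if color[s] == GRAY:      # back edge: a cycle
                      return True
                  if color[s] == WHITE and visit(s):
                      return True
              color[r] = BLACK
              return False

          return any(color[r] == WHITE and visit(r) for r in list(rules))

      print(may_not_terminate({"r1": {"r2"}, "r2": {"r1"}}))  # True
      print(may_not_terminate({"r1": {"r2"}, "r2": set()}))   # False
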
  • A location dependent benchmark with mobility behavior

    Page(s): 74 - 83

    A location dependent benchmark with mobility behavior is a necessary step in the evolution of wireless and mobile computing technology. The components needed for a mobile computing benchmark include specifications for data, queries, mobile unit behavior and execution guidelines. One of the most distinctive types of queries in the mobile computing environment is the location dependent query, whose result depends on the location from which it is issued. In this paper we describe the main features of a location dependent benchmark, including its execution guidelines. While targeted at location dependent applications, it contains more general queries as well; as such, it can be viewed as the first benchmark aimed at the general mobile computing environment.

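    To make the notion of a location dependent query concrete, a toy sketch (the site table and coordinates are invented for illustration): the same query text returns different answers depending on the location from which it is issued.

      from math import hypot

      # Hypothetical site table: name -> (x, y) position.
      SITES = {"A": (0.0, 0.0), "B": (5.0, 5.0), "C": (9.0, 1.0)}

      def nearest_site(issue_location):
          """'Which site is nearest?' evaluated at the issue location."""
          x, y = issue_location
          return min(SITES, key=lambda s: hypot(SITES[s][0] - x,
                                                SITES[s][1] - y))

      print(nearest_site((1.0, 1.0)))   # A
      print(nearest_site((8.0, 2.0)))   # C: same query, different location
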
  • Distributing CORBA views from an OODBMS

    Page(s): 116 - 127

    The need to distribute objects on the Internet and to offer views over databases has found a solution with the advent of CORBA. Most database management systems now offer CORBA interfaces, which are generally simple mappings of the database schema to the CORBA world. This approach does not address all the problems of database interoperation because (i) such a view is static; (ii) its semantics is completely bound to the semantics of the schema and cannot be remodeled; (iii) only one view per database can be offered; and (iv) access may be limited to reading, with no mechanism for writing to the database through the view. To solve these problems, we have designed a language, the Interface Mapping Definition Language (IMDL), and a set of tools grouped in the Interface Mapping Service (IMS). IMDL is used to define CORBA views over an OODBMS, while IMS generates an IDL construct and a full CORBA implementation from an IMDL construct and a database schema.

  • Clustering spatial data in the presence of obstacles: a density-based approach

    Page(s): 214 - 223

    Clustering spatial data is a well-known problem that has been extensively studied. Grouping similar data in large 2-dimensional spaces to find hidden patterns or meaningful sub-groups has many applications, such as satellite imagery, geographic information systems, medical image analysis, marketing and computer vision. Although many methods have been proposed in the literature, very few have considered physical obstacles, which can significantly affect the effectiveness of the clustering. Taking these constraints into account during clustering is costly, and how the constraints are modeled is paramount for good performance. In this paper, we investigate the problem of clustering in the presence of constraints such as physical obstacles and introduce a new approach that models these constraints using polygons. We also propose a strategy to prune the search space and reduce the number of polygons to test during clustering. We devise a density-based clustering algorithm, DBCluC, which takes advantage of our constraint modeling to efficiently cluster data objects while considering all physical constraints. The algorithm can detect clusters of arbitrary shape and is insensitive to noise, to the input order and to the complexity of the constraints. Its average running complexity is O(N log N), where N is the number of data points.

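    A hedged sketch of the core geometric ingredient, not the paper's DBCluC implementation: when obstacles are modeled as polygons, a point q belongs to p's eps-neighborhood only if it is close enough and the segment p-q crosses no obstacle edge.

      def ccw(a, b, c):
          """Twice the signed area of triangle abc."""
          return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])

      def segments_cross(p1, p2, p3, p4):
          """Proper-crossing test for segments p1p2 and p3p4."""
          d1, d2 = ccw(p3, p4, p1), ccw(p3, p4, p2)
          d3, d4 = ccw(p1, p2, p3), ccw(p1, p2, p4)
          return ((d1 > 0) != (d2 > 0)) and ((d3 > 0) != (d4 > 0))

      def visible(p, q, obstacle_edges):
          return not any(segments_cross(p, q, a, b) for a, b in obstacle_edges)

      def eps_neighborhood(p, points, eps, obstacle_edges):
          """Obstacle-aware neighborhood used by density-based expansion."""
          return [q for q in points
                  if q != p
                  and (p[0]-q[0])**2 + (p[1]-q[1])**2 <= eps*eps
                  and visible(p, q, obstacle_edges)]

      wall = [((2.0, -1.0), (2.0, 1.0))]             # one obstacle edge
      pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
      print(eps_neighborhood((0.0, 0.0), pts, 4.0, wall))  # [(1.0, 0.0)]
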
  • Graph partition based multi-way spatial joins

    Page(s): 23 - 32

    We investigate the problem of efficiently computing a multi-way spatial join without spatial indexes. We propose a novel and effective filtering algorithm based on a two-phase partitioning technique. To avoid missing hits, an inherent difficulty in multi-way spatial joins, we first partition the join graph into sub-graphs whenever necessary. In the second phase, we partition the spatial data sets, and the sub-joins are executed simultaneously in each partition to minimise I/O costs. Finally, a multi-way relational join merges the sub-join results. Our experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.

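    A toy building block, not the paper's partitioning algorithm: within each spatial partition, the candidate pairs of a pairwise sub-join can be found by rectangle (MBR) intersection tests, and the multi-way result is then assembled from such pairwise results by a relational join.

      def mbr_overlap(a, b):
          """a, b: rectangles as (x1, y1, x2, y2) with x1 <= x2, y1 <= y2."""
          ax1, ay1, ax2, ay2 = a
          bx1, by1, bx2, by2 = b
          return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2

      def pairwise_join(r, s):
          return [(i, j) for i, a in enumerate(r) for j, b in enumerate(s)
                  if mbr_overlap(a, b)]

      r = [(0, 0, 2, 2), (5, 5, 6, 6)]
      s = [(1, 1, 3, 3)]
      print(pairwise_join(r, s))   # [(0, 0)]
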
  • DWS-AQA: a cost effective approach for very large data warehouses

    Page(s): 233 - 242

    Data warehousing applications typically involve massive amounts of data that push database management technology to the limit. A scalable architecture is crucial, not only to handle very large amounts of data but also to assure interactive response times to users. Large data warehouses require a very expensive setup, typically based on high-end servers or high-performance clusters. In this paper we propose and evaluate a simple but very effective method to implement a data warehouse using the computers and workstations typically available in large organizations. The proposed approach is called data warehouse striping with approximate query answering (DWS-AQA). The goal is to use the processing and disk capacity normally available in large workstation networks to implement a data warehouse at greatly reduced infrastructure cost. Because the data warehouse shares computers that are also used for other purposes, most of the time only a fraction of the computers will be able to execute their partial queries in time. However, as we show in the paper, the approximate answers estimated from partial results have a very small error in most plausible scenarios. Moreover, as the data warehouse facts are partitioned in a strictly uniform way, it is possible to calculate tight confidence intervals for the approximate answers, providing the user with a measure of the accuracy of the query results. A set of experiments on the TPC-H benchmark database shows the accuracy of DWS-AQA across a large number of scenarios.

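    A hedged sketch of the estimation idea, not the paper's system: with facts striped uniformly over n_stripes nodes, a SUM can be estimated from the k stripes that answered in time by scaling their mean, and the spread across stripes yields a confidence interval (a textbook estimator; the paper's exact formulas may differ).

      from math import sqrt

      def approximate_sum(answers, n_stripes, z=1.96):
          """answers: partial SUMs from the k responding stripes."""
          k = len(answers)
          mean = sum(answers) / k
          estimate = n_stripes * mean
          var = sum((a - mean) ** 2 for a in answers) / (k - 1) if k > 1 else 0.0
          margin = z * n_stripes * sqrt(var / k)
          return estimate, (estimate - margin, estimate + margin)

      est, (low, high) = approximate_sum([102.0, 98.0, 101.0], n_stripes=10)
      print(f"SUM ~ {est:.1f}, 95% CI [{low:.1f}, {high:.1f}]")
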
  • XGL: a graphical query language for XML

    Page(s): 86 - 95

    In this paper we present a graphical query language for XML. The language, based on a simple form of graph grammars, permits us to extract data and reorganize information in a new structure. As with most of the current query languages for XML, queries consist of two parts: one extracting a sub-graph and one constructing the output graph. The semantics of queries is given in terms of graph grammars. The use of graph grammars makes it possible to define, in a simple way, the structural properties of both the subgraph that has to be extracted and the graph that has to be constructed. By means of examples, we show the effectiveness and simplicity of our approach.

  • Fast filter-and-refine algorithms for subsequence selection

    Page(s): 243 - 254

    Large sequence databases, such as protein, DNA and gene sequences in biology, are becoming increasingly common. An important operation on a sequence database is approximate subsequence matching, where all subsequences that are within some distance from a given query string are retrieved. This paper proposes a filter-and-refine algorithm that enables efficient approximate subsequence matching in large DNA sequence databases. It employs a bitmap indexing structure to condense and encode each data sequence into a shorter index sequence. During query processing, the bitmap index is used to filter out most of the irrelevant subsequences, and false positives are removed in the final refinement step. Analytical and experimental studies show that the proposed strategy is capable of reducing response time substantially while incurring only a small space overhead.

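    A hedged illustration of the filter-and-refine pattern, not the paper's bitmap index: a cheap per-window signature discards most candidates, and only survivors pay for an exact edit-distance computation (length-w windows only, for brevity).

      def bitmap(s):
          """4-bit signature: which DNA letters occur in s."""
          return sum(1 << "ACGT".index(c) for c in set(s))

      def edit_distance(a, b):
          prev = list(range(len(b) + 1))
          for i, ca in enumerate(a, 1):
              cur = [i]
              for j, cb in enumerate(b, 1):
                  cur.append(min(prev[j] + 1, cur[j-1] + 1,
                                 prev[j-1] + (ca != cb)))
              prev = cur
          return prev[-1]

      def search(text, query, max_dist):
          qb, w = bitmap(query), len(query)
          hits = []
          for i in range(len(text) - w + 1):
              window = text[i:i+w]
              # Filter: each query letter absent from the window costs at
              # least one edit, so this count lower-bounds the distance.
              if bin(qb & ~bitmap(window)).count("1") <= max_dist:
                  if edit_distance(window, query) <= max_dist:  # refine
                      hits.append(i)
          return hits

      print(search("ACGTACGGA", "CGTA", max_dist=1))   # [1, 5]
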
  • Nearest neighbor and reverse nearest neighbor queries for moving objects

    Page(s): 44 - 53

    With the proliferation of wireless communications and rapid advances in technologies for tracking the positions of continuously moving objects, algorithms for efficiently answering queries about large numbers of moving objects are increasingly needed. One such query is the reverse nearest neighbor (RNN) query, which returns the objects that have the query object as their closest object. While algorithms have been proposed for computing RNN queries over non-moving objects, there have been no proposals for answering RNN queries over continuously moving objects. Another such query is the nearest neighbor (NN) query, which has been studied extensively and in many contexts, but, like the RNN query, has not been explored for moving query and data points. This paper proposes an algorithm for answering RNN queries for continuously moving points in the plane. As part of the solution, and as a separate contribution, an algorithm for answering NN queries for continuously moving points is also proposed. The results of performance experiments are reported.

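    A brute-force baseline, not the paper's indexed algorithm, to pin down the semantics: with linear motion pos0 + t*vel, the RNN of q at instant t is the set of objects whose nearest neighbor at that instant is q, and the answer changes as the points move.

      def pos(obj, t):
          (x, y), (vx, vy) = obj
          return (x + vx * t, y + vy * t)

      def dist2(a, b):
          return (a[0]-b[0])**2 + (a[1]-b[1])**2

      def rnn(q, objects, t):
          """Objects whose nearest neighbor at time t is q (O(n^2) scan)."""
          qp, result = pos(q, t), []
          for o in objects:
              if o is q:
                  continue
              op = pos(o, t)
              others = [pos(p, t) for p in objects if p is not o and p is not q]
              if all(dist2(op, qp) <= dist2(op, pp) for pp in others):
                  result.append(o)
          return result

      objs = [((0.0, 0.0), (1.0, 0.0)),    # the query point, moving right
              ((4.0, 0.0), (0.0, 0.0)),    # static
              ((10.0, 0.0), (0.0, 0.0))]   # static
      print(len(rnn(objs[0], objs, 0.0)))  # 1
      print(len(rnn(objs[0], objs, 8.0)))  # 2: the answer changed over time
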
  • An index structure for improving closest pairs and related join queries in spatial databases

    Page(s): 140 - 149

    Spatial databases have grown in importance in various fields, and with them come various types of queries that need to be answered effectively. While queries involving a single data set have been studied extensively, join queries on multi-dimensional data, such as the k-closest pairs and nearest neighbor joins, have only recently received attention. In this paper we propose a new index structure, the b-Rdnn tree, to solve different join queries. The structure is similar to the Rdnn-tree for reverse nearest neighbor queries. Based on this new index structure, we give algorithms for various join queries in spatial databases. It is especially effective for k-closest pair queries, where earlier algorithms using the R*-tree can be very inefficient in many real-life circumstances. To this end, we present experimental results on k-closest pair queries supporting the fact that our index structure is a better alternative.

  • A multi-agent model for handling e-commerce activities

    Page(s): 202 - 211

    In this paper we propose a multi-agent model for handling e-commerce activities. In our model, an agent is present at each e-commerce site, managing the information stored therein; in addition, an agent is associated with each customer, handling his or her profile. The proposed model is based on a particular conceptual model, called the B-SDR network, capable of representing and handling both the information stored in e-commerce sites and customer profiles. The capabilities of the B-SDR network model are exploited to let customer and site agents cooperate so as to help a customer identify, whenever he or she accesses an e-commerce site, the products and services offered by that site that best match his or her interests.

  • On implicate discovery and query optimization

    Page(s): 2 - 11

    Boolean expression simplification is a well-known problem in the history of computer science. The problem of determining the prime implicates of an arbitrary Boolean expression has been studied mostly in the contexts of hardware design and automated reasoning. While many of the same principles can be applied to the simplification of search conditions in ANSI SQL queries, the richness of the language and SQL's three-valued logic present a number of challenges. We propose a modified version of a matrix-based normalization algorithm suitable for normalizing SQL search conditions in constrained-memory environments. In particular, we describe a set of tradeoffs that enable our algorithm to discover a useful set of implicates without requiring a complete conversion of the input condition to normal form, preventing a combinatorial explosion in the number of terms.

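    One small ingredient of such simplification, sketched under two-valued semantics for brevity (SQL's three-valued logic, handled in the paper, needs more care): in a disjunction of conjunctive terms, any term strictly subsumed by a smaller term is redundant (absorption), e.g. (A AND B) OR A simplifies to A.

      def absorb(terms):
          """terms: DNF as a list of frozensets of literals.
          Drops every term strictly subsumed by another term."""
          return [t for t in terms if not any(o < t for o in terms)]

      dnf = [frozenset({"A", "B"}), frozenset({"A"})]
      print(absorb(dnf))   # [frozenset({'A'})]
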
  • Interval processing with the UB-Tree

    Page(s): 12 - 22

    Advanced data warehouses and Web databases have created demand for processing large sets of time ranges, quality classes, fuzzy data, personalized data and extended objects. Since all of these data types can be mapped to intervals, interval indexing can dramatically speed up, or even be an enabling technology for, these new applications. We introduce a method for managing intervals by indexing the dual space with the UB-Tree. We show that our method is an effective and efficient solution, benefiting from all the good characteristics of the UB-Tree: concurrency control, worst-case guarantees for insertion, deletion and update, and efficient query processing. Our technique can easily be integrated into any RDBMS engine that provides the UB-Tree as an access method. We also show that our technique is superior and more flexible than previously suggested techniques.

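    The dual-space mapping itself fits in a few lines (the UB-Tree's contribution, not shown here, is answering the resulting region query efficiently): an interval [s, e] becomes the point (s, e), and "all intervals intersecting [qs, qe]" becomes the rectangle query s <= qe and e >= qs.

      intervals = [(1, 3), (2, 8), (6, 9), (10, 12)]

      def intersecting(qs, qe, dual_points):
          # A linear scan stands in for the UB-Tree range query here.
          return [(s, e) for (s, e) in dual_points if s <= qe and e >= qs]

      print(intersecting(4, 7, intervals))   # [(2, 8), (6, 9)]
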
  • View merging in the context of view selection

    Page(s): 33 - 42

    Materialized views can provide massive improvements in query processing time, especially for aggregation queries over large tables. To realize the potential of materialized views, we must determine which views to materialize. An important issue in view selection is view merging: taking a set of candidate views generated by analyzing the queries in a workload and producing a set of merged views by exploiting commonality among those queries. View merging can efficiently reduce the number of candidate views for view selection. We present a merging tree, as well as a fast and scalable algorithm for view merging based on such a tree. The merging tree can significantly reduce the search space of potential views to be merged, and our approach is more scalable than the alternative of repeatedly merging all pairs of views.

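    A toy illustration of why merging aggregation views is possible at all (the paper's merging tree and algorithm are not shown): two candidate views over the same fact table can be replaced by one view grouping on the union of their grouping columns, from which each original view is derivable by further aggregation.

      def merge_views(groupby_a, groupby_b):
          """Merged view's GROUP BY is the union of the candidates'."""
          return sorted(set(groupby_a) | set(groupby_b))

      print(merge_views(["region"], ["region", "month"]))
      # ['month', 'region']: SUMs per (region, month) roll up to per region.
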
  • Integrating HTML tables using semantic hierarchies and meta-data sets

    Page(s): 160 - 169

    As the Internet is a global network, there is a demand for accessing closely related data without browsing through different Web documents. A significant amount of such data is presented in HTML documents. Since the data content of HTML documents is interleaved with markup, it is not trivial to integrate and provide a unified view of closely related data in different HTML documents. In this paper we present an approach for integrating semantically related data in any HTML tables that belong to a particular domain of interest (ID), such as house/apartment rental, by using semantic hierarchies generated from the tables and predefined meta-data sets that indicate related column names in the ID. In our approach, we capture each data source as semi-structured data, called a semantic hierarchy, and the end result of integrating different HTML tables of an ID is a unified view of the data in the tables, presented as an XML document. Beyond HTML tables, our approach can be adopted by any system that integrates semi-structured data across different platforms.

  • Energy-efficient data broadcasting in mobile ad-hoc networks

    Page(s): 64 - 73

    Energy saving is the most important issue in wireless mobile computing due to power constraints on mobile units. Data broadcasting is the main method of information dissemination in wireless networks, as its cost is independent of the number of mobile hosts receiving the information. A number of data broadcasting techniques have been proposed for mobile wireless networks where servers have no energy restrictions, but little research has addressed data broadcasting in mobile ad-hoc networks, where both servers and clients are nomadic. In this paper, we propose two groups of broadcast scheduling algorithms, called adaptive broadcasting and popularity based adaptive broadcasting, that consider time constraints on requests as well as energy limitations on both servers and clients. We also present simulation experiments comparing the algorithms.

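    A toy flavor of popularity based scheduling, not the paper's algorithms: among pending requests, broadcast next the item with the most outstanding requests, breaking ties in favor of the earliest deadline.

      def next_item(pending):
          """pending: dict item -> list of request deadlines."""
          return max(pending, key=lambda i: (len(pending[i]), -min(pending[i])))

      print(next_item({"a": [5, 9], "b": [3], "c": [7, 8]}))   # a
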
  • Using the F2 OODBMS to support incremental knowledge acquisition

    Page(s): 266 - 275

    Ripple down rules (RDR) is an incremental knowledge acquisition (KA) methodology in which a knowledge base (KB) is constructed as a collection of rules with exceptions. Nested ripple down rules (NRDR) is an extension of this methodology that allows the expert to enter his or her own domain concepts and later refine these concepts hierarchically. In this paper we show similarities between incremental knowledge acquisition and database schema evolution, and propose using the F2 object-oriented database management system (OODBMS) to implement an NRDR knowledge based system. We use the existing non-standard features of F2 and show how multiple instantiation and object migration (known as the multiobjects feature in F2), together with F2's schema evolution capabilities, easily accommodate all the update mechanisms required to incrementally build an NRDR KB. We illustrate our approach with a KA session.

  • Implementing federated database systems by compiling SchemaSQL

    Page(s): 192 - 201

    Federated systems integrating data from multiple sources must cope with semantic heterogeneity by reasoning over both the data and the meta-data of their sources. SchemaSQL is one of a number of related higher-order languages that have been proposed for succinctly expressing integrated views over heterogeneous sources. We define a method for compiling SchemaSQL into standard SQL and show that the output of the compilation algorithm is of size O(m+p), where m is the size of the catalogs and p the size of the input queries. The resulting code may be executed by existing conventional SQL query engines without modification. We extend our basic compilation method with type-driven optimizations which, empirical evaluation shows, yield effective execution by native query engines. Prior efforts either do not provide feasible guarantees on the size of the compiled programs or require the development of new query engines encompassing higher-order query operators.

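    A hedged sketch of the flavor of such a compilation, not the paper's algorithm: a higher-order query that ranges over table names can be expanded, using the catalog, into plain SQL whose size grows with the catalog (the O(m) component of the O(m+p) bound).

      def compile_over_tables(catalog_tables, column, predicate):
          """Expand 'for each table T in the catalog, select ...' into SQL."""
          branches = [
              f"SELECT '{t}' AS tab, {column} FROM {t} WHERE {predicate}"
              for t in catalog_tables
          ]
          return "\nUNION ALL\n".join(branches)

      print(compile_over_tables(["sales_ny", "sales_la"],
                                "amount", "amount > 100"))
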
  • Consistency in data warehouse dimensions

    Page(s): 224 - 232

    Data warehouses present a powerful framework for storing and analyzing huge amounts of data. In this context, analyses focus on data gathered over long periods of time, often between one and five years. A data warehouse can therefore be regarded as a specialized historical database. However, not only must the data kept in a data warehouse be seen in a temporal context; the fact that dimension data may change over such a period must also be taken into consideration. In this paper we focus on update operations on dimensions and establish a notion of consistency for guiding such operations. We devise algorithms for executing update operations that can be shown to preserve consistency, and we study their time complexity.

  • Parallel processing XML documents

    Page(s): 96 - 105

    As Web applications are time-sensitive, the increasing size of XML documents and the complexity of evaluating XML queries pose new performance challenges to existing information retrieval technologies. This paper introduces a new approach to developing a purpose-built XML data management system that improves the performance of a Web site with XML support by using parallel data processing techniques. To this end, we propose a parallelisation model for XML data processing in which data storage strategies, data placement methods and query evaluation techniques are studied. Related issues are also discussed.

  • Management of multiply represented geographic entities

    Page(s): 150 - 159

    Multiple representation of geographic information occurs when a real-world entity is represented more than once in the same or different databases. In this paper we propose a new approach to the modeling of multiply represented entities and the relationships among the entities and their representations. A multiple representation management system is outlined that can manage multiple representations consistently over a number of autonomous databases. Central to our approach is the multiple representation schema language that is used to configure the system. It provides an intuitive and declarative means of modeling multiple representations and specifying rules that are used to maintain consistency, match objects representing the same entity, and restore consistency if necessary.

  • YAM2 (yet another multidimensional model): an extension of UML

    Page(s): 172 - 181

    This paper presents a multidimensional conceptual object-oriented model, its structures, integrity constraints and query operations. It has been developed as an extension of UML core metaclasses to facilitate its usage, as well as to avoid the introduction of completely new concepts. YAM2 allows the representation of several semantically related star schemas, as well as summarizability and identification constraints.

  • Methodology for creating a sample subset of dynamic taxonomy to use in navigating medical text databases

    Page(s): 276 - 284

    The amount of text available in electronic form is increasing, especially since the rise of the Web. So too are the potential interconnections between concepts, given the advent of ontologies and other relationship-based data sources. Text could be navigated using the structure of these ontologies, specifically by using dynamic taxonomies to navigate the is-a relationships. Dynamic taxonomies are rooted index structures that dynamically prune themselves in response to zoom requests. The use of dynamic taxonomies with existing ontologies, and in the medical field, is unexplored. This paper details the process of connecting index terms from a medical text database to a taxonomy extracted from an existing medical ontology.

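    A minimal sketch of the zoom operation central to dynamic taxonomies (the taxonomy and document sets below are invented for illustration): after the user narrows the result set, keep only the taxonomy nodes whose subtree still classifies at least one remaining document.

      def prune(node, children, docs_at, live_docs):
          """Return the pruned subtree rooted at node, or None if empty."""
          kept = []
          for child in children.get(node, ()):
              sub = prune(child, children, docs_at, live_docs)
              if sub is not None:
                  kept.append(sub)
          if kept or (docs_at.get(node, set()) & live_docs):
              return (node, kept)
          return None

      children = {"root": ["disease", "drug"],
                  "disease": ["flu"], "drug": ["aspirin"]}
      docs_at = {"flu": {1, 2}, "aspirin": {3}}
      print(prune("root", children, docs_at, {1}))
      # ('root', [('disease', [('flu', [])])]): the drug branch vanished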