Knowledge and Data Engineering, IEEE Transactions on

Issue 3 • Date May/Jun 1997

  • Optimal design of multiple hash tables for concurrency control

    Page(s): 384 - 390

    In this paper, we propose the approach of using multiple hash tables for lock requests with different data access patterns to minimize the number of false contentions in a data sharing environment. We first derive some theoretical results on using multiple hash tables. Then, in light of these derivations, a two-step procedure to design multiple hash tables is developed. In the first step, data items are partitioned into a given number of groups. Each group of data items is associated with the use of a hash table in such a way that lock requests to data items in the same group will be hashed into the same hash table. In the second step, given an aggregate hash table size, the hash table size for each individual data group is optimally determined so as to minimize the number of false contentions. Some design examples and remarks on the proposed method are given. It is observed from real database systems that different data sets usually have their distinct data access patterns, thus resulting in an environment where this approach can offer significant performance improvement.
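The two-step idea in this abstract (group the data items, then give each group its own hash table) can be illustrated with a minimal sketch. It assumes integer item identifiers, a plain modulo hash, and a working definition of a false contention as two distinct data items landing in the same slot of the same table; the function name and the example sizes are hypothetical and are not taken from the paper.

```python
def count_false_contentions(items, group_of, table_sizes):
    """Count pairs of distinct items that share a slot in the same hash table.

    items       -- iterable of integer data item ids (assumed)
    group_of    -- dict mapping item id -> group name
    table_sizes -- dict mapping group name -> number of slots in that group's table
    """
    slots = {}
    for item in set(items):
        group = group_of[item]
        slot = (group, item % table_sizes[group])  # assumed modulo hash
        slots.setdefault(slot, set()).add(item)
    # Every unordered pair of distinct items in one slot is one potential false contention.
    return sum(len(s) * (len(s) - 1) // 2 for s in slots.values())


items = [0, 4, 8, 12]

# One shared table with 4 slots: every item hashes to slot 0.
single = count_false_contentions(items, {i: "all" for i in items}, {"all": 4})

# Same aggregate size (3 + 1 slots), but split across two groups.
split = count_false_contentions(
    items, {0: "a", 4: "a", 8: "a", 12: "b"}, {"a": 3, "b": 1}
)
print(single, split)
```

The example is deliberately contrived; it only shows that, for a fixed aggregate size, how slots are divided among groups changes the collision count, which is the quantity the paper's second step optimizes.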

  • Default logic as a query language

    Page(s): 448 - 463

    Research in nonmonotonic reasoning has focused largely on the idea of representing knowledge about the world via rules that are generally true but can be defeated. Even if relational databases are nowadays the main tool for storing very large sets of data, the approach of using nonmonotonic AI formalisms as relational database query languages has been investigated to a much smaller extent. In this work, we propose a novel application of Reiter's default logic by introducing a default query language (DQL) for finite relational databases, which is based on default rules. The main result of this paper is that DQL is as expressive as SO∃∀, the existential-universal fragment of second-order logic. This result is not only of theoretical importance: we exhibit queries that are useful in practice and that can be expressed with DQL but not with other query languages based on nonmonotonic logics, such as DATALOG with negation under the stable model semantics. In particular, we show that DQL is well-suited for diagnostic reasoning.

  • An efficient multiversion access structure

    Page(s): 391 - 409

    An efficient multiversion access structure for a transaction-time database is presented. Our method requires optimal storage and query times for several important queries and logarithmic update times. Three version operations (inserts, updates, and deletes) are allowed on the current database, while queries are allowed on any version, present or past. The following query operations are performed in optimal query time: key range search, key history search, and time range view. The key range query retrieves all records having keys in a specified key range at a specified time; the key history query retrieves all records with a given key in a specified time range; and the time range view query retrieves all records that were current during a specified time interval. Special cases of these queries include the key search query, which retrieves a particular version of a record, and the snapshot query, which reconstructs the database at some past time. To the best of our knowledge, no previous multiversion access structure simultaneously supports all these query and version operations within these time and space bounds. The bounds on query operations are worst case per operation, while those for storage space and version operations are (worst-case) amortized over a sequence of version operations. Simulation results show that good storage utilization and query performance are obtained.
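To make the query definitions in this abstract concrete, here is a minimal sketch of their semantics over a flat list of record versions. It is a linear scan, not the access structure the paper presents, and the `Version` class, the half-open `[start, end)` lifetime convention, and the function names are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    key: int
    value: str
    start: int               # transaction time at which this version became current
    end: float = math.inf    # time at which it was superseded or deleted (exclusive)


def alive_at(v, t):
    return v.start <= t < v.end


def overlaps(v, t1, t2):
    # True if the version was current at some time in the closed interval [t1, t2].
    return v.start <= t2 and v.end > t1


def key_range_search(versions, lo, hi, t):
    """All records with keys in [lo, hi] as of time t."""
    return [v for v in versions if lo <= v.key <= hi and alive_at(v, t)]


def key_history_search(versions, key, t1, t2):
    """All versions of one key that were current during [t1, t2]."""
    return [v for v in versions if v.key == key and overlaps(v, t1, t2)]


def time_range_view(versions, t1, t2):
    """All records that were current at some point during [t1, t2]."""
    return [v for v in versions if overlaps(v, t1, t2)]


def snapshot(versions, t):
    """The whole database as of time t (key range search over all keys)."""
    return [v for v in versions if alive_at(v, t)]


history = [Version(1, "a", 0, 5), Version(1, "b", 5), Version(2, "c", 3, 8)]
print([v.value for v in key_range_search(history, 1, 2, 4)])
```

Each function here costs time linear in the number of stored versions; the point of the paper is a structure that answers the same queries within optimal bounds.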

  • SQL extension for interval data

    Page(s): 480 - 499

    IXSQL, an extension to SQL, is proposed for the management of interval data. IXSQL is syntactically and semantically upwards consistent with SQL2. Its specification has been based both on theoretical results and actual user requirements for the management of temporal data, a special case of interval data. Design decisions and implementation issues are also discussed.

  • On modeling cost functions for object-oriented databases

    Page(s): 500 - 508

    In this paper, we present a set of parameters able to exactly model topologies of object references in object-oriented databases. These parameters are important since they are used to model query execution strategy costs for optimization. The model we present also considers the cases of multivalued attributes and null references. Moreover, a set of derived parameters is introduced and their mathematical derivations are shown. These derived parameters are important, since they allow selectivity of nested predicates to be estimated. They are also used in estimating storage, access, and update costs for a number of access structures specifically tailored to efficiently support object-oriented queries.

  • Efficient bulk-loading of gridfiles

    Page(s): 410 - 420

    This paper considers the problem of bulk-loading large data sets for the gridfile multiattribute indexing technique. We propose a rectilinear partitioning algorithm that heuristically seeks to minimize the size of the gridfile needed to ensure no bucket overflows. Empirical studies on both synthetic data sets and data sets drawn from computational fluid dynamics applications demonstrate that our algorithm is very efficient and is able to handle large data sets. In addition, we present an algorithm for bulk-loading data sets too large to fit in main memory. Utilizing a sort of the entire data set, it creates a gridfile without incurring any overflows.

  • A performance comparison of locking methods with limited wait depth

    Page(s): 421 - 434

    A number of recent studies have proposed lock conflict resolution methods to improve the performance of standard locking, i.e., strict two-phase locking with the general waiting method. This paper is primarily concerned with the performance of wait depth limited methods with respect to each other and some other methods. The methods considered include the general waiting, wound-wait, and no-waiting methods, symmetric and asymmetric versions of cautious waiting and running priority methods, the wait depth limited (WDL) method, and a modified version of it. In spite of the availability of analytic solutions for most wait depth limited methods, for reasons given in the paper, the performance comparison is based on simulation results. The contributions of this study are as follows: 1) modeling assumptions, i.e., a careful definition of transaction restart options; 2) new results concerning the relative performance of wait depth limited methods, which show that a) the running priority method outperforms cautious waiting and may even outperform the WDL method in a system with limited hardware resources, b) WDL outperforms other methods in high lock contention, high capacity systems, and c) modified WDL has a performance comparable to WDL, but incurs less overhead in selecting the abort victim; and 3) contrary to common belief, Tay's Effective Database Size Paradigm for dealing with shared and exclusive locks and/or skewed database accesses in standard locking is applicable to some wait depth limited methods and provides acceptably accurate approximations in others, as long as locking modes for restarted transactions are not resampled.

  • Analysis of the n-dimensional quadtree decomposition for arbitrary hyperrectangles

    Page(s): 373 - 383

    We give a closed-form expression for the average number of n-dimensional quadtree nodes (“pieces” or “blocks”) required by an n-dimensional hyperrectangle aligned with the axes. Our formula includes as special cases the formulae of previous efforts for two-dimensional spaces. It also agrees with theoretical and empirical results that the number of blocks depends on the hypersurface of the hyperrectangle and not on its hypervolume. The practical use of the derived formula is that it allows the estimation of the space requirements of the n-dimensional quadtree decomposition. Quadtrees are used extensively in two-dimensional spaces (geographic information systems and spatial databases in general), as well as in higher-dimensional spaces (as oct-trees for three-dimensional spaces, e.g., in graphics, robotics, and three-dimensional medical images). Our formula permits the estimation of the space requirements for data hyperrectangles when stored in an index structure like an (n-dimensional) quadtree, as well as the estimation of the search time for query hyperrectangles, for the so-called linear quadtrees. A theoretical contribution of the paper is the observation that the number of blocks is a piecewise linear function of the sides of the hyperrectangle.
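The quantity this abstract analyzes, the number of quadtree blocks needed to cover an axis-aligned hyperrectangle, can be computed directly for one placement by brute-force recursion. The sketch below assumes integer coordinates, half-open boxes `[lo, hi)`, and a space whose side is a power of two; `count_blocks` is a hypothetical helper, and the paper's closed-form average over placements is not reproduced here.

```python
from itertools import product


def count_blocks(lo, hi, origin, size):
    """Count quadtree blocks covering the box [lo, hi) inside one cell.

    lo, hi -- integer corner coordinates of the box, one entry per dimension
    origin -- lower corner of the current quadtree cell
    size   -- side length of the current cell (a power of two)
    """
    clipped_lo = [max(l, o) for l, o in zip(lo, origin)]
    clipped_hi = [min(h, o + size) for h, o in zip(hi, origin)]
    if any(l >= h for l, h in zip(clipped_lo, clipped_hi)):
        return 0  # the box does not touch this cell
    if all(l == o and h == o + size
           for l, h, o in zip(clipped_lo, clipped_hi, origin)):
        return 1  # the cell is fully covered, so it is a single block
    # Partially covered: split into 2^n children and recurse.
    half = size // 2
    return sum(
        count_blocks(lo, hi, [o + d * half for o, d in zip(origin, offsets)], half)
        for offsets in product((0, 1), repeat=len(origin))
    )


# A 2x2 square centered in a 4x4 space straddles all four quadrants.
print(count_blocks([1, 1], [3, 3], [0, 0], 4))
# The same square aligned with a quadrant is a single block.
print(count_blocks([0, 0], [2, 2], [0, 0], 4))
```

The two calls show why an average over placements is the natural quantity to study: the same box needs a different number of blocks depending on where it sits relative to the cell boundaries.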

  • Similarity searching in medical image databases

    Page(s): 435 - 447

    We propose a method to handle approximate searching by image content in medical image databases. Image content is represented by attributed relational graphs holding features of objects and relationships between objects. The method relies on the assumption that a fixed number of “labeled” or “expected” objects (e.g., “heart”, “lungs”, etc.) are common in all images of a given application domain in addition to a variable number of “unexpected” or “unlabeled” objects (e.g., “tumor”, “hematoma”, etc.). The method can answer queries by example, such as “find all X-rays that are similar to Smith's X-ray”. The stored images are mapped to points in a multidimensional space and are indexed using state-of-the-art database methods (R-trees). The proposed method has several desirable properties: (a) Database search is approximate, so that all images up to a prespecified degree of similarity (tolerance) are retrieved. (b) It has no “false dismissals” (i.e., all images qualifying query selection criteria are retrieved). (c) It is much faster than sequential scanning for searching in the main memory and on the disk (i.e., by up to an order of magnitude), thus scaling up well for large databases.

  • Temporal relational data model

    Page(s): 464 - 479

    This paper incorporates a temporal dimension to nested relations. It combines research in temporal databases and nested relations for managing the temporal data in nontraditional database applications. A temporal data value is represented as a temporal atom; a temporal atom consists of two parts: a temporal set and a value. The temporal atom asserts that the value is valid over the time duration represented by its temporal set. The data model allows relations with arbitrary levels of nesting and can represent the histories of objects and their relationships. Temporal relational algebra and calculus languages are formulated and their equivalence is proved. Temporal relational algebra includes operations to manipulate temporal data and to restructure nested temporal relations. Additionally, we define operations to generate a power set of a relation, a set membership test, and a set inclusion test, which are all derived from the other operations of temporal relational algebra. To obtain a concise representation of temporal data (temporal reduction), collapsed versions of the set-theoretic operations are defined. Procedures to express collapsed operations by the regular operations of temporal relational algebra are included. The paper also develops procedures to completely flatten a nested temporal relation into an equivalent 1NF relation and back to its original form, thus providing a basis for the semantics of the collapsed operations by the traditional operations on 1NF relations.
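The temporal atom described in this abstract (a temporal set paired with a value) can be sketched as a small data type. The sketch assumes discrete time points stored in a `frozenset`; the `collapse` helper is only one plausible reading of "temporal reduction" (merging atoms that carry the same value), not the paper's definition, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalAtom:
    times: frozenset   # temporal set, modeled here as discrete time points (assumed)
    value: object      # the value asserted to hold over those time points


def values_at(atoms, t):
    """Values that an attribute's history asserts to be valid at time t."""
    return [a.value for a in atoms if t in a.times]


def collapse(atoms):
    """Merge atoms with equal values by taking the union of their temporal sets.

    Illustrative guess at a concise representation; not taken from the paper.
    """
    merged = {}
    for a in atoms:
        merged[a.value] = merged.get(a.value, frozenset()) | a.times
    return [TemporalAtom(times, value) for value, times in merged.items()]


# A hypothetical salary history with a gap at times 10 and 11.
salary = [
    TemporalAtom(frozenset(range(0, 5)), 30),
    TemporalAtom(frozenset(range(5, 10)), 40),
    TemporalAtom(frozenset(range(12, 15)), 30),
]
print(values_at(salary, 3), values_at(salary, 10))
print(len(collapse(salary)))
```

Storing the time points explicitly keeps the sketch short; a practical representation would more likely use sets of intervals.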

  • Data on air: organization and access

    Page(s): 353 - 372

    Organizing massive amounts of data on wireless communication networks in order to provide fast and low-power access to users equipped with palmtops is a new challenge to the data management and telecommunication communities. Solutions must take into consideration the physical restrictions of low network bandwidth and limited battery life of palmtops. This paper proposes algorithms for multiplexing clustering and nonclustering indexes along with data on wireless networks. The power consumption and the latency for obtaining the required data are considered as the two basic performance criteria for all algorithms. First, this paper describes two algorithms, namely (1, m) indexing and Distributed Indexing, for multiplexing data and its clustering index. Second, an algorithm called Nonclustered Indexing is described for allocating static data and its corresponding nonclustered index. Then, the Nonclustered Indexing algorithm is generalized to the case of multiple indexes. Finally, the proposed algorithms are analytically demonstrated to lead to significant improvement of battery life while retaining a low latency.
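The abstract names (1, m) indexing without describing it. As an assumption about the scheme (the abstract itself does not spell this out), the sketch below takes it to mean that the complete index is broadcast m times per cycle, once before each 1/m fraction of the data. The function name and the bucket labels are hypothetical.

```python
def one_m_broadcast(index, data, m):
    """Build one broadcast cycle: the whole index precedes each of m data segments.

    index -- list of index buckets
    data  -- list of data buckets
    m     -- number of times the index is repeated per cycle (m >= 1)
    """
    segment = -(-len(data) // m)  # ceiling division: buckets per data segment
    cycle = []
    for start in range(0, len(data), segment):
        cycle.extend(index)                        # full copy of the index
        cycle.extend(data[start:start + segment])  # next fraction of the data
    return cycle


cycle = one_m_broadcast(["I1", "I2"], ["d1", "d2", "d3", "d4", "d5", "d6"], 3)
print(cycle)
```

Under this reading, a larger m lets a client that tunes in at a random moment reach an index copy sooner, at the price of a longer cycle; that trade-off between latency and listening time is what the abstract's two performance criteria measure.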

  • An improved algorithm for the incremental recomputation of active relational expressions

    Page(s): 508 - 511

    Qian and Wiederhold (1991) presented an algorithm for the incremental recomputation of relational algebra expressions that was claimed to preserve a certain minimality condition. This condition guarantees that the incremental change sets do not contain any unnecessary tuples; so, redundant computations are not performed. We show that, in fact, their algorithm violates this condition. We present an improved algorithm that does preserve this notion of minimality.


Aims & Scope

IEEE Transactions on Knowledge and Data Engineering (TKDE) informs researchers, developers, managers, strategic planners, users, and others interested in state-of-the-art and state-of-the-practice activities in the knowledge and data engineering area.

Meet Our Editors

Editor-in-Chief
Jian Pei
Simon Fraser University