IEEE Transactions on Knowledge and Data Engineering

Issue 6 • December 1994

  • An improved algorithm for implication testing involving arithmetic inequalities

    Page(s): 997 - 1001

    Implication testing of arithmetic inequalities is widely used in many areas of database systems and has received extensive research attention. Klug (1988) and Ullman (1989) proposed an algorithm that determines whether S implies T, where S and T consist of inequalities of the form (X op Y), X and Y are variables, and op ∈ {=, <, ≤, ≠, >, ≥}. The complexity of the algorithm is O(n^3), where n is the number of inequalities in S. We reduce the problem to matrix multiplication, thus improving the time bound to O(n^2.376). We also demonstrate an O(n^2) algorithm for the case where the number of inequalities in T is bounded by O(n). Since matrix multiplication has been well studied, our reduction also opens the possibility of directly adopting many practical results for managing matrices and their operations, such as parallel computation and efficient representation of sparse matrices.
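
    To illustrate the flavor of the reduction (a minimal sketch, not the authors' actual construction), the following Python fragment tests whether a set S of "X ≤ Y" constraints implies a target "A ≤ B" by computing a transitive closure through repeated boolean matrix squaring; the full algorithm also handles =, <, and ≠, and uses O(n^2.376) fast matrix multiplication.

        # Illustrative sketch: closure of "<=" constraints via matrix squaring.
        import numpy as np

        def implies(S, target, variables):
            idx = {v: i for i, v in enumerate(variables)}
            m = len(variables)
            M = np.eye(m, dtype=bool)                # X <= X holds trivially
            for x, y in S:                           # each pair asserts x <= y
                M[idx[x], idx[y]] = True
            for _ in range(max(1, m.bit_length())):  # log2(m) squarings suffice
                M = (M.astype(int) @ M.astype(int)) > 0
            a, b = target
            return bool(M[idx[a], idx[b]])

        # implies([("x", "y"), ("y", "z")], ("x", "z"), ["x", "y", "z"]) -> True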

  • A methodology for integration of heterogeneous databases

    Page(s): 920 - 933

    The transformation of existing local databases to meet diverse application needs at the global level is performed through a four-layered procedure that stresses total schema integration and virtual integration of the local databases. The proposed methodology covers both schema integration and database integration, and uses a four-layered schema architecture (local schemata, local object schemata, global schema, and global view schemata), with each layer presenting an integrated view of the concepts that characterize the layer below. Mechanisms for accomplishing this objective are presented in theoretical terms, along with a running example. Object equivalence classes, property equivalence classes, and other related concepts are discussed in the context of logical integration of heterogeneous schemata, while object instance equivalence classes, property instance equivalence classes, and other related concepts are discussed for data integration purposes. The proposed methodology resolves naming conflicts, scaling conflicts, type conflicts, level-of-abstraction conflicts, and other types of conflicts during schema integration, and resolves data inconsistencies during data integration.
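
    As a minimal illustration of how object equivalence classes might be assembled (the schema and object names below are invented), a union-find structure can group local schema objects asserted to denote the same real-world concept:

        # Illustrative sketch: equivalence classes of schema objects via union-find.
        class UnionFind:
            def __init__(self):
                self.parent = {}

            def find(self, x):
                self.parent.setdefault(x, x)
                while self.parent[x] != x:
                    self.parent[x] = self.parent[self.parent[x]]  # path halving
                    x = self.parent[x]
                return x

            def union(self, a, b):
                self.parent[self.find(a)] = self.find(b)

        uf = UnionFind()
        uf.union(("payroll", "Employee"), ("personnel", "Staff"))  # same concept
        same = uf.find(("payroll", "Employee")) == uf.find(("personnel", "Staff"))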

  • Sort vs. hash revisited

    Page(s): 934 - 944

    Efficient algorithms for processing large volumes of data are very important for both relational and new object-oriented database systems. Many query-processing operations can be implemented using sort- or hash-based algorithms, e.g., intersection, join, and duplicate elimination. In the early relational database systems, only sort-based algorithms were employed. In the last decade, hash-based algorithms have gained acceptance and popularity, and are often considered generally superior to sort-based algorithms such as merge-join. In this article, we compare the concepts behind sort- and hash-based query-processing algorithms and conclude that (1) many dualities exist between the two types of algorithms, (2) their costs differ mostly by percentages rather than by factors, (3) several special cases exist that favor one or the other choice, and (4) there is a strong reason why both hash- and sort-based algorithms should be available in a query-processing system. Our conclusions are supported by experiments performed using the Volcano query execution engine.
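
    The duality can be seen in how the two standard equi-join algorithms locate matching tuples: sorting brings matches together, while hashing gathers them in buckets. A minimal sketch (illustrative, not Volcano's implementation); both functions produce the same pairs:

        # Illustrative sketch: merge join (sort-based) vs. hash join (hash-based).
        from collections import defaultdict

        def merge_join(R, S, key):
            R, S = sorted(R, key=key), sorted(S, key=key)
            out, i, j = [], 0, 0
            while i < len(R) and j < len(S):
                if key(R[i]) < key(S[j]):
                    i += 1
                elif key(R[i]) > key(S[j]):
                    j += 1
                else:
                    jj = j                       # scan the run of equal keys in S
                    while jj < len(S) and key(S[jj]) == key(R[i]):
                        out.append((R[i], S[jj]))
                        jj += 1
                    i += 1
            return out

        def hash_join(R, S, key):
            table = defaultdict(list)
            for r in R:                          # build phase on one input
                table[key(r)].append(r)
            return [(r, s) for s in S for r in table[key(s)]]  # probe phase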

  • Performance evaluation of rule grouping on a real-time expert system architecture

    Page(s): 883 - 891

    Uses a Markov process to model a real-time expert system architecture characterized by message passing and event-driven scheduling. The model is applied to the performance evaluation of rule grouping for real-time expert systems running on this architecture. An optimizing algorithm based on Kernighan-Lin heuristic graph partitioning is developed for the real-time architecture, and a demonstration system based on the model and algorithm has been built and tested on portions of the advanced GPS receiver (AGR) and manned maneuvering unit (MMU) knowledge bases.
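
    A single improvement pass in the Kernighan-Lin style might look like the greedy sketch below, where w is an invented map of communication weights between rules and the goal is to reduce cross-group message traffic (the paper's algorithm is adapted to the real-time architecture and is more refined):

        # Illustrative sketch: greedy KL-style swaps between two rule groups.
        def cut_cost(A, B, w):
            return sum(w.get((a, b), 0) + w.get((b, a), 0) for a in A for b in B)

        def kl_pass(A, B, w):
            A, B = set(A), set(B)
            improved = True
            while improved:
                improved = False
                for a in list(A):
                    for b in list(B):
                        A2, B2 = (A - {a}) | {b}, (B - {b}) | {a}
                        if cut_cost(A2, B2, w) < cut_cost(A, B, w):
                            A, B, improved = A2, B2, True   # accept the swap
            return A, B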

  • Optimal allocation for partially replicated database systems on ring networks

    Page(s): 975 - 982

    Considers a distributed database with partial replication of data objects, located on a ring network. Certain placements of the replicated objects are shown to optimize the probability of read-only success and the probability of write-only success. We also obtain optimal placements for k-terminal reliability and for the expected minimal path length of read-only and write-only operations.
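
    One of the quality measures mentioned, the expected minimal path length for reads, is easy to state concretely; the sketch below (with invented example parameters, not the paper's analysis) shows why evenly spread replicas beat clustered ones on a ring:

        # Illustrative sketch: expected ring distance to the nearest replica.
        def ring_dist(i, j, m):
            d = abs(i - j)
            return min(d, m - d)             # go either way around the ring

        def expected_read_distance(m, replicas):
            return sum(min(ring_dist(i, r, m) for r in replicas)
                       for i in range(m)) / m

        # On an 8-node ring:
        # expected_read_distance(8, [0, 4]) -> 1.0   (spread placement)
        # expected_read_distance(8, [0, 1]) -> 1.5   (clustered placement)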

  • DBMS support for nonmetric measurement systems

    Page(s): 945 - 953

    In commercial DBMSs, it is possible to model only primitive numeric data types and to perform arithmetic between them. In practice, however, organizations need to store and manipulate more complex numeric elements that represent quantities in a nonmetric measurement system, such as one in which a distance is expressed in feet and inches, or one in which a weight is expressed in quarters, stones, pounds, and ounces. Users therefore have to choose between two options: either abandon the nonmetric system and completely adapt their applications to the limited capabilities of the DBMS, or write their own code for the management of such complex numeric data types. The first approach is unacceptable in principle, and it also loses precision in arithmetic operations, because quantities have to be expressed as real numbers. The second is tedious, because distinct pieces of code have to be written to handle different nonmetric units; furthermore, integrity checking for these data has to be performed by application programs rather than by the DBMS. To overcome these problems, a new generic data type is proposed, the composite number, whose support automatically enables the use of any nonmetric measurement system. Functions and operations are defined for the management of composites. Because, in practice, time is usually expressed in many distinct nonmetric measurement units whose choice depends on the particular application, temporal databases represent one of the many application areas of the proposed formalization.
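
    The composite-number idea can be sketched as mixed-radix arithmetic: convert to the smallest unit, operate, and convert back. In this illustrative sketch the factor list [14, 16] encodes one particular system (1 stone = 14 pounds, 1 pound = 16 ounces); the proposed generic type would be parameterized by such factors:

        # Illustrative sketch: exact arithmetic on nonmetric (mixed-radix) quantities.
        def to_base(units, factors):
            total = units[0]
            for u, f in zip(units[1:], factors):
                total = total * f + u            # accumulate in the smallest unit
            return total

        def from_base(total, factors):
            units = []
            for f in reversed(factors):
                total, r = divmod(total, f)      # peel off each unit, with carry
                units.append(r)
            return [total] + units[::-1]

        FACTORS = [14, 16]                       # stones / pounds / ounces
        a, b = [1, 10, 12], [0, 5, 9]            # 1 st 10 lb 12 oz + 5 lb 9 oz
        total = to_base(a, FACTORS) + to_base(b, FACTORS)
        # from_base(total, FACTORS) -> [2, 2, 5], i.e. 2 st 2 lb 5 oz, exactly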

  • Implementation of rule-based information systems for integrated manufacturing

    Page(s): 892 - 908

    Focuses on the development of a methodology, within a software environment, for automating the rule-based implementation of specifications of integrated manufacturing information systems. The specifications are initially formulated in natural language and subsequently represented graphically by the system designer. The new graphical representation tool is based on updated Petri nets (UPN), which we have developed as a specialized version of colored Petri nets. The rule-based implementation approach exploits the similarity of features between UPN and the general rule specification language used for the implementation. Automating the translation from UPN to the rule specification language is expected to considerably shorten the design and implementation life cycle of such systems. The application presented deals with the control and management of information flow among the computer-aided design, process planning, manufacturing resource planning, and shop floor control databases; this provides an integrated information framework for computer integrated manufacturing systems.
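
    The core of such a Petri-net-to-rule translation can be pictured as follows: each transition becomes a rule whose condition tests tokens in the transition's input places and whose action moves tokens to its output places. An illustrative sketch with invented place names (the paper's UPN carry richer annotations):

        # Illustrative sketch: a Petri net transition compiled into a rule.
        def transition_to_rule(inputs, outputs):
            def rule(marking):
                if all(marking.get(p, 0) > 0 for p in inputs):   # condition part
                    for p in inputs:
                        marking[p] -= 1                          # consume tokens
                    for p in outputs:
                        marking[p] = marking.get(p, 0) + 1       # produce tokens
                    return True                                  # the rule fired
                return False
            return rule

        release = transition_to_rule(
            inputs=["process_plan_ready", "material_available"],
            outputs=["job_on_shop_floor"])
        marking = {"process_plan_ready": 1, "material_available": 1}
        release(marking)    # fires; one token now sits in "job_on_shop_floor"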

  • New algorithms for parallelizing relational database joins in the presence of data skew

    Page(s): 990 - 997

    Parallel processing is an attractive option for relational database systems. As in any parallel environment, however, load balancing is a critical issue that affects overall performance. For one common database operation in particular, the join of two relations, load balancing under conventional parallel algorithms can be severely hampered by a natural phenomenon known as data skew. In a pair of recent papers (J. Wolf et al., 1993; 1993), we described two new join algorithms designed to address the data skew problem. Here we propose significant improvements to both algorithms, increasing their effectiveness while simultaneously decreasing their execution times. The paper then focuses on the comparative performance of the improved algorithms and their more conventional counterparts. The new algorithms outperform the conventional ones in the presence of almost any skew at all, and dramatically so in cases of high skew.
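
    One common skew-handling idea (a simplification, not the authors' algorithms) is to hash-partition ordinary keys but spread the tuples of any "heavy" key across all sites, replicating the other relation's matching tuples there. An illustrative sketch of the assignment step:

        # Illustrative sketch: skew-aware assignment of join tuples to sites.
        from collections import Counter

        def assign(tuples, key, sites, heavy_threshold):
            freq = Counter(key(t) for t in tuples)
            plan = {s: [] for s in range(sites)}
            rr = 0
            for t in tuples:
                if freq[key(t)] > heavy_threshold:
                    plan[rr % sites].append(t)            # heavy key: round-robin
                    rr += 1
                else:
                    plan[hash(key(t)) % sites].append(t)  # normal key: hash place
            return plan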

  • A fuzzy reasoning database question answering system

    Page(s): 868 - 882

    Describes a question-answering system based on fuzzy logic. The proposed system assesses whether a database contains information pertinent to a subject of interest by evaluating each comment in the database with a fuzzy evaluator that assigns a membership value indicating the comment's relationship to the subject. An assessment is then provided for the database as a whole regarding its pertinence to the subject of interest, and comments considered irrelevant to the subject may be discarded. The system was developed to examine databases created, for bookkeeping purposes, during the development of the IBM 4381 computer systems, and to assess whether those databases contain information pertinent to the functional changes that occurred during the development cycle. The system, however, can be applied with minimal changes to a variety of circumstances, provided that the fundamental assumptions behind the membership functions are respected in the new application. It can even be applied without modification, assuming the same subject of interest, to databases whose characteristics are similar to those of the original database for which the system was developed.
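
    A toy version of such an evaluator might look like the following (the actual membership functions were tuned to the IBM 4381 databases; the keyword-overlap function below is invented for illustration):

        # Illustrative sketch: fuzzy pertinence scores for database comments.
        def membership(comment, subject_terms):
            # subject_terms: a set of lowercase keywords describing the subject
            hits = len(set(comment.lower().split()) & subject_terms)
            return min(1.0, hits / 3.0)          # saturating membership in [0, 1]

        def assess(comments, subject_terms, cutoff=0.3):
            scores = [membership(c, subject_terms) for c in comments]
            pertinent = [c for c, s in zip(comments, scores) if s >= cutoff]
            overall = max(scores, default=0.0)   # database-level assessment
            return overall, pertinent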

  • Learning concepts in parallel based upon the strategy of version space

    Page(s): 857 - 867

    Applies the technique of parallel processing to concept learning. A parallel version-space learning algorithm based upon the principle of divide-and-conquer is proposed. Its time complexity is analyzed to be O(k log₂ n) with n processors, where n is the number of given training instances and k is a coefficient depending on the application domain. For the bounded number of processors available in real situations, a modified parallel learning algorithm is then proposed. Experiments are performed on a real learning problem; the results show that the parallel learning algorithm works and are quite consistent with the theoretical analysis. We conclude that when the number of training instances is large, learning in parallel is worthwhile because of its faster execution.
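
    A heavily simplified sketch of the divide-and-conquer idea, restricted to the S-boundary of a conjunctive version space: each worker generalizes over its share of the positive examples, and the partial hypotheses are then merged (the paper's algorithm maintains both boundaries):

        # Illustrative sketch: parallel computation of the most specific hypothesis.
        from functools import reduce
        from multiprocessing import Pool

        def generalize(a, b):                    # least general generalization
            return tuple(x if x == y else "?" for x, y in zip(a, b))

        def fold(examples):
            return reduce(generalize, examples)

        def parallel_S(positives, workers=4):
            chunks = [positives[i::workers] for i in range(workers)]
            with Pool(workers) as pool:
                partial = pool.map(fold, [c for c in chunks if c])
            return fold(partial)                 # merge the partial hypotheses

        # parallel_S([("red", "round", "small"), ("red", "square", "small")])
        # -> ("red", "?", "small")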

  • PREPARE: a tool for knowledge base verification

    Page(s): 983 - 989

    The knowledge base is the most important component in a knowledge-based system. Because a knowledge base is often built in an incremental, piecemeal fashion, potential errors may be inadvertently brought into it. One of the critical issues in developing reliable knowledge-based systems is how to verify the correctness of a knowledge base. The paper describes an automated tool called PREPARE for detecting potential errors in a knowledge base. PREPARE is based on modeling a knowledge base with a predicate/transition net representation. Inconsistent, redundant, subsumed, circular, and incomplete rules in the knowledge base are then defined as patterns of the predicate/transition net model and are detected through a syntactic pattern recognition method. The research results to date indicate that: the methodology can be adopted in knowledge-based systems where logic is used as the knowledge representation formalism; the tool can be invoked at any stage of the system's development, even without a fully functioning inference engine; and the predicate/transition net model of knowledge bases is easy to implement and provides a clear and understandable display of the knowledge used by the system.
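
    Two of the error patterns are easy to phrase directly on rules of the form (premises, conclusion); an illustrative sketch, independent of the predicate/transition net machinery PREPARE actually uses:

        # Illustrative sketch: detecting circular and subsumed rules.
        def find_circular(rules):
            derives = {}
            for prem, concl in rules:
                for p in prem:
                    derives.setdefault(p, set()).add(concl)

            def reaches(a, b, seen=frozenset()):
                return any(n == b or (n not in seen and reaches(n, b, seen | {n}))
                           for n in derives.get(a, ()))

            return [(p, c) for p, c in rules if reaches(c, c)]

        def find_subsumed(rules):
            # a rule is subsumed by one with the same conclusion, fewer premises
            return [(p, c) for p, c in rules for q, d in rules
                    if (p, c) != (q, d) and d == c and set(q) < set(p)]

        R = [(["a"], "b"), (["b"], "a"),         # mutually circular pair
             (["x", "y"], "z"), (["x"], "z")]    # first of these is subsumed
        # find_circular(R) -> the first two rules
        # find_subsumed(R) -> [(["x", "y"], "z")]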

  • Temporal specialization and generalization

    Page(s): 954 - 974

    A standard relation has two dimensions: attributes and tuples. A temporal relation contains two additional, orthogonal time dimensions: valid time, which records when facts are true in the modeled reality, and transaction time, which records when facts are stored in the temporal relation. Although there are no restrictions between the valid time and transaction time associated with each fact, in many practical applications the valid and transaction times exhibit restricted interrelationships that define several types of specialized temporal relations. This paper examines areas where such specialized temporal relations arise. In application systems with multiple, interconnected temporal relations, multiple time dimensions may be associated with facts as they flow from one temporal relation to another. The paper investigates several aspects of the resulting generalized temporal relations, including the ability to query a predecessor relation from a successor relation. The presented framework for generalization and specialization allows one to precisely characterize and compare temporal relations and the application systems in which they are embedded. The framework's comprehensiveness and its use in understanding temporal relations are demonstrated by placing previously proposed temporal data models within the framework. The practical relevance of the defined specializations and generalizations is illustrated by sample realistic applications in which they occur. The additional semantics of specialized relations are especially useful for improving the performance of query processing.
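
    For instance, one specialization discussed in the temporal database literature is the retroactive relation, in which every fact becomes valid no later than it is recorded. An illustrative sketch of the constraint check (the tuple layout is invented):

        # Illustrative sketch: checking the "retroactive" specialization,
        # i.e. valid time <= transaction time for every tuple.
        def is_retroactive(relation):
            return all(t["valid"] <= t["transaction"] for t in relation)

        payroll = [
            {"fact": "raise for Ann", "valid": 10, "transaction": 12},
            {"fact": "raise for Bob", "valid": 15, "transaction": 15},
        ]
        # is_retroactive(payroll) -> True; a query processor may exploit such
        # constraints, as the abstract's last sentence suggests.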

  • ConClass: a framework for real-time distributed knowledge-based processing

    Page(s): 909 - 919

    We have developed a problem-solving framework, called ConClass, that is capable of classifying continuous real-time problems dynamically and concurrently on a distributed system. ConClass provides an efficient development environment for describing and decomposing a classification problem and for synthesizing solutions. In ConClass, the decomposed concurrent subproblems specified by the application developer correspond directly to the actual distributed hardware elements. This scheme is useful for designing and implementing efficient distributed processing, and it makes system behavior easier to anticipate and evaluate. ConClass provides an object replication feature that prevents any particular object from being overloaded. To deal with an indeterminate amount of problem data, ConClass dynamically creates object networks that justify hypothesized solutions, thereby achieving dynamic load distribution. A number of efficient execution mechanisms that manage various asynchronous aspects of distributed processing have been implemented without using schedulers or synchronization schemes that are liable to become bottlenecks. We have confirmed the efficiency of ConClass's parallel distributed processing and load balancing with an experimental application.
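
    The object replication idea might be sketched as follows (the threshold and dispatch policy are invented for illustration; ConClass's actual mechanisms are more elaborate):

        # Illustrative sketch: route to the least-loaded replica, clone on saturation.
        class ReplicatedObject:
            def __init__(self, handler, capacity=100):
                self.handler, self.capacity = handler, capacity
                self.loads = [0]                 # start with a single replica

            def dispatch(self, msg):
                i = min(range(len(self.loads)), key=self.loads.__getitem__)
                if self.loads[i] >= self.capacity:
                    self.loads.append(0)         # replicate on overload
                    i = len(self.loads) - 1
                self.loads[i] += 1               # account the request
                return self.handler(msg)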


Aims & Scope

IEEE Transactions on Knowledge and Data Engineering (TKDE) informs researchers, developers, managers, strategic planners, users, and others interested in state-of-the-art and state-of-the-practice activities in the knowledge and data engineering area.

Meet Our Editors

Editor-in-Chief
Jian Pei
Simon Fraser University