IEEE Transactions on Knowledge and Data Engineering

Issue 8 • Date Aug. 2004

  • [Front cover]

    Publication Year: 2004, Page(s): c1
  • [Inside front cover]

    Publication Year: 2004, Page(s): c2
  • Influential rule search scheme (IRSS) - a new fuzzy pattern classifier

    Publication Year: 2004, Page(s): 881 - 893
    Cited by: Papers (12)

    Automatic generation of fuzzy rule base and membership functions from an input-output data set, for reliable construction of an adaptive fuzzy inference system, has become an important area of research interest. We propose a new robust, fast acting adaptive fuzzy pattern classification scheme, named influential rule search scheme (IRSS). In IRSS, rules which are most influential in contributing to...
  • Learning functions using randomized genetic code-like transformations: probabilistic properties and experimentations

    Publication Year: 2004, Page(s): 894 - 908

    Inductive learning of nonlinear functions plays an important role in constructing predictive models and classifiers from data. We explore a novel randomized approach to construct linear representations of nonlinear functions proposed elsewhere [H. Kargupta (2001)], [H. Kargupta et al., (2002)]. This approach makes use of randomized codebooks, called the genetic code-like transformations (GCTs) for...
  • Efficient disk-based K-means clustering for relational databases

    Publication Year: 2004, Page(s): 909 - 921
    Cited by: Papers (23) | Patents (1)

    K-means is one of the most popular clustering algorithms. We introduce an efficient disk-based implementation of K-means. The proposed algorithm is designed to work inside a relational database management system. It can cluster large data sets having very high dimensionality. In general, it only requires three scans over the data set. It is optimized to perform heavy disk I/O and its memory requir...
  • Mining constrained gradients in large databases

    Publication Year: 2004, Page(s): 922 - 938
    Cited by: Papers (20) | Patents (1)

    Many data analysis tasks can be viewed as search or mining in a multidimensional space (MDS). In such MDSs, dimensions capture potentially important factors for given applications, and cells represent combinations of values for the factors. To systematically analyze data in MDS, an interesting notion, called "cubegrade" was recently introduced by Imielinski et al. [2002], which focuses on the nota...
  • Privacy: a machine learning view

    Publication Year: 2004, Page(s): 939 - 948
    Cited by: Papers (9)

    The problem of disseminating a data set for machine learning while controlling the disclosure of data source identity is described using a commuting diagram of functions. This formalization is used to present and analyze an optimization problem balancing privacy and data utility requirements. The analysis points to the application of a generalization mechanism for maintaining privacy in view of ma...
  • TopCat: data mining for topic identification in a text corpus

    Publication Year: 2004, Page(s): 949 - 964
    Cited by: Papers (37)

    TopCat (topic categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: Identifying related groups of items. We present a novel method for identi...
  • An efficient algorithm to compute differences between structured documents

    Publication Year: 2004, Page(s): 965 - 979
    Cited by: Papers (9) | Patents (1)

    SGML/XML are having a profound impact on data modeling and processing. We present an efficient algorithm to compute differences between old and new versions of an SGML/XML document. The difference between the two versions can be considered to be an edit script that transforms one document tree into another. The proposed algorithm is based on a hybridization of bottom-up and top-down methods: The m...
  • Multistrategy ensemble learning: reducing error by combining ensemble learning techniques

    Publication Year: 2004, Page(s): 980 - 991
    Cited by: Papers (32)

    Ensemble learning strategies, especially boosting and bagging decision trees, have demonstrated impressive capacities to improve the prediction accuracy of base learning algorithms. Further gains have been demonstrated by strategies that combine simple ensemble formation approaches. We investigate the hypothesis that the improvement in accuracy of multistrategy approaches to ensemble learning is d...
  • Optimizing top-k selection queries over multimedia repositories

    Publication Year: 2004, Page(s): 992 - 1009
    Cited by: Papers (54)

    Repositories of multimedia objects having multiple types of attributes (e.g., image, text) are becoming increasingly common. A query on these attributes will typically request not just a set of objects, as in the traditional relational query model (filtering), but also a grade of match associated with each object, which indicates how well the object matches the selection condition (ranking). Furt...
  • Automatic control of workflow processes using ECA rules

    Publication Year: 2004, Page(s): 1010 - 1023
    Cited by: Papers (52) | Patents (1)

    Recent changes in business environments have created the need for more efficient and effective business process management. A workflow management system is software that assists in defining business processes as well as automatically controlling the execution of the processes. We propose a new approach to the automatic execution of business processes using event-condition-action (ECA) rul...
  • [Advertisement]

    Publication Year: 2004, Page(s): 1024
  • TKDE Information for authors

    Publication Year: 2004, Page(s): c3
  • [Back cover]

    Publication Year: 2004, Page(s): c4

Aims & Scope

IEEE Transactions on Knowledge and Data Engineering (TKDE) informs researchers, developers, managers, strategic planners, users, and others interested in state-of-the-art and state-of-the-practice activities in the knowledge and data engineering area.

Meet Our Editors

Editor-in-Chief
Jian Pei
Simon Fraser University

Associate Editor-in-Chief
Xuemin Lin
University of New South Wales

Associate Editor-in-Chief
Lei Chen
Hong Kong University of Science and Technology