
IEEE Transactions on Knowledge and Data Engineering

Issue 2 • Feb. 2009


Results 1 - 17 of 17
  • [Front cover]

    Publication Year: 2009 , Page(s): c1
    PDF (142 KB) | Freely Available from IEEE
  • [Inside front cover]

    Publication Year: 2009 , Page(s): c2
    PDF (78 KB) | Freely Available from IEEE
  • Learning Image-Text Associations

    Publication Year: 2009 , Page(s): 161 - 177
    Cited by:  Papers (2)
    PDF (3344 KB) | HTML

    Web information fusion can be defined as the problem of collating and tracking information related to specific topics on the World Wide Web. Whereas most existing work on Web information fusion has focused on text-based multidocument summarization, this paper concerns the topic of image and text association, a cornerstone of cross-media Web information fusion. Specifically, we present two learning methods for discovering the underlying associations between images and texts based on small training data sets. The first method, based on vague transformation, measures the information similarity between the visual features and the textual features through a set of predefined domain-specific information categories. The second method uses a neural network to learn direct mapping between the visual and textual features by automatically and incrementally summarizing the associated features into a set of information templates. Despite their distinct approaches, our experiments on a terrorist-domain document set show that both methods are capable of learning associations between images and texts from a small training data set.

  • Decompositional Rule Extraction from Support Vector Machines by Active Learning

    Publication Year: 2009 , Page(s): 178 - 191
    Cited by:  Papers (12)
    PDF (3932 KB) | HTML

    Support vector machines (SVMs) are currently state-of-the-art for the classification task and, generally speaking, exhibit good predictive performance due to their ability to model nonlinearities. However, their strength is also their main weakness, as the generated nonlinear models are typically regarded as incomprehensible black-box models. In this paper, we propose a new active learning-based approach (ALBA) to extract comprehensible rules from opaque SVM models. Through rule extraction, some insight is provided into the logic of the SVM model. ALBA extracts rules from the trained SVM model by explicitly making use of key concepts of the SVM: the support vectors, and the observation that these are typically close to the decision boundary. Active learning implies a focus on the apparent problem areas, which for rule induction techniques are the regions close to the SVM decision boundary, where most of the noise is found. By generating extra data close to these support vectors and labeling them with the trained SVM model, rule induction techniques are better able to discover suitable discrimination rules. This performance increase, both in terms of predictive accuracy and comprehensibility, is confirmed in our experiments, in which we apply ALBA to several publicly available data sets.

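The data-generation step at the heart of ALBA, creating extra examples near the support vectors and labeling them with the trained model before handing them to a rule inducer, can be sketched as follows. The linear oracle and support vectors here are toy stand-ins for a real trained SVM, not the paper's implementation:

```python
import random

# Hypothetical stand-ins for a trained SVM: a prediction oracle and its
# support vectors (which lie near the decision boundary x0 + x1 = 1).
def svm_predict(x):
    return 1 if x[0] + x[1] > 1.0 else 0

support_vectors = [(0.4, 0.55), (0.6, 0.45), (0.45, 0.6)]

def generate_alba_data(svs, oracle, n_per_sv=50, sigma=0.1, seed=0):
    """Jitter points around each support vector and label them with the
    trained model, yielding extra noise-free examples near the boundary."""
    rng = random.Random(seed)
    data = []
    for sv in svs:
        for _ in range(n_per_sv):
            x = tuple(v + rng.gauss(0.0, sigma) for v in sv)
            data.append((x, oracle(x)))
    return data

extra = generate_alba_data(support_vectors, svm_predict)
# `extra` would now be passed to any rule induction technique
# (e.g., a decision-tree learner) in place of the original noisy labels.
```

Because every generated label comes from the trained model rather than the noisy training data, the rule inducer sees a clean picture of the boundary region.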
  • Multiclass MTS for Simultaneous Feature Selection and Classification

    Publication Year: 2009 , Page(s): 192 - 205
    Cited by:  Papers (9)  |  Patents (1)
    PDF (3499 KB) | HTML

    The multiclass Mahalanobis-Taguchi system (MMTS), an extension of MTS, is developed for simultaneous multiclass classification and feature selection. In MMTS, the multiclass measurement scale is constructed by establishing an individual Mahalanobis space for each class. To increase the validity of the measurement scale, the Gram-Schmidt process is performed to mutually orthogonalize the features and eliminate multicollinearity. The important features are identified using orthogonal arrays and the signal-to-noise ratio, and are then used to construct a reduced-model measurement scale. The contribution of each important feature to classification is also derived from its effect gain to develop a weighted Mahalanobis distance, which is finally used as the distance metric for MMTS classification. Using the reduced-model measurement scale, an unknown example is classified into the class with the minimum weighted Mahalanobis distance, considering only the important features. To evaluate the effectiveness of MMTS, a numerical experiment is conducted, and the results show that MMTS outperforms other well-known algorithms not only in classification accuracy but also in feature selection efficiency. Finally, a real case concerning gestational diabetes mellitus is studied, and the results indicate the practicality of MMTS in real-world applications.

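The classification step described above, assigning an example to the class with the minimum weighted Mahalanobis distance over that class's own Mahalanobis space, can be sketched as follows. Diagonal covariances and uniform effect-gain weights are simplifying assumptions here, not the paper's full measurement scale:

```python
import math

def mahalanobis_diag(x, mean, var, weights):
    # Weighted Mahalanobis distance with a diagonal covariance,
    # a simplification of the full MMTS measurement scale.
    return math.sqrt(sum(w * (xi - m) ** 2 / v
                         for xi, m, v, w in zip(x, mean, var, weights)))

def fit_class_space(samples):
    """Build one (mean, variance) Mahalanobis space per class."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    var = [sum((s[i] - mean[i]) ** 2 for s in samples) / n or 1e-9
           for i in range(d)]
    return mean, var

def classify(x, spaces, weights):
    # Assign x to the class whose space gives the minimum weighted distance.
    return min(spaces, key=lambda c: mahalanobis_diag(x, *spaces[c], weights))

# Example: two well-separated classes, uniform (assumed) feature weights.
spaces = {"A": fit_class_space([(0, 0), (1, 1), (0, 1), (1, 0)]),
          "B": fit_class_space([(5, 5), (6, 6), (5, 6), (6, 5)])}
```

In MMTS the weights would come from the effect gains of the features retained by the orthogonal-array analysis; uniform weights stand in for them here.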
  • k-Anonymization with Minimal Loss of Information

    Publication Year: 2009 , Page(s): 206 - 219
    Cited by:  Papers (4)
    PDF (437 KB) | HTML

    The technique of k-anonymization allows the release of databases that contain personal information while ensuring some degree of individual privacy. Anonymization is usually performed by generalizing database entries. We formally study the concept of generalization and propose three information-theoretic measures for capturing the amount of information that is lost during the anonymization process. The proposed measures are more general and more accurate than those proposed by Meyerson and Williams and by Aggarwal et al. We study the problem of achieving k-anonymity with minimal loss of information. We prove that it is NP-hard and study polynomial approximations of the optimal solution. Our first algorithm gives an approximation guarantee of O(ln k) for two of our measures, as well as for the previously studied measures. This improves on the previously best-known O(k)-approximation. While the previous approximation algorithms relied on the graph representation framework, our algorithm relies on a novel hypergraph representation that enables the improvement in the approximation ratio from O(k) to O(ln k). As the running time of the algorithm is O(n^{2k}), we also show how to adapt the algorithm to obtain an O(k)-approximation algorithm that is polynomial in both n and k.

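Generalization-based anonymization can be illustrated with a toy example. The suffix-masking generalization and the masked-cell loss proxy below are illustrative assumptions, not the paper's three information-theoretic measures:

```python
from collections import Counter

def generalize_zip(zip_code, level):
    # Generalize by replacing the last `level` digits with '*'.
    return zip_code[: len(zip_code) - level] + "*" * level

def k_anonymous(rows, k):
    # Every combination of quasi-identifier values must occur >= k times.
    return all(c >= k for c in Counter(map(tuple, rows)).values())

def cell_loss(records, generalized):
    # Crude information-loss proxy: fraction of masked characters.
    total = sum(len(v) for r in records for v in r)
    masked = sum(v.count("*") for r in generalized for v in r)
    return masked / total

records = [("14850",), ("14853",), ("14092",), ("14091",)]
# One level of generalization makes the table 2-anonymous.
generalized = [(generalize_zip(r[0], 1),) for r in records]
```

The optimization problem the paper studies is picking, among all generalizations that achieve k-anonymity, one minimizing such a loss measure.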
  • Cost-Based Predictive Spatiotemporal Join

    Publication Year: 2009 , Page(s): 220 - 233
    PDF (3431 KB) | HTML

    A predictive spatiotemporal join finds all pairs of moving objects satisfying a join condition on future time and space. In this paper, we present CoPST, the first algorithm for such a join using two spatiotemporal indexes. In a predictive spatiotemporal join, the bounding boxes of the outer index are used to perform window searches on the inner index, and these bounding boxes enclose objects with increasing laxity over time. CoPST constructs globally tightened bounding boxes "on the fly" to perform window searches during join processing, significantly reducing overlap and improving the join performance. CoPST adapts gracefully to large-scale databases by dynamically switching between main-memory buffering and disk-based buffering, through a novel probabilistic cost model. Our extensive experiments validate the cost model and show its accuracy for realistic data sets. We also showcase the superiority of CoPST over algorithms adapted from state-of-the-art spatial join algorithms, achieving a speedup of up to an order of magnitude.

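The join condition itself, matching conservatively expanded future bounding boxes, can be sketched as follows. The nested-loop scan and the uniform maximum-speed expansion are simplifications standing in for CoPST's index-based window searches and tightened boxes:

```python
def expand(box, vmax, t):
    # Conservative bounding box at future time t: grow each side by vmax * t,
    # modeling the "increasing laxity over time" of predicted positions.
    x1, y1, x2, y2 = box
    d = vmax * t
    return (x1 - d, y1 - d, x2 + d, y2 + d)

def overlaps(a, b):
    # Standard axis-aligned rectangle intersection test.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def predictive_join(outer, inner, vmax, t):
    """Nested-loop stand-in for the index-based window searches:
    report all (outer, inner) pairs whose future boxes may intersect."""
    return [(i, j) for i, oa in enumerate(outer)
                   for j, ob in enumerate(inner)
                   if overlaps(expand(oa, vmax, t), expand(ob, vmax, t))]
```

The point of CoPST's tightened boxes is precisely to shrink the expanded rectangles this sketch produces, so that far fewer candidate pairs survive the window searches.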
  • BMQ-Processor: A High-Performance Border-Crossing Event Detection Framework for Large-Scale Monitoring Applications

    Publication Year: 2009 , Page(s): 234 - 252
    Cited by:  Papers (1)
    PDF (3361 KB) | HTML

    In this paper, we present BMQ-Processor, a high-performance border-crossing event (BCE) detection framework for large-scale monitoring applications. We first characterize a new query semantics, namely, the border monitoring query (BMQ), which is useful for BCE detection in many monitoring applications. It monitors the values of data streams and reports them only when they cross the borders of its range. We then propose BMQ-Processor to efficiently handle a large number of BMQs over a high volume of data streams. BMQ-Processor processes BMQs in a shared and incremental manner. It is built on a novel stateful query index, achieving a high level of scalability under continuous data updates. It also exploits the locality embedded in data streams to greatly accelerate successive BMQ evaluations. We present data structures and algorithms to support 1D as well as multidimensional BMQs. We show that the semantics of border monitoring can be extended toward more advanced ones, and we build region transition monitoring as a sample case. Lastly, we demonstrate the excellent processing performance and low storage cost of BMQ-Processor through extensive analysis and experiments.

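The border monitoring semantics, reporting a stream value only when it crosses a border of the query range, can be sketched as a minimal stateful monitor (one query, one 1D stream, none of BMQ-Processor's shared index machinery):

```python
class BorderMonitor:
    """Reports a stream value only when it crosses a border of [lo, hi]."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.inside = None  # unknown until the first value arrives

    def update(self, v):
        # A border crossing is a change of the inside/outside state,
        # so the monitor is stateful rather than a stateless range filter.
        inside = self.lo <= v <= self.hi
        crossed = self.inside is not None and inside != self.inside
        self.inside = inside
        return crossed

mon = BorderMonitor(0, 10)
events = [v for v in [5, 6, 12, 11, 3] if mon.update(v)]
```

Here `events` contains only the values at which the stream entered or left the range, which is the behavior that distinguishes a BMQ from an ordinary continuous range query.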
  • Privacy-Preserving Kth Element Score over Vertically Partitioned Data

    Publication Year: 2009 , Page(s): 253 - 258
    Cited by:  Papers (34)
    Multimedia
    PDF (245 KB) | HTML

    Given a large integer data set shared vertically by two parties, we consider the problem of securely computing a score separating the kth and the (k + 1)st elements. We propose a protocol to compute such a score while revealing little additional information. The proposed protocol is implemented using the Fairplay system, and experimental results are reported. We show a real application of this protocol as a component used in the secure processing of top-k queries over vertically partitioned data.

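For reference, the score being computed can be defined without any privacy machinery as follows; pooling both parties' values in the clear like this is exactly what the secure protocol avoids:

```python
def kth_score(a, b, k):
    """Insecure reference version: merge both parties' values and return
    a score separating the kth and (k + 1)st largest elements."""
    merged = sorted(a + b, reverse=True)
    # Any value strictly between merged[k] and merged[k - 1] separates them;
    # the midpoint is one convenient choice.
    return (merged[k - 1] + merged[k]) / 2

# Party A holds [1, 9, 4], party B holds [7, 2]; exactly k = 2 of the
# pooled values lie at or above the returned score.
score = kth_score([1, 9, 4], [7, 2], 2)
```

The paper's contribution is computing this same quantity with a Fairplay circuit so that neither party sees the other's inputs, only the separating score.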
  • Semantic Access to Multichannel M-Services

    Publication Year: 2009 , Page(s): 259 - 272
    PDF (2295 KB) | HTML

    M-services provide mobile users with wireless access to Web services. In this paper, we present a novel infrastructure for supporting M-services in wireless broadcast systems. The proposed infrastructure provides a generic framework for mobile users to look up, access, and execute Web services over wireless broadcast channels. Access efficiency is an important issue in wireless broadcast systems. We discuss different semantics that have an impact on the access efficiency of composite M-services. A multiprocess workflow is proposed for effectively accessing composite M-services from multiple broadcast channels based on these semantics. We also present and compare different broadcast channel organizations for M-services and wireless data. Analytical models are provided for these channel organizations. Practical studies are presented to demonstrate the impact of different semantics and channel organizations on the access efficiency.

  • Online Scheduling Sequential Objects with Periodicity for Dynamic Information Dissemination

    Publication Year: 2009 , Page(s): 273 - 286
    Cited by:  Papers (4)
    PDF (3681 KB) | HTML

    The scalability of data broadcasting has been demonstrated by prior studies based on traditional data management systems, where data objects, mapped to a pair of state and value in the database, are independent, persistent, and static against simple queries. However, many modern information applications disseminate dynamic data objects and process complex queries that retrieve multiple data objects. In particular, the information servers dynamically generate data objects that are dependent and can be combined into a complete response to a complex query. Accordingly, this paper considers the problem of scheduling dynamic broadcast data objects in a clients-providers-servers system from the standpoint of data association, dependency, and dynamics. Since the data broadcast problem is NP-hard, we derive lower and upper bounds on the mean service access time. In light of the theoretical analyses, we further devise a deterministic algorithm with several gain measure functions to approximate the optimal schedule. The experimental results show that the proposed algorithm is able to generate a dynamic broadcast schedule and minimize the mean service access time to the extent of being very close to the theoretical optimum.

  • Storing and Indexing Spatial Data in P2P Systems

    Publication Year: 2009 , Page(s): 287 - 300
    Cited by:  Papers (6)
    PDF (2793 KB) | HTML

    The peer-to-peer (P2P) paradigm has become very popular for storing and sharing information in a totally decentralized manner. At first, research focused on P2P systems that host 1D data. Nowadays, the need for P2P applications with multidimensional data has emerged, motivating research on P2P systems that manage such data. The majority of the proposed techniques are based either on the distribution of centralized indexes or on the reduction of multidimensional data to one dimension. Our goal is to create from scratch a technique that is inherently distributed and also maintains the multidimensionality of data. Our focus is on structured P2P systems that share spatial information. We present SpatialP2P, a totally decentralized indexing and searching framework that is suitable for spatial data. SpatialP2P supports P2P applications in which spatial information of various sizes can be dynamically inserted or deleted, and peers can join or leave. The proposed technique preserves the locality and directionality of the space well.

  • Call for Papers for Special Issue on Domain-Driven Data Mining

    Publication Year: 2009 , Page(s): 301
    PDF (47 KB) | Freely Available from IEEE
  • Call for Papers for Special Issue on Mining Large Uncertain and Probabilistic Databases

    Publication Year: 2009 , Page(s): 302
    PDF (30 KB) | Freely Available from IEEE
  • Call for Papers for Special Issue on Rule Representation, Interchange, and Reasoning in Distributed, Heterogeneous Environments

    Publication Year: 2009 , Page(s): 303
    PDF (40 KB) | Freely Available from IEEE
  • TKDE Information for authors

    Publication Year: 2009 , Page(s): c3
    PDF (78 KB) | Freely Available from IEEE
  • [Back cover]

    Publication Year: 2009 , Page(s): c4
    PDF (142 KB) | Freely Available from IEEE

Aims & Scope

IEEE Transactions on Knowledge and Data Engineering (TKDE) informs researchers, developers, managers, strategic planners, users, and others interested in state-of-the-art and state-of-the-practice activities in the knowledge and data engineering area.


Meet Our Editors

Editor-in-Chief
Jian Pei
Simon Fraser University

Associate Editor-in-Chief
Xuemin Lin
University of New South Wales

Associate Editor-in-Chief
Lei Chen
Hong Kong University of Science and Technology