
Computer and Information Technology Workshops, 2008. CIT Workshops 2008. IEEE 8th International Conference on

Date 8-11 July 2008


Displaying Results 1 - 25 of 126
  • [Front cover]

    Page(s): C1
    Freely Available from IEEE
  • [Title page i]

    Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Page(s): v - xiii
    Freely Available from IEEE
  • Message from the Chairs

    Page(s): xiv
    Freely Available from IEEE
  • Committee Organization

    Page(s): xv - xx
    Freely Available from IEEE
  • Website Data Storage Management during Data Lifecycle Taking into Account of Time Effect

    Page(s): 3 - 7

    The data lifecycle is the process from the data's creation to its disappearance, and data management runs through the whole lifecycle. Data management aims to provide data that are complete, accurate, and timely for users' data retrieval. At each stage of the data's lifecycle, different demands are placed on data management, among which data storage management is the most important. There are several different strategies and models for storage management, and choosing the best scheme is also fundamental. In fact, it is always hoped that the most recent data will be stored on the most accessible device. From the perspective of how the administrator provides effective data for users' retrieval, this thesis discusses storage management based on the data time effect within the data's lifecycle. The main point, however, is to find a feasible way for a Website to divide data according to its time effect and to decide which storage strategies should be adopted for each division; finally, some suggestions and a solution are proposed.

  • A Strategy for Attributes Selection in Cost-Sensitive Decision Trees Induction

    Page(s): 8 - 13

    Decision tree learning is one of the most widely used and practical methods for inductive inference. A fundamental issue in decision tree inductive learning is the attribute selection measure at each non-terminal node of the tree. However, the existing literature has not adequately taken both classification ability and cost sensitivity into account. In this paper, we present a new strategy for attribute selection, a trade-off between attribute information and cost-sensitive learning (including misclassification costs and test costs with different units), for selecting splitting attributes in cost-sensitive decision tree induction. The experimental results show that our method outperforms existing methods, such as the information gain method and total cost methods, in terms of reducing misclassification costs under different missing rates and various costs on UCI datasets.

  • Identification of Chinese Event and Their Argument Roles

    Page(s): 14 - 19

    Event detection and recognition is a major task in the ACE evaluation plan. In this paper, we focus on solving two subtasks: (1) event detection and classification, and (2) identification of their argument roles. For the first subtask, a strategy of local feature selection and explicit discrimination of positive and negative features is used to ensure the performance of each type. For the second subtask, an approach based on multi-level patterns is presented to improve the coverage of patterns and to use various kinds of language information. Experiments on the ACE2005 corpus show that performance on the first subtask is satisfactory, with an 83.5% macro-average F1-measure. Experiments on the second subtask show that the method based on multi-level patterns is very promising.

  • High Indexing Compression for Spatial Databases

    Page(s): 20 - 25

    The KDB-tree is a traditional point access method for retrieving multidimensional data. The literature frequently cites low storage utilization and insufficient retrieval performance as two bottlenecks for the KDB-tree family of structures. A large number of unnecessary splits caused by data insertion order and data skewness is the main cause of these two bottlenecks. Compressing KDB-trees still has high appeal for practical applications. In this paper, dynamic-tuning partition (DT-partition) and leaf replication (l-replication) methods are proposed to mitigate the effects of data insertion order and data skewness. Without loss of data selectivity, a better dynamic indexing scheme is presented that accommodates as many data items in leaf nodes as possible. Moreover, the degradation of retrieval performance in heavily skewed spaces is carefully investigated and solved. Analytical and experimental results show that our indexing method outperforms the traditional methods.

  • Association Rule Algorithms for Logical Equality Relationships

    Page(s): 26 - 30

    The association rule has become one of the most important techniques in data mining. New algorithms must be developed in order to apply it to more areas. This paper proposes association rule algorithms for logical equality relationships, modified from the original Apriori and FP-Growth algorithms. Logical equality is defined as true→true (1→1) or false→false (0→0) associations. This special relationship commonly occurs in the real world, such as the linkage in the stock markets and customer loyalty for a certain product.
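
As a rough illustration of the logical-equality idea in the abstract above (not the paper's modified Apriori or FP-Growth algorithms), the sketch below measures how often two binary items agree, that is, both 1 or both 0, across a set of records. The function name `equality_support` and the toy records are assumptions made for this example only.

```python
from itertools import combinations


def equality_support(records, a, b):
    """Fraction of records in which items a and b have the same truth value.

    A record counts when both items are 1 (true and true) or both are 0
    (false and false). This is an illustrative measure, not the paper's.
    """
    agree = sum(1 for r in records if r[a] == r[b])
    return agree / len(records)


# Hypothetical binary records (1 = true, 0 = false).
records = [
    {"x": 1, "y": 1, "z": 0},
    {"x": 0, "y": 0, "z": 1},
    {"x": 1, "y": 1, "z": 1},
    {"x": 0, "y": 1, "z": 0},
]

# Score every pair of items by how often they agree.
for a, b in combinations(["x", "y", "z"], 2):
    print(a, b, equality_support(records, a, b))
```

A mining algorithm would keep only the pairs whose agreement rate exceeds a chosen threshold; the threshold itself is left out here because the abstract does not state one.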

  • The Control of Symmetric Data Interpolation Surface

    Page(s): 31 - 36

    This paper discusses the properties of symmetric data interpolation surfaces. For given data, when the parameters vary, the interpolating surface varies, so the interpolating surface can be modified by selecting suitable parameters while the interpolation data remain unchanged. The problem is how the interpolating surface changes when the two parameters vary. The discussion in this paper focuses on this problem. The trend of variation of the interpolating surface with changes in the two parameters is derived. Based on the conclusions about the two parameters, it is easy to control the shape of the surface. Examples showing the effect of the parameters on the surfaces are given.

  • An Efficient Algorithm for Mining Closed Frequent Itemsets in Data Streams

    Page(s): 37 - 42

    Mining closed frequent itemsets in the sliding window is one of the important topics in data stream mining. In this paper, we propose a novel algorithm, FPCFI-DS, which efficiently mines closed frequent itemsets in the sliding window of data streams and maintains the precise closed frequent itemsets in the current window at any time. The algorithm uses a single-pass, lexicographical-order, FP-Tree-based algorithm with a mixed item ordering policy to mine the closed frequent itemsets in the first window, and introduces a novel updating approach to process the sliding of the window. The experimental results show that FPCFI-DS performs better than the state-of-the-art algorithm Moment in terms of both time and space efficiency, especially for dense datasets or low minimum support.

  • CUZ: An Improved Clustering Algorithm

    Page(s): 43 - 48

    Clustering has for many years been one of the most complex and most studied problems in data mining. Until now, the most commonly used algorithm for finding groups of similar objects in large databases has been CURE. The main advantage of CURE, compared to other clustering algorithms, is its ability to identify non-spherical or rectangular shaped objects. In this paper we present a new algorithm called CUZ (Clustering Using Zones). The main innovation of CUZ lies in the technique that it uses to calculate the representatives. This technique overcomes the problem of identifying clusters with non-convex shapes. Experimental results show that CUZ is a generally competitive technique and that it is particularly well suited to clusters that do not have convex shapes.

  • Robot Task Planning for Mixed-Initiative Human Robot Interaction in Home Service Robot

    Page(s): 49 - 54

    The main motivation for current robotics research is the difficulty of coping with dynamic environments, uncertainty, operational limitations, etc. Human robot interaction (HRI) plays an important role in robotics research, and the human's role in HRI in particular is increasing. In this paper, we built a robot task planning framework for mixed-initiative (MI) HRI in a home service environment. We suggest a task network structure called a joint script, which enables MI HRI, and its planning process based on three procedures (search, selection, and adaptation) in the deliberative layer. For the experiment, we manually created 200 heterogeneous joint scripts, and each module for script-based planning is validated with an appropriate scenario.

  • A Robust Bagging Method Using Median as a Combination Rule

    Page(s): 55 - 60

    Bagging is known to be successful in increasing the prediction accuracy of unstable classifiers. In bagging, predictors are constructed using bootstrap samples from the training set and then aggregated to form a bagged predictor. Robust bagging discards the bootstrapped classifiers that generate extreme error rates, as estimated by the out-of-bag error rate, and combines the remaining ones using a robust location estimator, the median. In this paper we explore the advantages of robust bagging. We carried out experiments on several benchmark data sets, and the results suggest that robust bagging performs quite similarly to standard bagging when applied to unstable base classifiers such as decision trees, but performs better when applied to more stable base classifiers such as Fisher linear discriminant analysis and the nearest mean classifier.

  • Application Research of Dominance Relation Rough Set on Seismology Data

    Page(s): 61 - 65

    In this paper, after discussing classical rough set theory based on the indiscernibility relation and the problems of its reduction algorithms on Seismology Data, we arrive at a new definition of the dominance distinguishing matrix by taking advantage of the unique characteristics of the extended model. The corresponding reduction and rule extraction algorithms are then presented to improve, to some extent, the efficiency of finding the reduction. In fact, we use rough set theory based on the dominance relation to obtain 272 rules with realistic meaning from 44381 items in the small Seismology Data in China.

  • Exploring the Influence of Sampling on Pattern Support Distribution

    Page(s): 66 - 71

    Identifying the pattern support distribution (PSD) in datasets is useful for many data mining tasks, such as market basket analysis. The support of a pattern is the frequency of its occurrence in a dataset. Calculating the distribution of these supports over an entire dataset is computationally expensive; this cost can be reduced by sampling from the dataset and computing the PSD on a relatively small sample. However, this may miscount patterns and cause significant changes in the distribution identified. Based on the fact that the PSD shows a power-law relationship, in this paper we investigate the influence of sampling on the characteristics of the power-law relationship in the pattern support distribution. We consider the effect of sampling on this relationship under two assumptions: uniform distribution of pattern supports, and independent identically distributed (i.i.d.) distributions. We experimentally evaluate the influence on data from four real-world transaction datasets.
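
To make the notion of support in the abstract above concrete, here is a minimal sketch that computes the support of one pattern on a full dataset and on a random sample. It assumes support means relative frequency (the abstract only says "frequency"), and the `support` helper, the toy transactions, and the sample size are all hypothetical choices for illustration.

```python
import random


def support(transactions, pattern):
    """Fraction of transactions that contain every item of `pattern`."""
    pattern = frozenset(pattern)
    hits = sum(1 for t in transactions if pattern <= t)
    return hits / len(transactions)


# Hypothetical market-basket data.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
    frozenset({"bread", "milk", "butter"}),
]

# Exact support on the whole dataset: 2 of the 4 transactions match.
full = support(transactions, {"bread", "milk"})

# Estimated support on a sample drawn without replacement.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = rng.sample(transactions, 2)
estimate = support(sample, {"bread", "milk"})

print(full, estimate)
```

The estimate can differ from the exact value because a small sample may over- or under-represent the transactions that contain the pattern, which is the miscounting effect the abstract refers to.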

  • CRFP: A Novel Adaptive Replacement Policy Combined the LRU and LFU Policies

    Page(s): 72 - 79

    A variety of cache replacement algorithms have been proposed and applied in different situations, among which the LRU (least recently used) and LFU (least frequently used) replacement policies are two of the most popular. However, most real systems do not consider obtaining maximum throughput by switching between the two policies in response to the access pattern. In this paper, we propose a novel adaptive replacement policy that combines the LRU and LFU policies (CRFP); CRFP is self-tuning and can switch between different cache replacement policies adaptively and dynamically in response to changes in the access pattern. Conducting simulations with a variety of file access patterns and a wide range of buffer sizes, we show that CRFP outperforms other algorithms in many cases and performs the best in most of these cases.
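
The two baseline policies named in the abstract above can disagree on which entry to evict for the same access trace. The sketch below shows only plain LRU and plain LFU side by side; it is not CRFP, whose switching logic the abstract does not describe. The class names, the `access` method, and the trace are assumptions for this example.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the key that was accessed least recently."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest access first

    def access(self, key):
        """Touch `key`; return the evicted key, or None if nothing was evicted."""
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return None
        evicted = None
        if len(self.items) >= self.capacity:
            evicted, _ = self.items.popitem(last=False)
        self.items[key] = True
        return evicted


class LFUCache:
    """Evicts the key with the fewest accesses (ties: earliest inserted)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def access(self, key):
        """Touch `key`; return the evicted key, or None if nothing was evicted."""
        if key in self.counts:
            self.counts[key] += 1
            return None
        evicted = None
        if len(self.counts) >= self.capacity:
            evicted = min(self.counts, key=self.counts.get)
            del self.counts[evicted]
        self.counts[key] = 1
        return evicted


trace = ["a", "a", "b", "c"]  # hypothetical access pattern
lru, lfu = LRUCache(2), LFUCache(2)
lru_evictions = [lru.access(k) for k in trace]
lfu_evictions = [lfu.access(k) for k in trace]
print(lru_evictions, lfu_evictions)
```

On this trace LRU evicts `a` (the oldest access) while LFU evicts `b` (the lowest count), which is the kind of divergence an adaptive policy would have to arbitrate.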

  • Finding Dependency Trees from Binary Data

    Page(s): 80 - 85

    Much work has been done on finding interesting subsets of items, since it has broad applications in financial data analysis, e-commerce, text data mining, and so on. Though the well-known frequent pattern mining has attracted much attention in the research community, recently more work has been devoted to the analysis of more sophisticated relationships among items. The Chow-Liu tree and the low-entropy tree, for example, have been used to summarize frequent patterns. In this paper, we consider finding a novel dependency tree from binary data. It has several advantages over previous related work. Firstly, we propose a novel distance measure between items based on information theory, which captures the expected uncertainty in the item pairs and the mutual information between them. Based on this distance measure, we present a simple yet efficient algorithm for finding dependency trees from binary data. We also show how our new approach can find applications in frequent pattern summarization. Our running example on a synthetic dataset shows that our approach achieves good results compared to existing popular heuristics.

  • On Defining Keys for XML

    Page(s): 86 - 91

    Keys are an important part of any data model. In the relational model, keys are well studied and widely used. In recent years, XML has emerged as a widely used data representation and storage format on the World Wide Web. The data-centric approach of XML has necessitated the definition of integrity constraints for XML. The XML key is regarded as one of those integrity constraints. In this paper, we define XML keys, specifically for the purpose of XML data transformation and integration with integrity constraints. In the proposal, we show how XML keys are defined on the XML document type definition (DTD) and are satisfied by XML documents. We introduce the novel concept of the P-tuple (pair-wise tuple), which produces semantically correct tuples in the XML document during key satisfaction. Our definition of XML keys can also be extended to XML relative keys. Finally, we discuss how our proposal for XML keys has some advantages over previous proposals and standards.

  • An HRV Patterns Discovering Neural Network for Mobile Healthcare Services

    Page(s): 92 - 97

    Implantable devices such as pulse, ECG, and movement sensors that can be embedded in day-to-day wearables have drawn a lot of research attention in the field of wireless sensor networks. In this research, we focus on developing an incremental adaptive network, named PHIAN, to detect subjects at risk of coronary heart disease based on long-term Heart Rate Variability (HRV) measurement under blood pressure and breathing frequency using Poincaré plot encoding. The network learns as the environment changes, without destroying the old prototype patterns. The error probability density is taken into account in the training process, which is necessary to avoid regions where inputs have a temporarily high probability density attracting all neural units. PHIAN is evaluated under different settings and compared with previous on-line learning techniques in terms of classification error and network structure. Our proposed method can be applied efficiently in a smart sensor system to alert the health care service provider to intervene in an emergency situation.

  • Robot with Emotion for Triggering Mixed-Initiative Interaction Planning

    Page(s): 98 - 103

    Emotion is one of the channels of human communication, and in this paper, as part of mixed-initiative interaction, emotion is used for triggering user interaction. Three interaction-planning modes, which group together seven interaction types, are introduced. This work shows how the existing framework changes when emotion is added to the robot. For planning, scripts are used to implement the ideas presented, where emotional expressions reduce the number of interactions.

  • Hashed Multiple Lists: A Stream Filter for Processing Continuous Query with Multiple Attributes in Geosensor Networks

    Page(s): 104 - 109

    Much research has addressed processing continuous queries in real time in data stream environments. Continuous queries should be inserted and searched in real time in such environments. Some indexing techniques have been applied to process continuous queries. Previous work cannot support dynamic insertion and deletion operations because the index is built in advance. The performance of insertion and search operations also degrades because of the large number and wide range of intervals inserted. Therefore, to solve these problems, we propose hashed multiple lists to process continuous queries in real time in geosensor networks. The proposed technique shows fast linear insertion and search performance in the performance evaluation. It can be used in location-based services, u-healthcare, data stream management systems, and context-awareness systems.
