Knowledge Discovery and Data Mining, 2008. WKDD 2008. First International Workshop on

Date 23-24 Jan. 2008

  • First International Workshop on Knowledge Discovery and Data Mining - Cover

    Page(s): c1
  • First International Workshop on Knowledge Discovery and Data Mining - Title

    Page(s): i - iii
  • First International Workshop on Knowledge Discovery and Data Mining - Copyright

    Page(s): iv
  • First International Workshop on Knowledge Discovery and Data Mining - TOC

    Page(s): v - xiii
  • Message from the WKDD 2008 Workshop Chairs

    Page(s): xiv
  • Message from the IITSI 2008 Symposium Chairs

    Page(s): xv
  • WKDD 2008 and IITSI 2008 Organizing Committee

    Page(s): xvi
  • WKDD 2008 and IITSI 2008 Committee Members

    Page(s): xvii - xviii
  • Advancing Knowledge Discovery and Data Mining

    Page(s): 3 - 5

    Knowledge discovery and data mining have grown in significance because of the increasing demand for KDD techniques, including those used in machine learning, databases, statistics, knowledge acquisition, data visualization, and high performance computing. Knowledge discovery and data mining can be extremely beneficial for the field of Artificial Intelligence in many areas, such as industry, commerce, government, and education. The paper presents the relation between knowledge and data mining, and the Knowledge Discovery in Databases (KDD) process. Data mining theory, tasks, technology, and challenges are also discussed. This is a brief abstract for an invited talk at the workshop.
  • Knowledge Management in the Ubiquitous Software Development

    Page(s): 6 - 9

    Continuous technical advances have led to the proliferation of very small and very cheap microprocessors equipped with sensors and wireless communication capability. Information processing is becoming ubiquitous and is being embedded in all types of objects. This article sets out general guidelines toward a methodology for assuring the quality of ubiquitous software, based on its main characteristics: user-centered and highly interactive. Moreover, usability is considered the most relevant quality characteristic in the development of this type of highly interactive software system.
  • A Novel Network Intrusion Detection System (NIDS) Based on Signatures Search of Data Mining

    Page(s): 10 - 16

    Network security has been a very important issue since the rise of the Internet. There is an increasing need for security systems against external attacks from hackers. One important type is the intrusion detection system (IDS). There are two major categories of IDS analysis techniques: anomaly detection and misuse detection. Here we focus on misuse detection, which collects attack signatures in a database, as virus protection software does, to detect related attacks. We propose an algorithm that uses known signatures to find the signature of a related attack quickly.
  • Mining High Utility Itemsets in Large High Dimensional Data

    Page(s): 17 - 20

    Existing algorithms for utility mining are inadequate on datasets with high dimensions or long patterns. This paper proposes a hybrid method, composed of a row enumeration algorithm (i.e., inter-transaction) and a column enumeration algorithm (i.e., two-phase), to discover high utility itemsets from two directions: two-phase seeks short high utility itemsets from the bottom, while inter-transaction seeks long high utility itemsets from the top. In addition, an optimization technique is adopted to improve the performance of computing the intersection of transactions. Experiments on synthetic data show that the hybrid method achieves high performance on large high dimensional datasets.
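
    As a rough illustration of what a "high utility itemset" is, the sketch below uses the common textbook definition (utility = purchased quantity times unit profit, summed over the transactions that contain the whole itemset). It is a brute-force enumeration on toy data, not the hybrid two-phase/inter-transaction method of the paper; the data and the names `itemset_utility` and `high_utility_itemsets` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical per-unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration of all itemsets; feasible only for tiny examples."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(transactions, unit_profit, min_utility=15))
```

    The exhaustive loop grows exponentially with the number of items, which is the kind of cost that specialised algorithms for high dimensional data aim to avoid.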
  • Effective Pruning Strategies for Sequential Pattern Mining

    Page(s): 21 - 24

    In this paper, we systematically explore the search space of frequent sequence mining and present two novel pruning strategies, SEP (Sequence Extension Pruning) and IEP (Item Extension Pruning), which can be used in all Apriori-like sequence mining algorithms or lattice-theoretic approaches. With a little more memory overhead, the proposed pruning strategies can prune invalidated search space and effectively decrease the total cost of frequency counting. To test their effectiveness, we optimize SPAM [2] and present the improved algorithm, SPAMSEPIEP, which uses SEP and IEP to prune the search space by sharing the frequent 2-sequences lists. A comprehensive set of performance experiments shows that SPAMSEPIEP outperforms SPAM by a factor of 10 on small datasets and by 30% to 50% on reasonably large datasets.
  • Cooperation Forensic Computing Research

    Page(s): 25 - 30

    Network forensic computing faces the problem of analyzing complex network intrusions, so a new concept of cooperation forensic computing is defined. By extending the theory of function dependency, a new method called probability function dependency relationships is proposed. Combining it with the Bayesian network and the K2 algorithm, a network forensic computing algorithm called CFA is proposed. For complex network attacks, CFA is able to synthesize the various forensic data sources to reconstruct the crime scenario intuitively and carry out network forensic analysis effectively.
  • Grasping Related Words of Unknown Word for Automatic Extension of Lexical Dictionary

    Page(s): 31 - 35

    The aim of this research is to grasp the related words of an unknown word. Several lexical dictionaries, such as WordNet and FrameNet, have been developed for semantic retrieval. However, new words are created every day because of new trends, new paradigms, new technology, etc., and it is impossible to contain all of these new words. Existing methods for grasping the meaning of an unknown word are limited in that they are not exact. To address this limitation, we studied how to establish relations between known words and an unknown word. As a result, we found a novel method using co-occurrence, WordNet, and Bayesian probability. The method can find which words are related to an unknown word and how strongly each of them is related to it.
  • A Novel Website Structure Optimization Model for More Effective Web Navigation

    Page(s): 36 - 41

    A novel website structure optimization model for more effective web navigation is proposed. First, web page groups with low access efficiency are discovered by their support and topology average distance. Then a measure, website topology interest, which indicates overall website access efficiency, is proposed as a guidance rule to optimize the website hyperlink structure. Finally, users' navigation is facilitated by optimizing the website linkage structure so that it reduces the number of steps needed to locate their target web pages. Experimental results on a distance education website show that our approach is efficient and practical for adaptive websites.
  • Average Fuzzy Direction Based Handwritten Chinese Characters Recognition Approach

    Page(s): 42 - 47

    Since the style of handwritten Chinese characters is uncertain and differs from person to person, this article puts forward a new weight-based Average Fuzzy Direction Code to overcome the insufficient generalization capability of traditional fuzzy arithmetic. At the same time, it improves association rules in data mining and applies them to generalizing handwritten Chinese characters and extracting their abstract attributes. Exact representation of handwritten Chinese characters is thus achieved; the approach extracts the characters' features, mines improved association rules, and thereby achieves quick recognition of handwritten Chinese characters. It therefore resolves the poor adaptability of traditional pattern recognition.
  • The Data Mining Technology Based on CIMS and its Application on Automotive Remanufacturing

    Page(s): 48 - 52

    With the development of computer technology, data mining has been widely used in various fields. This paper describes CIMSMINER, which combines data mining with CIMS (computer integrated manufacturing system), and introduces its objectives, model, physical architecture, and methods. Considering the characteristics of remanufacturing of automotive products in China, CIMSMINER is used to gather the information together and obtain data mining results that help improve products. The application in automotive remanufacturing is a reform that makes the automotive product information chain not only an information carrier but also an information miner. Currently, the government strongly emphasizes energy saving and emission reduction, which are closely related to the sustainable development of China. CIMSMINER is an effective tool to support the implementation of this policy.
  • An Empirical Study on Improving the Manufacturing Informatization Index System of China

    Page(s): 53 - 58

    The manufacturing informatization index system (MIIS) is an indispensable tool to measure the informatization level of the Chinese manufacturing industry and to evaluate the implementation effect of the manufacturing informatization engineering (MIE) conducted by the government of China. Thus far, the constructs of MIIS have not been validated. This study fills this void by employing structural equation modeling (SEM) to test the MIIS model. The samples in this study come from the standard database of Chinese manufacturing informatization established by the data survey of MIE during the "Tenth Five-Year Plan" period, including 12,896 enterprise samples from 11 manufacturing industries and 3,472 support samples from 29 provinces of China. Based on the results of the SEM analysis, some indexes of MIIS are adjusted and an improved MIIS is obtained. This empirical study proves that combining SEM technology and standard data resources would be an ideal method to improve MIIS.
  • Centrality Research on the Traditional Chinese Medicine Network

    Page(s): 59 - 62

    To address the complex data in traditional Chinese medicine, this paper proposes a new approach that mines complex relations to uncover potential information among different medicine objects. We turned the traditional Chinese medicine knowledge network into a graph using information from an ontology, then applied a centrality algorithm to analyze and process this graph, and finally mined valuable medicine knowledge. In the verification test, the algorithm showed very good practicability.
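
    The abstract does not say which centrality measure was used. Purely as an illustration, the sketch below computes normalized degree centrality, one of the simplest measures, on a small undirected graph. The edge list, the node names, and the function `degree_centrality` are hypothetical and are not taken from the paper; the sketch also assumes the graph has no self-loops.

```python
from collections import defaultdict

# Hypothetical edges between medicine objects, for illustration only.
edges = [
    ("herb_A", "symptom_X"),
    ("herb_A", "symptom_Y"),
    ("herb_B", "symptom_X"),
    ("herb_A", "formula_1"),
]


def degree_centrality(edges):
    """Return degree / (n - 1) for every node of an undirected graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


if __name__ == "__main__":
    scores = degree_centrality(edges)
    # Print nodes from most to least central.
    for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(node, round(score, 2))
```

    Nodes with the highest scores are the ones connected to the most other objects, which is one simple way to rank candidates for closer inspection.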
  • A New Method Multi-Factor Trend Regression and its Application to Economy Forecast in Jiangxi

    Page(s): 63 - 67

    The principle of a new method called trend regression is introduced and applied to the economic forecast of Jiangxi Province. The method improves on previous time series forecasting methods, in which only self-extension is done and multiple factors (variables) are not taken into consideration. It also overcomes the weakness of forecasting by general regression analysis, which relies on simultaneous independent variables. A time series is a function of multiple factors. The values (independent variables) in one period may affect the value (dependent variable) to be predicted in the next period. The nearer the sample time is to the predicted time, the more important the sample is to the predicted value. By shifting the dependent variable to establish models, sequential regression and prediction can be realized. In this way the trend in the information can be mined.
  • The BP Neural Network Optimizing Design Model for Agricultural Information Measurement Based on Multistage Dynamic Fuzzy Evaluation

    Page(s): 68 - 71

    The agricultural information level in China is at an initial stage, so more attention should be paid to its construction; how to measure the agricultural information degree, however, is a major issue. To overcome the shortcomings of the traditional linear evaluation method for the agricultural information degree, this paper proposes a BP neural network evaluation method based on multistage dynamic fuzzy judgment: it takes the multistage dynamic fuzzy judgment as the sampling foundation and uses the BP neural network principle to establish the evaluation model. The method not only exploits the unique advantages of the BP neural network but also overcomes the difficulty of obtaining high-grade training sample data. An evaluation of the agricultural information degree of 10 cities in Jilin province indicates that the method is stable and reliable.
  • A PSO-Based Clustering Algorithm for Manufacturing Cell Design

    Page(s): 72 - 75

    Different metaheuristic methods have been used to solve clustering problems. This paper addresses the problem of manufacturing cell formation using a modified particle swarm optimisation (PSO) algorithm. The main modification to the original PSO algorithm is that this work does not use the vector of velocities as the standard PSO algorithm does. The proposed algorithm uses the concept of proportional likelihood with modifications, a technique used in data mining. Some simulations are presented and compared. The criterion used to group the machines in cells is based on the minimization of inter-cell movements. The computational results show that the PSO algorithm is able to find the optimal solutions on almost all instances.
  • Association Rule Analysis of Spatial Data Mining Based on Matlab-A Case of Ancheng Township in China

    Page(s): 76 - 80

    Although there are many methods for spatial data mining, how to carry out the operation quickly remains a problem worth solving. Matlab has been called all-purpose scratch paper for calculation, and it has a certain advantage in data mining because of its powerful matrix calculation functions. Following the association rule method of data mining, spatial data on land use and slope were first processed and extracted in a Geographic Information System (GIS), and data mining was then performed by a program written in the M-language of Matlab. The study area was Ancheng township in Shandong province, China. The confidence and support between different land use types and slope levels were mined, and the association rules were established. The empirical study shows that the method is feasible and that its conclusions can provide guidance for local land use planning.
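
    To make the support and confidence measures concrete, here is a minimal sketch using the standard association rule definitions for a rule "land use type implies slope level". The paper's program was written in Matlab; this Python version, the toy records, and the function name `rule_metrics` are illustrative assumptions and do not reproduce the study's data or code.

```python
# Hypothetical records: (land use type, slope level) for each spatial unit.
records = [
    ("farmland", "gentle"),
    ("farmland", "gentle"),
    ("farmland", "steep"),
    ("forest", "steep"),
    ("forest", "steep"),
]


def rule_metrics(records, land_use, slope):
    """Return (support, confidence) of the rule land_use -> slope."""
    n = len(records)
    # Units that match both sides of the rule.
    both = sum(1 for lu, s in records if lu == land_use and s == slope)
    # Units that match the antecedent (land use type) only.
    antecedent = sum(1 for lu, _ in records if lu == land_use)
    support = both / n if n else 0.0
    confidence = both / antecedent if antecedent else 0.0
    return support, confidence


if __name__ == "__main__":
    for lu, s in [("farmland", "gentle"), ("forest", "steep")]:
        support, confidence = rule_metrics(records, lu, s)
        print(f"{lu} -> {s}: support={support:.2f}, confidence={confidence:.2f}")
```

    A rule is usually retained only when both values exceed user-chosen thresholds; the thresholds used in the study are not given in the abstract.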
  • An Enhanced ART2 Neural Network for Clustering Analysis

    Page(s): 81 - 85

    The adaptive resonance theory 2 (ART2) neural network exhibits several properties that can be useful in data mining and that are lacking in most other neural networks. However, ART2 has a deficiency: the categories it clusters are very sensitive to slight changes in training conditions. An improved ART2 with an enhanced triplex matching mechanism, named ETM-ART2, is presented to redress this deficiency. Several test results show that ETM-ART2 performs better than classic ART2 when applied to clustering tasks. It is an effective improved algorithm and can be applied to a wide variety of problems.