Search Results

You searched for: data mining
74,096 Results returned
  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Granular computing based data mining in the views of rough set and fuzzy set

    Wang, Guoyin ; Jun Hu ; Qinghua Zhang ; Xianquan Liu ; Jiaqing Zhou
    Granular Computing, 2008. GrC 2008. IEEE International Conference on

    DOI: 10.1109/GRC.2008.4664791
    Publication Year: 2008 , Page(s): 67
    Cited by:  Papers (1)

    IEEE Conference Publications

    Usually, data mining is considered as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. In our data-driven data mining model, knowledge is originally existed in data, but just not understandable for human. Data mining is taken as a process of transforming knowledge from data format into some other human understandable format like rule, formula, theorem, etc. In order to keep the knowledge unchanged in a data mining process, the knowledge properties should be kept unchanged during a knowledge transformation process. Many real world data mining tasks are highly constraint-based and domain-oriented. Thus, domain prior knowledge should also be a knowledge source for data mining. The control of a user to a data mining process could also be taken as a kind of dynamic input of the data mining process. Thus, a data mining process is not only mining knowledge from data, but also from human. This is the key idea of Domain-oriented Data-driven Data Mining (3DM). In the view of granular computing (GrC), a data mining process can be considered as the transformation of knowledge in different granularities. Original data is a representation of knowledge in the finest granularity. It is not understandable for human. However, human is sensitive to knowledge in coarser granularities. So, a data mining process could be considered to be a transformation of knowledge from a finer granularity space to a coarser granularity space. The understanding for data mining of 3DM and GrC is consistent to each other. Rough set and fuzzy set are two important computing paradigms of GrC. They are both generalizations of classical set theory for modeling vagueness and uncertainty. Although both of them can be used to address vagueness, they are not rivals. In some real problems, they are even complementary to each other. In this plenary talk, the new understanding for data mining, domain-oriented data-driven data mining (3DM), will be introduced. The relationship of 3DM and GrC, and granular computing based data mining in the views of rough set and fuzzy set will be discussed.

  • Freely Available from IEEE

    Proceedings. Fifth IEEE International Conference on Data Mining


    Data Mining, Fifth IEEE International Conference on

    DOI: 10.1109/ICDM.2005.67
    Publication Year: 2005

    IEEE Conference Publications

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Seventh IEEE International Conference on Data Mining Workshops - Title

    Jagannathan, G. ; Wright, R.N.
    Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on

    DOI: 10.1109/ICDMW.2007.1
    Publication Year: 2007 , Page(s): i - iii

    IEEE Conference Publications

    The following topics are dealt with: data mining in Web 2.0 environment; knowledge-discovery from multimedia data and multimedia applications; mining and management of biological data; data mining in medicine; optimization-based data mining techniques; high performance data mining; mining graphs and complex structures; data mining on uncertain data; data streaming mining and management; spatial and spatio-temporal data mining.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Quality Data for Data Mining and Data Mining for Quality Data: A Constraint Based Approach in XML

    Shahriar, M.S. ; Anam, S.
    Future Generation Communication and Networking Symposia, 2008. FGCNS '08. Second International Conference on

    Volume: 2
    DOI: 10.1109/FGCNS.2008.74
    Publication Year: 2008 , Page(s): 46 - 49
    Cited by:  Papers (2)

    IEEE Conference Publications

    As quality data is important for data mining, reversely data mining is necessary to measure the quality of data. Specifically, in XML, the issue of quality data for mining purposes and also using data mining techniques for quality measures is becoming more necessary as a massive amount of data is being stored and represented over the Web. We propose two important interrelated issues: how quality XML data is useful for data mining in XML and how data mining in XML is used to measure the quality data for XML. When we address both issues, we consider XML constraints because constraints in XML can be used for quality measurement in XML data and also for finding some important patterns and association rules in XML data mining. We note that XML constraints can play an important role for data quality and data mining in XML. We address the theoretical framework rather than solutions. Our research framework is towards the broader task of data mining and data quality for XML data integrations.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Research on data mining models for the internet of things

    Shen Bin ; Liu Yuan ; Wang Xiaoyi
    Image Analysis and Signal Processing (IASP), 2010 International Conference on

    DOI: 10.1109/IASP.2010.5476146
    Publication Year: 2010 , Page(s): 127 - 132
    Cited by:  Papers (3)

    IEEE Conference Publications

    In this paper, we propose four data mining models for the Internet of Things, which are multi-layer data mining model, distributed data mining model, Grid based data mining model and data mining model from multi-technology integration perspective. Among them, multi-layer model includes four layers: (1) data collection layer, (2) data management layer, (3) event processing layer, and (4) data mining service layer. Distributed data mining model can solve problems from depositing data at different sites. Grid based data mining model allows Grid framework to realize the functions of data mining. Data mining model from multi-technology integration perspective describes the corresponding framework for the future Internet. Several key issues in data mining of IoT are also discussed.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Ad-hoc association-rule mining within the data warehouse

    Nestorov, S. ; Jukic, N.
    System Sciences, 2003. Proceedings of the 36th Annual Hawaii International Conference on

    DOI: 10.1109/HICSS.2003.1174605
    Publication Year: 2003
    Cited by:  Papers (3)

    IEEE Conference Publications

    Many organizations often underutilize their existing data warehouses. In this paper, we suggest a way of acquiring more information from corporate data warehouses without the complications and drawbacks of deploying additional software systems. Association-rule mining, which captures co-occurrence patterns within data, has attracted considerable efforts from data warehousing researchers and practitioners alike. Unfortunately, most data mining tools are loosely coupled, at best, with the data warehouse repository. Furthermore, these tools can often find association rules only within the main fact table of the data warehouse (thus ignoring the information-rich dimensions of the star schema) and are not easily applied on non-transaction level data often found in data warehouses. In this paper, we present a new data-mining framework that is tightly integrated with the data warehousing technology. Our framework has several advantages over the use of separate data mining tools. First, the data stays at the data warehouse, and thus the management of security and privacy issues is greatly reduced. Second, we utilize the query processing power of a data warehouse itself, without using a separate data-mining tool. In addition, this framework allows ad-hoc data mining queries over the whole data warehouse, not just over a transformed portion of the data that is required when a standard data-mining tool is used. Finally, this framework also expands the domain of association-rule mining from transaction-level data to aggregated data as well.
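
To illustrate the co-occurrence patterns that association-rule mining captures, here is a minimal Python sketch that counts item pairs and keeps the rules that meet a support and a confidence threshold. It is not the framework described in the abstract, which runs inside the data warehouse itself; the function name `pair_rules`, the threshold values, and the sample baskets are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules over single-item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence: how often rhs appears when lhs appears
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical transaction-level data, invented for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

The sketch only considers rules between two single items; a practical miner would also handle larger itemsets and, as the abstract notes, aggregated rather than purely transaction-level data.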

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Data Mining for Malicious Code Detection and Security Applications

    Thuraisingham, B.
    Intelligence and Security Informatics Conference (EISIC), 2011 European

    DOI: 10.1109/EISIC.2011.80
    Publication Year: 2011 , Page(s): 4 - 5

    IEEE Conference Publications

    Summary form only given. Data mining is the process of posing queries and extracting patterns, often previously unknown from large quantities of data using pattern matching or other reasoning techniques. Data mining has many applications in security including for national security as well as for cyber security. The threats to national security include attacking buildings, destroying critical infrastructures such as power grids and telecommunication systems. Data mining techniques are being investigated to find out who the suspicious people are and who is capable of carrying out terrorist activities. Cyber security is involved with protecting the computer and network systems against corruption due to Trojan horses, worms and viruses. Data mining is also being applied to provide solutions such as intrusion detection and auditing. The first part of the presentation will discuss my joint research with Prof. Latifur Khan and our students at the University of Texas at Dallas on data mining for cyber security applications. For example, anomaly detection techniques could be used to detect unusual patterns and behaviors. Link analysis may be used to trace the viruses to the perpetrators. Classification may be used to group various cyber attacks and then use the profiles to detect an attack when it occurs. Prediction may be used to determine potential future attacks depending in a way on information learned about terrorists through email and phone conversations. Data mining is also being applied for intrusion detection and auditing. Other applications include data mining for malicious code detection such as worm detection and managing firewall policies. This second part of the presentation will discuss the various types of threats to national security and describe data mining techniques for handling such threats. Threats include non real-time threats and real time threats. We need to understand the types of threats and also gather good data to carry out mining and obtain useful results. The challenge is to reduce false positives and false negatives. The third part of the presentation will discuss some of the research challenges. We need some form of real-time data mining, that is, the results have to be generated in real-time, we also need to build models in real-time for real-time intrusion detection. Data mining is also being applied for credit card fraud detection and biometrics related applications. While some progress has been made on topics such as stream data mining, there is still a lot of work to be done here. Another challenge is to mine multimedia data including surveillance video. Finally, we need to maintain the privacy of individuals. Much research has been carried out on privacy preserving data mining. In summary, the presentation will provide an overview of data mining, the various types of threats and then discuss the applications of data mining for malicious code detection and cyber security. Then we will discuss the consequences to privacy.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Bottom-up generalization: a data mining solution to privacy protection

    Ke Wang ; Yu, P.S. ; Chakraborty, S.
    Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on

    DOI: 10.1109/ICDM.2004.10110
    Publication Year: 2004 , Page(s): 249 - 256
    Cited by:  Papers (22)  |  Patents (2)

    IEEE Conference Publications

    The well-known privacy-preserved data mining modifies existing data mining techniques to randomized data. In this paper, we investigate data mining as a technique for masking data, therefore, termed data mining based privacy protection. This approach incorporates partially the requirement of a targeted data mining task into the process of masking data so that essential structure is preserved in the masked data. The idea is simple but novel: we explore the data generalization concept from data mining as a way to hide detailed information, rather than discover trends and patterns. Once the data is masked, standard data mining techniques can be applied without modification. Our work demonstrated another positive use of data mining technology: not only can it discover useful patterns, but also mask private information. We consider the following privacy problem: a data holder wants to release a version of data for building classification models, but wants to protect against linking the released data to an external source for inferring sensitive information. We adapt an iterative bottom-up generalization from data mining to generalize the data. The generalized data remains useful to classification but becomes difficult to link to other sources. The generalization space is specified by a hierarchical structure of generalizations. A key is identifying the best generalization to climb up the hierarchy at each iteration. Enumerating all candidate generalizations is impractical. We present a scalable solution that examines at most one generalization in each iteration for each attribute involved in the linking.
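
The idea of climbing a generalization hierarchy until records become hard to link can be sketched in a few lines of Python. This is a simplified illustration, not the authors' algorithm: the abstract says the paper selects one best generalization per iteration, whereas this sketch lifts every value one level at a time. The hierarchy `PARENT`, the group-size threshold `k`, and the sample values are all hypothetical.

```python
from collections import Counter

# Hypothetical generalization hierarchy: each value maps to its parent.
PARENT = {
    "Engineer": "Professional",
    "Lawyer": "Professional",
    "Dancer": "Artist",
    "Writer": "Artist",
    "Professional": "Any",
    "Artist": "Any",
}


def smallest_group(values):
    """Size of the rarest value, i.e. the easiest group to single out."""
    return min(Counter(values).values())


def generalize_until(values, k):
    """Lift all values one hierarchy level at a time until every
    distinct value is shared by at least k records (assumed criterion)."""
    current = list(values)
    while smallest_group(current) < k:
        lifted = [PARENT.get(v, v) for v in current]
        if lifted == current:  # already at the top of the hierarchy
            break
        current = lifted
    return current


# Invented sample attribute column.
jobs = ["Engineer", "Lawyer", "Dancer", "Writer", "Engineer"]
masked = generalize_until(jobs, k=2)
print(masked)
```

With these sample values, one level of generalization is enough: each remaining category covers at least two records, so no individual row can be singled out by this attribute alone.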

  • Open Access

    Information Security in Big Data: Privacy and Data Mining

    Xu, L. ; Jiang, C. ; Wang, J. ; Yuan, J. ; Ren, Y.
    Access, IEEE

    Volume: 2
    DOI: 10.1109/ACCESS.2014.2362522
    Publication Year: 2014 , Page(s): 1149 - 1176

    IEEE Journals & Magazines

    The growing popularity and development of data mining technologies bring serious threat to the security of individual's sensitive information. An emerging research topic in data mining, known as privacy-preserving data mining (PPDM), has been extensively studied in recent years. The basic idea of PPDM is to modify the data in such a way so as to perform data mining algorithms effectively without compromising the security of sensitive information contained in the data. Current studies of PPDM mainly focus on how to reduce the privacy risk brought by data mining operations, while in fact, unwanted disclosure of sensitive information may also happen in the process of data collecting, data publishing, and information (i.e., the data mining results) delivering. In this paper, we view the privacy issues related to data mining from a wider perspective and investigate various approaches that can help to protect sensitive information. In particular, we identify four different types of users involved in data mining applications, namely, data provider, data collector, data miner, and decision maker. For each type of user, we discuss his privacy concerns and the methods that can be adopted to protect sensitive information. We briefly introduce the basics of related research topics, review state-of-the-art approaches, and present some preliminary thoughts on future research directions. Besides exploring the privacy-preserving approaches for each type of user, we also review the game theoretical approaches, which are proposed for analyzing the interactions among different users in a data mining scenario, each of whom has his own valuation on the sensitive information. By differentiating the responsibilities of different users with respect to security of sensitive information, we would like to provide some useful insights into the study of PPDM.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    CAKE – Classifying, Associating and Knowledge DiscovEry - An Approach for Distributed Data Mining (DDM) Using PArallel Data Mining Agents (PADMAs)

    Khan, D.
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on

    Volume: 3
    DOI: 10.1109/WIIAT.2008.236
    Publication Year: 2008 , Page(s): 596 - 601

    IEEE Conference Publications

    This paper accentuates an approach of implementing distributed data mining (DDM) using multi-agent system (MAS) technology, and proposes a data mining technique of "CAKE" (classifying, associating & knowledge discovery). The architecture is based on centralized parallel data mining agents (PADMAs). Data mining is part of a word, which has been recently introduced known as BI or business intelligence. The need is to derive knowledge out of the abstract data. The process is difficult, complex, time consuming and resource starving. These highlighted problems are addressed in the proposed model. The model architecture is distributed, uses knowledge-driven mining technique and flexible enough to work on any data warehouse, which will help to overcome these problems. Good knowledge of data, meta-data and business domain is required for defining rules for data mining. Taking into consideration that the data and data warehouse has already gone through the necessary processes and ready for data mining.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Data types generalization for data mining algorithms

    Mon-Fong Jiang ; Shian-Shyong Tseng ; Shan-Yi Liao
    Systems, Man, and Cybernetics, 1999. IEEE SMC '99 Conference Proceedings. 1999 IEEE International Conference on

    Volume: 3
    DOI: 10.1109/ICSMC.1999.823352
    Publication Year: 1999 , Page(s): 928 - 933 vol.3
    Cited by:  Patents (2)

    IEEE Conference Publications

    With the increasing use of database applications, mining interesting information from huge databases becomes of great concern and a variety of mining algorithms have been proposed in recent years. As we know, the data processed in data mining may be obtained from many sources in which different data types may be used. However, no algorithm can be applied to all applications due to the difficulty of fitting data types to the algorithm. The selection of an appropriate data mining algorithm is based not only on the goal of the application, but also the data fittability. Therefore, to transform the non-fitting data type into a target one is also important in data mining, but the work is often tedious or complex since a lot of data types exist in the real world. Merging the similar data types of a given selected mining algorithm into a generalized data type seems to be a good approach to reduce the transformation complexity. In this work, the data type fittability problem for six kinds of widely used data mining techniques is discussed and a data type generalization process, including merging and transforming phases is proposed. In the merging phase, the original data types of the data sources to be mined are first merged into the generalized ones. The transforming phase is then used to convert the generalized data types into the target ones for the selected mining algorithm. Using the data type generalization process, the user can select an appropriate mining algorithm just for the goal of the application without considering the data types.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Spatial data cube: provides better support for spatial data mining

    Yuanzhi Zhang ; Xie Kunqing ; Ma Xiujun ; Xu Dan ; Cai Cuo ; Tang Shiwei
    Geoscience and Remote Sensing Symposium, 2005. IGARSS '05. Proceedings. 2005 IEEE International

    Volume: 2
    DOI: 10.1109/IGARSS.2005.1525227
    Publication Year: 2005

    IEEE Conference Publications

    Spatial data mining is a promising technique that deals with extraction of implicit knowledge or other interesting patterns from large amount of spatial data. Though most data mining systems work with data stored in flat files or operational database, it has been recognized that mining in a data warehouse usually result in better information. Because data are usually cleansed before they are stored into data warehouse. Furthermore, data warehouse provides data with different levels of summarization for the clients, which will lead to fruitful data mining. However, current techniques of data warehouse can not handle spatial data well. Both dimensions and measures in the data model of data warehouse are nonspatial data. In this paper, we propose a new data model called spatial data cube for data warehouse. Spatial data cube supports both spatial and nonspatial data. We also introduce how to construct a spatial data cube that can answer queries efficiently by selective materialization. We believe that the spatial data cube can provide better support for spatial data mining.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Applying data mining to the geosciences data

    Feng Bao ; Xu He ; Fengzhi Zhao
    Computer, Mechatronics, Control and Electronic Engineering (CMCE), 2010 International Conference on

    Volume: 5
    DOI: 10.1109/CMCE.2010.5609971
    Publication Year: 2010 , Page(s): 290 - 293

    IEEE Conference Publications

    The article detailedly addresses the features of the petrophysical data, logging data, seismic data and geological data based on the concepts of the data mining. The mining ideas regarding the petrophysical and logging data, seismic data and geological data are made based on their features. The article uses different mining ways to process the corresponding data, and describes the results from the perspective of the functions of data mining. According to the data mining techniques, the petrophysical data are applied to find the relations and forecast reservoirs; the logging data will be employed to evaluate the fuzzy reservoirs and recognize the effective reservoirs in complicated geological conditions; the space mining results of the 3D seismic data; the charts and text mining results of the geological data. The oil and natural gas data mining in the exploration adopts the methods of data analysis and the corresponding mathematical model to process the exploration data, and get the potential information. It has realized the purpose that the data guide exploration and given the concept of data exploration.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Data Mining for Malicious Code Detection and Security Applications

    Thuraisingham, Bhavani
    Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on

    Volume: 2
    DOI: 10.1109/WI-IAT.2009.379
    Publication Year: 2009 , Page(s): 6 - 7

    IET Conference Publications

    Data mining is the process of posing queries and extracting patterns, often previously unknown from large quantities of data using pattern matching or other reasoning techniques. Data mining has many applications in security including for national security as well as for cyber security. The threats to national security include attacking buildings, destroying critical infrastructures such as power grids and telecommunication systems. Data mining techniques are being investigated to find out who the suspicious people are and who is capable of carrying out terrorist activities. Cyber security is involved with protecting the computer and network systems against corruption due to Trojan horses, worms and viruses. Data mining is also being applied to provide solutions such as intrusion detection and auditing. The first part of the presentation will discuss my joint research with Prof. Latifur Khan and our students at the University of Texas at Dallas on data mining for cyber security applications. For example, anomaly detection techniques could be used to detect unusual patterns and behaviors. Link analysis may be used to trace the viruses to the perpetrators. Classification may be used to group various cyber attacks and then use the profiles to detect an attack when it occurs. Prediction may be used to determine potential future attacks depending in a way on information learnt about terrorists through email and phone conversations. Data mining is also being applied for intrusion detection and auditing. Other applications include data mining for malicious code detection such as worm detection and managing firewall policies. This second part of the presentation will discuss the various types of threats to national security and describe data mining techniques for handling such threats. Threats include non real-time threats and real-time threats. We need to understand the types of threats and also gather good data to carry out mining and obtain useful results. The challenge is to reduce false positives and false negatives. The third part of the presentation will discuss some of the research challenges. We need some form of real-time data mining, that is, the results have to be generated in real-time, we also need to build models in real-time for real-time intrusion detection. Data mining is also being applied for credit card fraud detection and biometrics related applications. While some progress has been made on topics such as stream data mining, there is still a lot of work to be done here. Another challenge is to mine multimedia data including surveillance video. Finally, we need to maintain the privacy of individuals. Much research has been carried out on privacy preserving data mining. In summary, the presentation will provide an overview of data mining, the various types of threats and then discuss the applications of data mining for malicious code detection, cyber security and national security. Then we will discuss the consequences to privacy.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    Data Mining for Malicious Code Detection and Security Applications

    Thuraisingham, Bhavani
    Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on

    Volume: 1
    DOI: 10.1109/WI-IAT.2009.372
    Publication Year: 2009 , Page(s): 6 - 7

    IET Conference Publications

    Data mining is the process of posing queries and extracting patterns, often previously unknown from large quantities of data using pattern matching or other reasoning techniques. Data mining has many applications in security including for national security as well as for cyber security. The threats to national security include attacking buildings, destroying critical infrastructures such as power grids and telecommunication systems. Data mining techniques are being investigated to find out who the suspicious people are and who is capable of carrying out terrorist activities. Cyber security is involved with protecting the computer and network systems against corruption due to Trojan horses, worms and viruses. Data mining is also being applied to provide solutions such as intrusion detection and auditing. The first part of the presentation will discuss my joint research with Prof. Latifur Khan and our students at the University of Texas at Dallas on data mining for cyber security applications. For example, anomaly detection techniques could be used to detect unusual patterns and behaviors. Link analysis may be used to trace the viruses to the perpetrators. Classification may be used to group various cyber attacks and then use the profiles to detect an attack when it occurs. Prediction may be used to determine potential future attacks depending in a way on information learnt about terrorists through email and phone conversations. Data mining is also being applied for intrusion detection and auditing. Other applications include data mining for malicious code detection such as worm detection and managing firewall policies. This second part of the presentation will discuss the various types of threats to national security and describe data mining techniques for handling such threats. Threats include non real-time threats and real-time threats. We need to understand the types of threats and also gather good data to carry out mining and obtain useful results. The challenge is to reduce false positives and false negatives. The third part of the presentation will discuss some of the research challenges. We need some form of real-time data mining, that is, the results have to be generated in real-time, we also need to build models in real-time for real-time intrusion detection. Data mining is also being applied for credit card fraud detection and biometrics related applications. While some progress has been made on topics such as stream data mining, there is still a lot of work to be done here. Another challenge is to mine multimedia data including surveillance video. Finally, we need to maintain the privacy of individuals. Much research has been carried out on privacy preserving data mining. In summary, the presentation will provide an overview of data mining, the various types of threats and then discuss the applications of data mining for malicious code detection, cyber security and national security. Then we will discuss the consequences to privacy.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    TDML: A Data Mining Language for Transaction Databases

    Muthukumar, A. ; Nadarajan, R.
    Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on

    Volume: 4
    DOI: 10.1109/FSKD.2007.558
    Publication Year: 2007 , Page(s): 81 - 86

    IEEE Conference Publications

    A desired feature of data mining systems is the ability to support ad hoc and interactive data mining in order to facilitate flexible and effective knowledge discovery. Data mining query languages can be designed to support such a feature. There are data mining query languages like DMQL for mining relational databases. In this paper, we have proposed a new data mining language for mining transaction databases called TDML. This proposed language mines association rule mining and sequential pattern mining. It uses a new bit map processing approach with buffered storage of results. Various types of data mining approaches that are supported like generalized mining, multilevel mining, multidimensional mining, distributed mining, partition mining, incremental mining, online mining, merge mining, transaction reduction, stream mining and targeted itemset mining.

  • Full text access may be available. Click article title to sign in or learn about subscription options.

    A parallel scalable infrastructure for OLAP and data mining

    Goil, S. ; Choudhary, A.
    Database Engineering and Applications, 1999. IDEAS '99. International Symposium Proceedings

    DOI: 10.1109/IDEAS.1999.787266
    Publication Year: 1999 , Page(s): 178 - 186
    Cited by:  Papers (3)  |  Patents (7)

    IEEE Conference Publications

    Decision support systems are important in leveraging information present in data warehouses in businesses like banking, insurance, retail and health care. The multidimensional aspects of a business can be naturally expressed using a multidimensional data model. Data analysis and data mining on these warehouses pose new challenges for traditional database systems. OLAP and data mining operations require summary information on these multidimensional data sets. Query processing for these applications requires different views of data for analysis and effective decision making. Data mining techniques can be applied in conjunction with OLAP for an integrated business solution. As data warehouses grow, parallel processing techniques have been applied to enable the use of larger data sets and reduce the time for analysis, thereby enabling evaluation of many more options for decision making. We address: (1) scalability in multidimensional systems for OLAP and multidimensional analysis; (2) integration of data mining with the OLAP framework; and (3) high performance by using parallel processing for OLAP and data mining. We describe our system PARSIMONY (Parallel and Scalable Infrastructure for Multidimensional Online analytical processing). This platform is used both for OLAP and data mining. Sparsity of data sets is handled by sparse chunks that use a bit-encoded sparse structure for compression. Techniques for effectively using summary information available in data cubes for data mining are presented for mining association rules and decision-tree-based classification. These take advantage of the data organization provided by the multidimensional data model. Performance results for high-dimensional data sets on a distributed memory parallel machine (IBM SP-2) show good speedup and scalability.

    The integrating between web usage mining and data mining techniques

    Nassar, O.A. ; Al Saiyd, N.A.
    Computer Science and Information Technology (CSIT), 2013 5th International Conference on

    DOI: 10.1109/CSIT.2013.6588787
    Publication Year: 2013 , Page(s): 243 - 247

    IEEE Conference Publications

    Clickstream data is one of the most important sources of information on website usage and customer behavior in banks' e-services. A number of web usage mining scenarios are possible, depending on the available information. While simple traffic analysis based on clickstream data may easily be performed to improve e-banking services, banks need data mining techniques to improve these services substantially. The relationships between data mining techniques and Web usage mining are studied. Web structure mining has three types: web usage structure, mining data streams and web content. The integration of Web usage mining and data mining techniques is presented for processes at different stages, including the pattern discovery phases, and bank cases that use analytical mining techniques are introduced. A general framework for fully integrating domain Web usage mining and data mining techniques is presented for processes at different stages. Data mining techniques can be very helpful to banks for better performance, acquiring new customers, fraud detection in real time, providing segment-based products, and analysis of customers' purchase patterns over time.

    Data mining and ware housing

    Bora, S.P.
    Electronics Computer Technology (ICECT), 2011 3rd International Conference on

    Volume: 1
    DOI: 10.1109/ICECTECH.2011.5941548
    Publication Year: 2011 , Page(s): 1 - 5

    IEEE Conference Publications

    Data mining is a combination of database and artificial intelligence technologies. Although the AI field has taken a major dive in the last decade, this newly emerging field has shown that AI can make major contributions to existing fields in computer science. In fact, many experts believe that data mining is the third hottest field in the industry, behind the Internet and data warehousing. Data mining is really just the next step in the process of analyzing data. Instead of answering queries on standard or user-specified relationships, data mining goes a step further by finding meaningful relationships in data: relationships that were not thought to exist, or ones that give a more insightful view of the data. For example, a computer-generated graph may not give the user any insight; however, data mining can find trends in the same data that show the user more precisely what is going on, using trends that the end user would never have thought to query the computer about. Without adding any more data, data mining gives a huge increase in the value added by the database. It allows both technical and non-technical users to get better answers, allowing them to make much more informed decisions and saving their companies millions of dollars. Data mining is a concept that is taking off in the commercial sector as a means of finding useful information in gigabytes of data. While products for the commercial environment are starting to become available, tools for a scientific environment are much rarer (or even non-existent). Yet scientists have long had to search through reams of printouts and rooms full of tapes to find the gems that make up scientific discovery. This paper will explore some of the ad hoc methods generally used for data mining in the scientific community, including such things as scientific visualization, and outline how some of the more recently developed products used in the commercial environment can be adapted to scientific data mining.

    Data structuring and effective retrieval in the mining of web sequential characteristic

    Zenggui Ou
    Electronic and Mechanical Engineering and Information Technology (EMEIT), 2011 International Conference on

    Volume: 7
    DOI: 10.1109/EMEIT.2011.6023787
    Publication Year: 2011 , Page(s): 3551 - 3554

    IEEE Conference Publications

    Web data mining based on sequential characteristics is a mining technology that focuses on the text data and link structure of Web pages and combines sequential characteristics with the mining of Web structure and Web content. A huge amount of data is carried on the Web, and it increases at a geometric rate every day. As time goes by, the usefulness of much of this data steadily declines, and some of it becomes completely useless. How to clean out this useless data, find hidden regularities in large amounts of data, and solve the quality problem of data application has become a research hotspot in Web data mining technology. The information objects on the Web can generally be divided into two categories: structured data and semi-structured data. Those that can be expressed in a database structure are called structured data; those expressed in various forms, with text as the representative case, are called semi-structured data. The most prominent feature of Web data is that it is semi-structured. Such semi-structured data are related to time sequence, and the time effect of the data is also related to time sequence. This article discusses how to use sequential characteristics in the course of Web data mining to carry out a structural transformation of semi-structured data based on the time effect of the data, that is, the structuring of Web data, and thereby solve the problem of retrieval effectiveness.

    Potential use of Artificial Neural Network in Data Mining

    Nirkhi, S.
    Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference on

    Volume: 2
    DOI: 10.1109/ICCAE.2010.5451537
    Publication Year: 2010 , Page(s): 339 - 343
    Cited by:  Papers (6)

    IEEE Conference Publications

    With the enormous amount of data stored in files, databases, and other repositories, it is increasingly important to develop powerful means for the analysis and perhaps interpretation of such data, and for the extraction of interesting knowledge that could help in decision-making. Data mining, also popularly known as Knowledge Discovery in Databases (KDD), refers to the nontrivial extraction of implicit, previously unknown and potentially useful information from data in databases. Thus data mining is the process of automated extraction of hidden, predictive information from large databases. Data mining includes extracting, transforming, and loading transaction data onto the data warehouse system. Neural networks have been successfully applied in a wide range of supervised and unsupervised learning applications. Neural-network methods are not commonly used for data-mining tasks, because they may have a complex structure, long training times, and a representation of results that is not easily understood, and they often produce incomprehensible models. However, neural networks have a high tolerance for noisy data and high accuracy, which makes them preferable in data mining. In this paper, the application of artificial neural networks in data mining techniques is explored, and the key technology and ways to achieve data mining based on neural networks are also examined. Given the current state of the art, neural networks deserve a place in the toolboxes of data-mining specialists.

    The Study of Multidimensional-Data Flow of Fishbone Applied for Data Mining

    Zhang Yun ; Li Weihua ; Chen Yang
    Software Engineering Research, Management and Applications, 2009. SERA '09. 7th ACIS International Conference on

    DOI: 10.1109/SERA.2009.22
    Publication Year: 2009 , Page(s): 86 - 91
    Cited by:  Papers (1)

    IEEE Conference Publications

    Data mining driven fishbone (DMDF), which is a wholly new term, is an enhancement of the abstract concept of the multidimensional-data flow of fishbone, applied to data mining to optimize the process and structure of data mining. An end-to-end DMDF diagram includes complex data flows, different processing components, and improvements to numerous aspects at multiple levels. DMDF provides an integrated platform and a mixed, comprehensive methodology to support the whole life cycle of data mining. Data preprocessing, data classification, association rule mining and prediction are the foundation and linkage of the whole data mining process life cycle. DMDF supports the combination of different mining components from the strategy level and tactical level to the abstract level, and then re-engineers the data mining process into an execution system to realize a reasonable architecture. DMDF is a new direction for the structure of the data mining process.

    Data mining for security applications

    Thuraisingham, B.
    Machine Learning and Applications, 2004. Proceedings. 2004 International Conference on

    DOI: 10.1109/ICMLA.2004.1383486
    Publication Year: 2004 , Page(s): 3 - 4

    IEEE Conference Publications

    Data mining is the process of posing queries and extracting patterns, often previously unknown, from large quantities of data using pattern matching or other reasoning techniques. Cyber security is the area that deals with cyber terrorism. We are hearing that cyber attacks will cost corporations billions of dollars. For example, one could masquerade as a legitimate user and swindle, say, a bank out of billions of dollars. Data mining and web mining may be used to detect and possibly prevent security attacks, including cyber attacks. For example, anomaly detection techniques could be used to detect unusual patterns and behaviors. Link analysis may be used to trace viruses to the perpetrators. Classification may be used to group various cyber attacks and then use the profiles to detect an attack when it occurs. Prediction may be used to determine potential future attacks, depending in a way on information learnt about terrorists through email and phone conversations. Also, for some threats non-real-time data mining may suffice, while for certain other threats, such as network intrusions, we may need real-time data mining. Many researchers are investigating the use of data mining for intrusion detection. While we need some form of real-time data mining, that is, the results have to be generated in real time, we also need to build models in real time. For example, credit card fraud detection is a form of real-time processing. However, here models are built ahead of time. Building models in real time remains a challenge. Data mining can also be used for analyzing web logs as well as audit trails. Based on the results of the data mining tool, one can then determine whether any unauthorized intrusions have occurred and/or whether any unauthorized queries have been posed. There has been much research on data mining for intrusion detection. Data mining may also be applied to biometrics-related applications. Finally, data mining has applications in national security, including detecting and preventing terrorist activities. The presentation will provide an overview of data mining and security threats and then discuss the applications of data mining for cyber security and national security, including intrusion detection and biometrics. Privacy considerations, including a discussion of privacy-preserving data mining, will also be given.

    Task-driven data mining in the formation evaluation field

    Xiongyan Li ; Hongqi Li ; He Xu ; Zhou Jinyu ; Chen Yihan
    Advanced Information Management and Service (IMS), 2010 6th International Conference on

    Publication Year: 2010 , Page(s): 42 - 47

    IEEE Conference Publications

    In the traditional data-driven data mining process, there are huge gaps between efficient algorithms and intelligent tools, and the knowledge obtained by traditional data-driven data mining is often invalid. Meanwhile, each piece of data in the earth science field carries a solid physical meaning. If no corresponding domain knowledge is involved in the mining process, the information explored by data-driven data mining will lack practicality and will not be able to solve problems in the earth science area effectively. Therefore, task-driven data mining is proposed. Additionally, the concepts and principles of task-driven data mining are elaborated with the help of data mining concepts and techniques. It is divided into seven elements: data warehousing, data preprocessing, feature subset selection, modeling, model evaluation, model updating and model release. These constitute a cyclic and iterative process that continues until a predictive model appears that is capable of effectively achieving the objectives. Task-driven data mining is applied to recognizing complex lithologies and the low-resistivity oil layer, and the whole mining process is elaborated. The accuracy rates are more than 90%. Finally, the paper puts forward the understandings, development prospects and key challenges facing task-driven data mining.

    A study on classification techniques in data mining

    Kesavaraj, G. ; Sukumaran, S.
    Computing, Communications and Networking Technologies (ICCCNT), 2013 Fourth International Conference on

    DOI: 10.1109/ICCCNT.2013.6726842
    Publication Year: 2013 , Page(s): 1 - 7

    IEEE Conference Publications

    Data mining is a process of inferring knowledge from huge volumes of data. Data mining has three major components: clustering or classification, association rules, and sequence analysis. By a simple definition, classification/clustering analyzes a set of data and generates a set of grouping rules which can be used to classify future data. Data mining is the process of extracting information from a data set and transforming it into an understandable structure. It is the computational process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The actual data mining task is the automatic or semi-automatic analysis of large quantities of data to extract previously unknown, interesting patterns. Data mining involves six common classes of tasks: anomaly detection, association rule learning, clustering, classification, regression, and summarization. Classification is a major technique in data mining and is widely used in various fields. Classification is a data mining (machine learning) technique used to predict group membership for data instances. In this paper, we present the basic classification techniques, covering several major kinds of classification method, including decision tree induction, Bayesian networks, and the k-nearest neighbor classifier. The goal of this study is to provide a comprehensive review of different classification techniques in data mining.
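
    As an illustration of one of the classification methods named above, here is a minimal k-nearest neighbor sketch. It is not taken from the paper. The function name knn_predict, the Euclidean distance metric, the default k=3 and the sample points are all assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for query by majority vote among its k nearest neighbors.

    train is a list of (features, label) pairs; features and query are
    equal-length sequences of numbers. Distance is Euclidean (an assumption).
    """
    # Sort the training points by distance to the query and keep the closest k.
    neighbors = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    # Count the labels among those neighbors and return the most common one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-class training set, for illustration only.
    points = [
        ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
        ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
    ]
    print(knn_predict(points, (0.2, 0.1)))
    print(knn_predict(points, (5.5, 5.2)))
```

    Choosing an odd k for a two-class problem avoids tied votes. When a tie does occur in this sketch, the label encountered first among the nearest neighbors wins.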
