Data Mining: Concepts, Models, Methods, and Algorithms

Copyright Year: 2011
Author(s): Mehmed Kantardzic
Book Type: Wiley-IEEE Press
Content Type: Books & eBooks
Topics: Communication, Networking & Broadcasting; Computing & Processing

Abstract

Now updated—the systematic introductory guide to modern analysis of large data sets

As data sets continue to grow in size and complexity, there has been an inevitable move towards indirect, automatic, and intelligent data analysis in which the analyst works via more complex and sophisticated software tools. This book reviews state-of-the-art methodologies and techniques for analyzing enormous quantities of raw data in high-dimensional data spaces to extract new information for decision-making.

This Second Edition of Data Mining: Concepts, Models, Methods, and Algorithms discusses data mining principles and then describes representative state-of-the-art methods and algorithms originating from different disciplines such as statistics, machine learning, neural networks, fuzzy logic, and evolutionary computation. Detailed algorithms are provided with the necessary explanations and illustrative examples, along with questions and exercises for practice at the end of each chapter. This new edition features the following new techniques/methodologies:

  • Support Vector Machines (SVM): developed based on statistical learning theory, they have a large potential for applications in predictive data mining

  • Kohonen Maps (Self-Organizing Maps, SOM): one of the most widely applicable neural-network-based methodologies for descriptive data mining and multidimensional data visualization

  • DBSCAN, BIRCH, and distributed DBSCAN clustering algorithms: representatives of an important class of density-based clustering methodologies

  • Bayesian Networks (BN) methodology often used for causality modeling

  • Algorithms for measuring Betweenness and Centrality parameters in graphs, important for applications in mining large social networks

  • CART algorithm and Gini index in building decision trees

  • Bagging & Boosting approaches to ensemble-learning methodologies, with details of AdaBoost algorithm

  • Relief algorithm, one of the core feature selection algorithms inspired by instance-based learning

  • PageRank algorithm for mining and authority ranking of web pages

  • Latent Semantic Analysis (LSA) for text mining and measuring semantic similarities between text-based documents

  • New sections on temporal, spatial, web, text, parallel, and distributed data mining

  • More emphasis on business, privacy, security, and legal aspects of data mining technology

This text offers guidance on how and when to use a particular software tool (with the companion data sets) from among the hundreds offered when faced with a data set to mine. This allows analysts to create and perform their own data mining experiments using their knowledge of the methodologies and techniques provided. The book emphasizes the selection of appropriate methodologies and data analysis software, as well as parameter tuning. These critically important, qualitative decisions can be made only with a deeper understanding of each parameter's meaning and its role in the technique, which this book offers.

This volume is primarily intended as a data-mining textbook for computer science, computer engineering, and computer information systems majors at the graduate level. Senior undergraduates with the appropriate background can also successfully comprehend all topics presented here.

Table of Contents

      Frontmatter

      Wiley-IEEE Press eBook Chapters

      The prelims comprise:

    • Half Title

    • IEEE Press Editorial Board

    • Title

    • Copyright

    • Dedication

    • Contents

    • Preface to the Second Edition

    • Preface to the First Edition

    Data-Mining Concepts

    This chapter contains sections titled:

  • Introduction

  • Data-Mining Roots

  • Data-Mining Process

  • Large Data Sets

  • Data Warehouses for Data Mining

  • Business Aspects of Data Mining: Why a Data-Mining Project Fails

  • Organization of This Book

  • Review Questions and Problems

  • References for Further Study

    Preparing the Data

    This chapter contains sections titled:

  • Representation of Raw Data

  • Characteristics of Raw Data

  • Transformation of Raw Data

  • Missing Data

  • Time-Dependent Data

  • Outlier Analysis

  • Review Questions and Problems

  • References for Further Study

    Data Reduction

    This chapter contains sections titled:

  • Dimensions of Large Data Sets

  • Feature Reduction

  • Relief Algorithm

  • Entropy Measure for Ranking Features

  • PCA

  • Value Reduction

  • Feature Discretization: ChiMerge Technique

  • Case Reduction

  • Review Questions and Problems

  • References for Further Study

    Learning from Data

    This chapter contains sections titled:

  • Learning Machine

  • SLT

  • Types of Learning Methods

  • Common Learning Tasks

  • SVMs

  • kNN: Nearest Neighbor Classifier

  • Model Selection versus Generalization

  • Model Estimation

  • 90% Accuracy: Now What?

  • Review Questions and Problems

  • References for Further Study

    Statistical Methods

    This chapter contains sections titled:

  • Statistical Inference

  • Assessing Differences in Data Sets

  • Bayesian Inference

  • Predictive Regression

  • ANOVA

  • Logistic Regression

  • Log-Linear Models

  • LDA

  • Review Questions and Problems

  • References for Further Study

    Decision Trees and Decision Rules

    This chapter contains sections titled:

  • Decision Trees

  • C4.5 Algorithm: Generating a Decision Tree

  • Unknown Attribute Values

  • Pruning Decision Trees

  • C4.5 Algorithm: Generating Decision Rules

  • CART Algorithm & Gini Index

  • Limitations of Decision Trees and Decision Rules

  • Review Questions and Problems

  • References for Further Study

    Artificial Neural Networks

    This chapter contains sections titled:

  • Model of an Artificial Neuron

  • Architectures of ANNs

  • Learning Process

  • Learning Tasks Using ANNs

  • Multilayer Perceptrons (MLPs)

  • Competitive Networks and Competitive Learning

  • SOMs

  • Review Questions and Problems

  • References for Further Study

    Ensemble Learning

    This chapter contains sections titled:

  • Ensemble-Learning Methodologies

  • Combination Schemes for Multiple Learners

  • Bagging and Boosting

  • AdaBoost

  • Review Questions and Problems

  • References for Further Study

    Cluster Analysis

    This chapter contains sections titled:

  • Clustering Concepts

  • Similarity Measures

  • Agglomerative Hierarchical Clustering

  • Partitional Clustering

  • Incremental Clustering

  • DBSCAN Algorithm

  • BIRCH Algorithm

  • Clustering Validation

  • Review Questions and Problems

  • References for Further Study

    Association Rules

    This chapter contains sections titled:

  • Market-Basket Analysis

  • Algorithm Apriori

  • From Frequent Itemsets to Association Rules

  • Improving the Efficiency of the Apriori Algorithm

  • FP Growth Method

  • Associative-Classification Method

  • Multidimensional Association-Rules Mining

  • Review Questions and Problems

  • References for Further Study

    Web Mining and Text Mining

    This chapter contains sections titled:

  • Web Mining

  • Web Content, Structure, and Usage Mining

  • HITS and LOGSOM Algorithms

  • Mining Path-Traversal Patterns

  • PageRank Algorithm

  • Text Mining

  • Latent Semantic Analysis (LSA)

  • Review Questions and Problems

  • References for Further Study

    Advances in Data Mining

    This chapter contains sections titled:

  • Graph Mining

  • Temporal Data Mining

  • Spatial Data Mining (SDM)

  • Distributed Data Mining (DDM)

  • Correlation Does Not Imply Causality

  • Privacy, Security, and Legal Aspects of Data Mining

  • Review Questions and Problems

  • References for Further Study

    Genetic Algorithms

    This chapter contains sections titled:

  • Fundamentals of GAs

  • Optimization Using GAs

  • A Simple Illustration of a GA

  • Schemata

  • TSP

  • Machine Learning Using GAs

  • GAs for Clustering

  • Review Questions and Problems

  • References for Further Study

    Fuzzy Sets and Fuzzy Logic

    This chapter contains sections titled:

  • Fuzzy Sets

  • Fuzzy-Set Operations

  • Extension Principle and Fuzzy Relations

  • Fuzzy Logic and Fuzzy Inference Systems

  • Multifactorial Evaluation

  • Extracting Fuzzy Models from Data

  • Data Mining and Fuzzy Sets

  • Review Questions and Problems

  • References for Further Study

    Visualization Methods

    This chapter contains sections titled:

  • Perception and Visualization

  • Scientific Visualization and Information Visualization

  • Parallel Coordinates

  • Radial Visualization

  • Visualization Using Self-Organizing Maps (SOMs)

  • Visualization Systems for Data Mining

  • Review Questions and Problems

  • References for Further Study

    Appendix A

    This appendix contains sections titled:

  • Data-Mining Journals

  • Data-Mining Conferences

  • Data-Mining Forums/Blogs

  • Data Sets

  • Commercially and Publicly Available Tools

  • Web Site Links

    Appendix B: Data-Mining Applications

    This appendix contains sections titled:

  • Data Mining for Financial Data Analysis

  • Data Mining for the Telecommunications Industry

  • Data Mining for the Retail Industry

  • Data Mining in Health Care and Biomedical Research

  • Data Mining in Science and Engineering

  • Pitfalls of Data Mining

    Bibliography

    Index
