2016 IEEE International Conference on Big Data Analysis (ICBDA)

12-14 March 2016

  • [Front cover]

    Publication Year: 2016, Page(s): c1
  • [Title page]

    Publication Year: 2016, Page(s): 1
  • [Copyright notice]

    Publication Year: 2016, Page(s): 1
  • Table of contents

    Publication Year: 2016, Page(s):1 - 6
  • Author index

    Publication Year: 2016, Page(s):1 - 4
  • A behavior mining based hybrid recommender system

    Publication Year: 2016, Page(s):1 - 5

    Recommender systems are mostly well known for their applications in e-commerce sites and are mostly static models. Classical personalized recommender algorithms include the collaborative filtering method applied at Amazon, the matrix factorization algorithm from Netflix, etc. In this article, we hope to combine the traditional model with a behavior pattern extraction method. We use desensitized mobile transactio...
  • New combination algorithms in commercial area data mining and clustering

    Publication Year: 2016, Page(s):1 - 5

    The location of business is indispensable for all commercial activities. However, current partition of commercial area mostly depends on human experience and other subjective factors rather than intelligent decisions, which is likely to mislead people who want to engage in business. The new combination of algorithms in this paper aims to clarify how the commercial area is formed by visualizing whe...
  • Unsupervised feature selection for text classification via word embedding

    Publication Year: 2016, Page(s):1 - 5

    The key to big text-document data analysis is classifying the documents. To classify them, it is necessary to represent them as vectors, that is, in the vector space model (VSM). A powerful vector space model should retain the classification information with as few dimensions as possible. To achieve that, it is important to select the most effective features for text...
  • Research on counter-terrorism based on big data

    Publication Year: 2016, Page(s):1 - 5

    In the area of big data, people have a new perspective on counter-terrorism research. In this paper, we carry out systematic research on the applications of big data in the counter-terrorism field using a quantitative analysis method. We then demonstrate the effect of big data on counter-terrorism research from data collection and preprocessing, data mining and analysis, monitoring and...
  • Research of association rule algorithm based on data mining

    Publication Year: 2016, Page(s):1 - 4

    Association rule data mining is an important part of the field of data mining; its algorithm performance directly affects the efficiency of data mining and the integrity and effectiveness of the ultimate data mining results. Based on the existing association rule mining algorithms, this paper studies and analyzes their efficiency and effectiveness, and according to the efficiency defects of A...
  • Sentiment analysis in a cross-media analysis framework

    Publication Year: 2016, Page(s):1 - 5

    This paper introduces the implementation and integration of a sentiment analysis pipeline into the ongoing open source cross-media analysis framework. The pipeline includes the following components: chat room cleaner, NLP and sentiment analyzer. Before the integration, we also compare two broad categories of sentiment analysis methods, namely lexicon-based and machine learning approaches. We mainl...
  • Evaluation algorithm for clustering quality based on information entropy

    Publication Year: 2016, Page(s):1 - 5

    As a branch of statistics, cluster analysis has been extensively studied and widely used in many applications. Cluster analysis has recently become a highly active topic in data mining research. As a data mining function, cluster analysis can be used as a standalone tool to gain insight into the distribution of data, to observe the characteristics of each cluster. Alternatively, it may serve as a ...
  • Research on audit log association rule mining based on improved Apriori algorithm

    Publication Year: 2016, Page(s):1 - 7

    Aimed at solving the problem of low-level intelligence and low utilization of audit logs of the security audit system, a secure audit system based on association rule mining is proposed in this paper. The system is able to take full advantage of the existing audit logs, establish the behavior pattern database of users and the system with data mining technique, and discover abnormal situation in a ...
  • What is your Mother Tongue?: Improving Chinese native language identification by cleaning noisy data and adopting BM25

    Publication Year: 2016, Page(s):1 - 6

    Native language identification (NLI) is a process by which an author's native language can be identified from essays written in the second language of the author. In this work, a supervised model is built to accomplish this based on a Chinese learner corpus. In the NLI field, this is the first work to (1) eliminate noisy data automatically before the training phase and (2) employ a BM25 term weigh...
  • Using social media mining technology to assist in price prediction of stock market

    Publication Year: 2016, Page(s):1 - 4

    Price prediction in the stock market is considered to be one of the most difficult tasks because of price dynamics. A previous study found that short-term stock price volatility is closely related to market sentiment, especially for small-cap stocks. This paper uses social media mining technology to quantitatively evaluate market segment, and in combination with other factors to predict...
  • Apriori-based diagnostical analysis of passings in the football game

    Publication Year: 2016, Page(s):1 - 4

    Early football game research using data mining methods tends to include all the basic game actions in the statistics and produce a descriptive result. Making use of a modified data structure and a renewed algorithm, diagnostic results can be worked out and expressed by the tendency network. To get these results, all data were first cleaned according to the modified data structure after dat...
  • RHadoop-based fuzzy data mining: Architecture, design and system implementation

    Publication Year: 2016, Page(s):1 - 5

    Data mining is a challenge for end-users, which requires knowledge and skills on business domains, data mining algorithms and software development. In response to the challenge, we have proposed, designed and implemented a novel data mining system named RFDM (RHadoop-based Fuzzy Data Mining), which supports fuzzy data mining process and experience with user convenience and reduced cost. The system...
  • Keyword query approach over RDF data based on tree template

    Publication Year: 2016, Page(s):1 - 6

    With the large increase in semantic data available on the web, the demand for access to semantic data is increasing. Keyword query is regarded as an intuitive paradigm, especially for users who are not familiar with the data and the RDF query language. In this paper, an approach based on tree template is proposed, which can return ranked answers to keyword query without the help of data schem...
  • Agricultural data modeling method based on semantics

    Publication Year: 2016, Page(s):1 - 5

    Based on unified data expression model of agricultural data lacking with the common management data, and inconsistent problems of agricultural data in the process of the management and use, the establishing method of the agricultural data integration model based on ontology is presented in this paper. First, the common management data structure of agricultural production institutions is analyzed, ...
  • Finding database contention hotspots under large-scale workloads - A big data approach

    Publication Year: 2016, Page(s):1 - 5

    Databases play an important role in transactional information systems. One significant performance-impacting factor is data lock contention in transaction processing. In order to guide better database design, we propose a novel solution to identify contention hotspots displayed in DBMS transaction logs. To analyze the large volume of data collected in the transaction log, our solution employs bi...
  • Big data analysis based on POT method for design flood prediction

    Publication Year: 2016, Page(s):1 - 5

    The big data era in hydrology is on its way and will be of great significance to design flood prediction. How to discover and extract the most useful information from the abundant data with high correlation becomes an important problem. This paper aims to do a trial on data selection based on peak over threshold (POT) method. Attempts were made to evaluate the impacts of generalized Pareto (GP) di...
  • Original sustainability measurement based on data analysis and processing

    Publication Year: 2016, Page(s):1 - 8

    In contemporary days, as the society develops, more indicators tend to be unpersuasive when it comes to measuring sustainability. Moreover, in order to gain a comprehensive indicator system, the indicator per se is expected to be extensive rather than specific. In this paper, a new comprehensive indicator system (PRETEC) is created and an original model (Fu's model) combining Principal Component An...
  • Research and implementation of big data preprocessing system based on Hadoop

    Publication Year: 2016, Page(s):1 - 5

    With the rising growth trend of data size in the Internet era, storage, analysis, and processing of big data are becoming among the strong topics in academia and industry. Typical big data processing platforms adopt the MapReduce programming model to perform application processing. For example, the deployment and calculation method of Hadoop are as follows: Hadoop first collects data and stores them ...
  • Classification of customer requirements on Map Reduce-based Naive Bayes

    Publication Year: 2016, Page(s):1 - 4

    In the era of continuously enlarging mass customization based on customer requirements, a classification method called MP-NB (i.e. Map Reduce-based Naive Bayes) is proposed to process CRIA (i.e. Customer Requirement Information Acquisition) classifications on large-scale mobile data. It utilizes a Hadoop 2.X-based system to store CRIA on HDFS. In combination with the standardization theory of CRIA, it ut...
  • Enhanced KStore with the use of dictionary and Trie for retail business data

    Publication Year: 2016, Page(s):1 - 5

    To support efficient business data analytics, having a good computer data structure to store and access business data is a top priority. KStore is a data structure proposed by Jane Campbell Mazzagatti based on the Phaneron of C. S. Peirce. KStore is designed and developed as a storage engine to support business intelligence data storage, queries and analysis. The generation and data access of KSto...