2011 Sixth International Conference on Digital Information Management (ICDIM)

Date: 26-28 September 2011

Displaying Results 1 - 25 of 64
  • Author index

    Publication Year: 2011 , Page(s): 1 - 5
  • International Program Committee

    Publication Year: 2011 , Page(s): 1 - 2
  • Contents

    Publication Year: 2011 , Page(s): 1 - 6
  • [Copyright notice]

    Publication Year: 2011 , Page(s): 1
  • On mining XML integrity constraints

    Publication Year: 2011 , Page(s): 23 - 29

    Since XML documents can appear in any semi-structured form, structural and integrity constraints are often imposed on the data to be modified or processed. These constraints are formally defined in a schema. But despite the obvious advantages, the presence of a schema is not mandatory, and many XML documents are not accompanied by one; consequently, no integrity constraints are specified either. In this paper we focus on extending approaches that infer an XML schema from a sample set of XML documents with the mining of primary and foreign keys. In particular, we consider keys in the context of XSD, i.e. absolute and relative as well as simple and composite keys. We propose a novel approach called KeyMiner and demonstrate its efficiency experimentally using real-world and synthetic data.
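
    The following is a minimal sketch of the kind of key inference described above, under a simplifying assumption (a path is a candidate absolute key when its values never repeat across the sample documents); it illustrates the general idea, not the KeyMiner algorithm itself.

```python
# Toy key inference over sample XML documents (not the paper's KeyMiner):
# a path is a candidate absolute key if its text values are non-empty and
# unique across every occurrence in every sample document.
import xml.etree.ElementTree as ET
from collections import defaultdict

def candidate_keys(xml_strings):
    values = defaultdict(list)            # element path -> all observed text values
    for doc in xml_strings:
        root = ET.fromstring(doc)
        stack = [(root, root.tag)]
        while stack:
            node, path = stack.pop()
            if node.text and node.text.strip():
                values[path].append(node.text.strip())
            stack.extend((c, f"{path}/{c.tag}") for c in node)
    # keep paths whose observed values never repeat (with some support)
    return [p for p, vs in values.items() if len(vs) == len(set(vs)) and len(vs) > 1]

docs = ["<db><book><isbn>1</isbn></book><book><isbn>2</isbn></book></db>"]
print(candidate_keys(docs))               # ['db/book/isbn']
```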

  • Evaluation of stop word lists in text retrieval using Latent Semantic Indexing

    Publication Year: 2011 , Page(s): 133 - 136
    Cited by:  Papers (3)

    The goal of this research is to evaluate the use of English stop word lists in Latent Semantic Indexing (LSI)-based Information Retrieval (IR) systems with large text datasets. The literature claims that the use of such lists improves retrieval performance. Here, three different lists are compared: two were compiled by IR groups at the University of Glasgow and the University of Tennessee, and one is our own list developed at the University of Northern British Columbia. We also examine the case where stop words are not removed from the input dataset. Our research finds that using tailored stop word lists improves retrieval performance, whereas using arbitrary (non-tailored) lists, or no list at all, reduces the retrieval performance of LSI-based IR systems with large text datasets.
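
    As a rough illustration of where a stop word list enters an LSI pipeline, the sketch below uses scikit-learn; the toy documents, the stand-in stop list, and the choice of two latent dimensions are assumptions, not the paper's setup.

```python
# Minimal LSI pipeline sketch showing where a stop word list plugs in.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = ["the cat sat on the mat", "a dog sat on the log", "cats and dogs"]
stop_list = ["the", "a", "on", "and"]          # stand-in for a tailored list

vec = TfidfVectorizer(stop_words=stop_list)    # stop words removed here
X = vec.fit_transform(docs)
lsi = TruncatedSVD(n_components=2)             # k latent dimensions
X_lsi = lsi.fit_transform(X)

q = lsi.transform(vec.transform(["dog on a log"]))
print(cosine_similarity(q, X_lsi))             # rank documents against the query
```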

  • Online ngram-enhanced topic model for academic retrieval

    Publication Year: 2011 , Page(s): 137 - 142

    Applying topic models to text mining has achieved great success. However, state-of-the-art topic modeling methods still leave room for improvement in the academic retrieval field. In this paper, we propose an online unified topic model that is ngram-enhanced. Our model discovers topics using unigrams as well as topical bigrams, and is updated by an online inference algorithm as new data streams arrive. On this basis, we combine our model with the query likelihood model and develop an integrated academic search system. Experimental results on an ACM collection show that our proposed methods outperform existing approaches in document modeling and search accuracy. In particular, we demonstrate the efficiency of our system on the academic retrieval problem.

  • Private range query by perturbation and matrix based encryption

    Publication Year: 2011 , Page(s): 211 - 216
    Cited by:  Papers (2)

    In this paper, we propose a novel approach for private queries: the IPP (inner product predicate) method. A private query is a query processing protocol that obtains the requested tuples without exposing any information about what users request to third parties, including service providers. Existing work on private queries, such as PIR, ensures information-theoretic safety but has severe restrictions: it supports neither range queries nor tuples sharing the same value in queried attributes. Our IPP method, on the other hand, focuses mainly on range queries and allows tuples to share the same value in any attribute. The IPP method employs a query-transformation-by-trusted-clients (QT) scheme and proposes transformation algorithms that make the correlation between plain and transformed queries, and between plain and transformed attribute values, small enough. The transformed queries and attribute values therefore resist frequency analysis attacks, which means the IPP method prevents attackers who know the plain distributions from recovering the plain queries and attribute values from the transformed ones. To achieve this property, the IPP method adds perturbations to queries and attribute values and applies a matrix-based encryption to them. Experimental evaluations confirm that the computational cost on servers is O(n) in the number of tuples n, and that there is virtually no correlation between the distributions of transformed queries and attribute values and their plain distributions.
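
    A toy illustration of how a range predicate can be evaluated via inner products over matrix-encrypted vectors (an ASPE-style construction, which is not the paper's exact IPP scheme): the predicate a >= lo becomes a sign test on an inner product that the matrix transform preserves.

```python
# The transform preserves inner products: (M^T x) . (M^{-1} w) = x . w.
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((2, 2)) + np.eye(2)       # secret invertible matrix
M_inv = np.linalg.inv(M)

def enc_tuple(a):                         # attribute value a -> encrypted point
    return M.T @ np.array([a, 1.0])

def enc_query(lo):                        # predicate "a >= lo" as a vector
    return M_inv @ np.array([1.0, -lo])

# the server tests the sign of the inner product without seeing a or lo;
# a full range [lo, hi] would use a second test for "hi - a >= 0"
t = enc_tuple(42.0)
q = enc_query(10.0)
print(t @ q >= 0)                         # True: 42 >= 10
```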

  • A study of motivations for using mobile content sharing games

    Publication Year: 2011 , Page(s): 258 - 263
    Cited by:  Papers (2)

    Applications blending games with mobile content sharing have recently attracted much interest. In this paper, we examine users' motivations for creating content in the context of Indagator, a mobile content sharing game. Studying motivations is important because a deeper understanding will help designers create compelling and impactful applications. We conducted an experiment in which 28 participants used Indagator for a week to create content (annotations) and were then interviewed. All interview responses and the 599 generated annotations were manually examined and coded to ascertain motivations. Motivations for creating content include altruism, task performance, competitive play, killing time, reminder of experiences, self-presentation, and socializing. Additionally, gaming emerged as a motivator for sharing mobile content. Implications of our findings are discussed.

  • Pattern mining for query answering in marine sensor data

    Publication Year: 2011 , Page(s): 288 - 293

    An integrated pattern mining technique for query answering is proposed for marine sensor data. For pattern queries, we adopt the dynamic time warping (DTW) method and propose a query relaxation approach for finding similar patterns. We further calculate predictions from the similar patterns discovered in marine sensor data. The predicted values are then compared with forecasts from hydrodynamic model data. In addition, we present query answering using a clustering technique. Finally, we show implementation results from a marine sensor network deployed in south-east Tasmania, Australia.
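
    For reference, a minimal implementation of the DTW distance at the core of such pattern queries; the query relaxation and prediction steps are not shown, and the sample values are invented.

```python
# Classic dynamic time warping distance between two numeric sequences.
def dtw(a, b):
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]   # cumulative cost matrix
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

query = [20.1, 20.4, 21.0, 20.8]              # e.g. a water temperature pattern
window = [20.0, 20.3, 20.9, 21.1, 20.7]
print(dtw(query, window))
```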

  • Chart image understanding and numerical data extraction

    Publication Year: 2011 , Page(s): 115 - 120

    Chart images in digital documents are an important source of valuable information that is largely under-utilized for data indexing and information extraction. We developed a framework to automatically extract the data carried by charts and convert it to XML format. The proposed algorithm classifies images by chart type, detects graphical and textual components, and extracts semantic relations between graphics and text. Classification is performed by a novel model-based method, which was extensively tested against state-of-the-art supervised learning methods and showed high accuracy, comparable to that of the best supervised approaches. The proposed text detection algorithm is applied prior to optical character recognition and leads to a significant improvement in text recognition rate (up to 20 times better). The analysis of graphical components and their relations to textual cues allows the chart data to be recovered. For testing purposes, a benchmark set was created with the XML/SWF Chart tool. By comparing the recovered data with the original data used for chart generation, we are able to evaluate our information extraction framework and confirm its validity.

  • Multisensor fusion-based object detection and tracking using Active Shape Model

    Publication Year: 2011 , Page(s): 108 - 114
    Cited by:  Papers (1)

    This paper proposes an automatic target detection and tracking system using the Active Shape Model (ASM). Existing model-based approaches to tracking are either manually initiated or need some form of user interaction to locate the object in images, and low-light environmental conditions make tracking in surveillance systems even harder. The proposed system therefore uses multiple sensors, in the form of IR and visible cameras, to enable tracking in degraded and low-light environments. The algorithm consists of four stages: (i) evaluation of the input image to determine the conditions under which the camera is placed, (ii) an integrated motion detector and target tracker, (iii) an active shape tracker (AST) that performs the tracking, and (iv) an update of the tracking results for real-time target tracking. In the first stage the input image is evaluated for lighting conditions; if lighting is poor, the IR sensor is integrated with the CCD sensor. In the second stage the motion detector and region tracker provide feedback to the AST for automatic initialization of tracking. Tracking is carried out in the third stage using ASM. The final stage extracts the parameters and tracking information and applies them to the next frame when tracking runs in real time. The major contribution of this work lies in the integration of a complete system, covering everything from image processing to tracking algorithms. Combining multiple algorithms overcomes fundamental limitations of tracking while permitting a real-time implementation. Experimental results show that the proposed algorithm can track people in various environments in real time. The system has potential uses in surveillance, shape analysis, and model-based coding.
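
    A sketch of stage (i) only, under the assumption that lighting quality is judged from mean frame intensity; the threshold value is invented for illustration.

```python
# Decide from mean gray level whether the visible (CCD) frame is too dark,
# in which case the IR stream would be fused in for tracking.
import numpy as np

DARK_THRESHOLD = 40.0                     # assumed mean-gray cutoff (0-255)

def select_sensors(ccd_frame: np.ndarray):
    """ccd_frame: 2-D grayscale array from the visible camera."""
    if ccd_frame.mean() < DARK_THRESHOLD:
        return ("ccd", "ir")              # low light: fuse IR with CCD
    return ("ccd",)                       # daylight: visible camera suffices

night = np.full((480, 640), 12, dtype=np.uint8)
print(select_sensors(night))              # ('ccd', 'ir')
```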

  • When theory meets practice: A case report on conceptual modeling for XML

    Publication Year: 2011 , Page(s): 242 - 251

    Modern information systems usually exploit numerous XML formats for communication with other systems. Many potential problems, however, lie hidden, including the degree of readability, integrability, and adaptability of the XML formats. In the first part of this paper we demonstrate these problems on a real-world application: the National Register of Public Procurement in the Czech Republic. In the second part we show how the readability, integrability, and adaptability of this system's XML formats can be improved with a conceptual model for XML that we developed in our previous work. Finally, we generalize the experience gained into a methodology that can be applied in any other problem domain.

  • Applying multi-correlation for improving forecasting in cyber security

    Publication Year: 2011 , Page(s): 179 - 186

    Currently, defense of cyberspace is mostly based on detecting and/or blocking attacks (Intrusion Detection and Prevention Systems, IDPS). A significant improvement over IDPS is the employment of forecasting techniques in a Distributed Intrusion Forecasting System (DIFS), which adds the capability of predicting attacks. Nevertheless, one of the issues we faced in earlier work was the huge number of alerts produced by IDPSs, several of which were false positives. Checking the veracity of alerts against other sources (multi-correlation), e.g. logs taken from the operating system (OS), is a way of reducing the number of false alerts and thereby improving the data (historical series) used by the DIFS. The goal of this paper is to propose a two-stage system that allows: (1) employing an Event Analysis System (EAS) to multi-correlate IDPS alerts with OS logs; and (2) applying forecasting techniques to the data generated by the EAS. Laboratory tests of the two-stage system show improved reliability of the historical series and a consequent improvement in forecast accuracy.
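
    A minimal sketch of the multi-correlation step, assuming alerts and OS log entries carry host and timestamp fields and that a fixed time window defines corroboration; both the field names and the 60-second window are assumptions for illustration.

```python
# Keep an IDPS alert only if a corroborating OS log entry exists for the
# same host within a time window; unmatched alerts are treated as false positives.
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=60)

def corroborated(alerts, os_logs):
    kept = []
    for a in alerts:
        for e in os_logs:
            if e["host"] == a["host"] and abs(e["time"] - a["time"]) <= WINDOW:
                kept.append(a)            # alert confirmed by an OS event
                break
    return kept

t0 = datetime(2011, 9, 26, 12, 0, 0)
alerts = [{"host": "srv1", "time": t0},
          {"host": "srv2", "time": t0}]   # no matching log -> discarded
logs = [{"host": "srv1", "time": t0 + timedelta(seconds=5)}]
print(len(corroborated(alerts, logs)))    # 1
```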

  • Index-based n-gram extraction from large document collections

    Publication Year: 2011 , Page(s): 73 - 78
    Cited by:  Papers (1)

    N-grams are applied in applications that search text documents, especially when one must work with phrases, e.g. in plagiarism detection. An n-gram is a sequence of n terms (or, more generally, tokens) from a document. We obtain the set of n-grams by moving a floating window from the beginning to the end of the document. During extraction we must remove duplicate n-grams and store additional values for each n-gram type, e.g. the n-gram type frequency for each document, depending on the query model used. Previous works utilize a sorting algorithm to compute n-gram frequencies. These approaches must handle a high number of identical n-grams, resulting in high time and space overhead. Moreover, these techniques are often main-memory only, meaning they can be executed only for small or medium-size collections. In this paper, we present an index-based method for n-gram extraction from large collections. The method utilizes common data structures such as the B+-tree and hash table. We demonstrate the scalability of our method with experiments on a gigabyte-scale collection.
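
    A small sketch of the indexed extraction idea, with a plain hash table standing in for the B+-tree/hash-table index used in the paper.

```python
# Sliding-window n-gram extraction with per-document frequencies kept in an
# inverted index, so duplicate n-grams update a counter instead of being sorted.
from collections import Counter

def extract_ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

index = {}                                # ngram -> {doc_id: frequency}
docs = {1: "to be or not to be".split(),
        2: "not to be outdone".split()}

for doc_id, tokens in docs.items():
    for ng, freq in Counter(extract_ngrams(tokens, 2)).items():
        index.setdefault(ng, {})[doc_id] = freq

print(index[("to", "be")])                # {1: 2, 2: 1}
```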

  • A new perfect hashing based approach for secure steganography

    Publication Year: 2011 , Page(s): 174 - 178

    Image steganography is an emerging field of research for secure data hiding, serving data transmission over the internet, copyright protection, and ownership identification. A number of techniques have been proposed for colour-image steganography; however, colour images are more costly to transmit over the internet due to their size. In this paper, we propose a new perfect-hashing-based approach for steganography in grey-scale images. The proposed approach is efficient and effective, providing a more secure way of transmitting data at higher speed. It is implemented in a prototype tool coded in VB.NET and supports multiple file formats, such as bmp, gif, jpeg, and tiff. A set of sample images was processed with the tool, and the results of initial experiments indicate the potential of the approach not only for secure steganography but also for fast data transmission over the internet.
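
    A toy sketch of grey-scale LSB embedding in which a keyed hash selects the pixel positions; this conveys the general idea only and is not the paper's perfect-hashing construction.

```python
# Embed message bits into the least significant bits of pixels whose
# positions are derived from a secret key via SHA-256.
import hashlib
import numpy as np

def positions(key: str, n_bits: int, n_pixels: int):
    pos, seen, i = [], set(), 0
    while len(pos) < n_bits:              # derive distinct pixel indices from the key
        h = hashlib.sha256(f"{key}:{i}".encode()).digest()
        p = int.from_bytes(h[:4], "big") % n_pixels
        if p not in seen:
            seen.add(p)
            pos.append(p)
        i += 1
    return pos

def embed(img: np.ndarray, message: bytes, key: str) -> np.ndarray:
    flat = img.flatten()                  # copy; the cover image is untouched
    bits = [(byte >> k) & 1 for byte in message for k in range(8)]
    for p, b in zip(positions(key, len(bits), flat.size), bits):
        flat[p] = (flat[p] & 0xFE) | b    # overwrite the least significant bit
    return flat.reshape(img.shape)

cover = np.zeros((64, 64), dtype=np.uint8)
stego = embed(cover, b"hi", "secret-key")
print(int((cover != stego).sum()))        # at most 16 pixels changed
```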

  • Discriminating early stage AD patients from healthy controls using synchronization analysis of EEG

    Publication Year: 2011 , Page(s): 282 - 287

    In this paper we study how meso-scale and micro-scale electroencephalography (EEG) synchronization measures can be used to discriminate patients suffering from Alzheimer's disease (AD) from normal control subjects. To this end, two synchronization measures, namely power spectral density and multivariate phase synchronization, are considered, and the topography of the changes in patients vs. controls is shown. The AD patients showed increased power spectral density in the frontal area in the theta band and a widespread decrease in the higher frequency bands. They were also characterized by decreased multivariate phase synchronization in the left fronto-temporal and medial regions, consistent across all frequency bands. A region of interest was selected based on these maps, and the averages of the power spectral density and phase synchrony were obtained in these regions. These two quantities were then used as features for classifying subjects into patient and control groups. Our analysis showed that the theta band can be a marker for discriminating AD patients from normal controls: a simple linear discriminant yielded 83% classification precision.
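
    A sketch of the feature-plus-linear-discriminant step on synthetic stand-in signals: theta-band (4-8 Hz) power from a Welch PSD feeds a linear discriminant. The sampling rate, the signal model, and the exact band edges are assumptions.

```python
# Theta-band power as a single feature for a linear discriminant classifier.
import numpy as np
from scipy.signal import welch
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

fs = 256                                   # sampling rate in Hz (assumed)

def theta_power(sig):
    f, pxx = welch(sig, fs=fs, nperseg=fs)
    return pxx[(f >= 4) & (f <= 8)].mean()

rng = np.random.default_rng(1)
t = np.arange(fs * 4) / fs                 # 4 seconds of signal
patients = [np.sin(2 * np.pi * 6 * t) + rng.standard_normal(t.size)
            for _ in range(10)]            # elevated theta activity
controls = [rng.standard_normal(t.size) for _ in range(10)]

X = np.array([[theta_power(s)] for s in patients + controls])
y = [1] * 10 + [0] * 10
clf = LinearDiscriminantAnalysis().fit(X, y)
print(clf.score(X, y))
```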

  • Context-aware SQA e-learning system

    Publication Year: 2011 , Page(s): 327 - 331

    In this paper, we propose an ontological design for developing a context-aware e-learning system that supports learners in developing Software Quality Assurance (SQA)-compliant software. The learning process is driven by the type of software product the learner is dealing with, as well as its SQA requirements and the corresponding SQA techniques and procedures. The paper shows a global ontology design that embeds knowledge related to the learner, the SQA domain in general, and product-based SQA requirements and procedures. Reasoning tools are provided to infer knowledge that can deliver more modular and just-in-time contextual SQA resources for the task at hand. A learning scenario illustrates the system's ability to deal with the SQA requirements facing the learner in the software development process.

  • BatCave: Adding security to the BATMAN protocol

    Publication Year: 2011 , Page(s): 199 - 204

    The Better Approach To Mobile Ad-hoc Networking (BATMAN) protocol is intended as a replacement for protocols such as OLSR, but just like most such efforts, BATMAN has no built-in security features. In this paper we describe security extensions to BATMAN that control network participation and prevent unauthorized nodes from influencing network routing.

  • Automatic text classification and focused crawling

    Publication Year: 2011 , Page(s): 143 - 148
    Cited by:  Papers (1)

    A focused crawler is a web crawler that traverses the web to explore only information related to a particular topic of interest. Generic web crawlers, by contrast, try to search the entire web, which is impossible due to the size and complexity of the WWW. In this paper we survey some of the latest focused web crawling approaches, discussing each along with its experimental results. We categorize them as focused crawling based on content analysis, focused crawling based on link analysis, and focused crawling based on both content and link analysis. We also give insight into future research and draw overall conclusions.
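
    A skeleton of the content-analysis family of focused crawlers surveyed here: a priority frontier ordered by topical relevance, with a simple keyword overlap score standing in for the classifiers the surveyed systems use. The topic terms and the threshold are invented.

```python
# Best-first focused crawling: follow links only from pages scored on-topic.
import heapq

TOPIC_TERMS = {"crawler", "retrieval", "indexing"}   # example topic

def relevance(text: str) -> float:
    words = set(text.lower().split())
    return len(words & TOPIC_TERMS) / len(TOPIC_TERMS)

def crawl(seeds, fetch, budget=100, threshold=0.3):
    frontier = [(-1.0, u) for u in seeds]            # max-heap via negation
    heapq.heapify(frontier)
    seen, kept = set(seeds), []
    while frontier and budget > 0:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)                     # caller supplies fetching
        budget -= 1
        score = relevance(text)
        if score >= threshold:                       # expand on-topic pages only
            kept.append(url)
            for link in links:
                if link not in seen:
                    seen.add(link)
                    heapq.heappush(frontier, (-score, link))
    return kept

web = {"a": ("a crawler for web retrieval", ["b"]), "b": ("cooking recipes", [])}
print(crawl(["a"], web.get))              # ['a']: page b is off-topic
```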

  • Predicting software black-box defects using stacked generalization

    Publication Year: 2011 , Page(s): 294 - 299

    Predicting the number of defects is essential for making the key decision of when to stop testing. For more applicable and accurate prediction, we propose an ensemble prediction model based on stacked generalization (PMoSG) and use it to predict the number of defects detected by third-party black-box testing. Taking into account the characteristics of black-box defects and the causal relationships among factors that influence defect detection, Bayesian networks and other numeric prediction models are employed in our ensemble model. Experimental results show that our PMoSG model achieves a significant improvement in the accuracy of defect number prediction over any individual model, and achieves its best accuracy when the Locally Weighted Learning (LWL) method is used as the level-1 model.
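
    A minimal stacked-generalization sketch in scikit-learn; the level-0 learners, the distance-weighted k-NN standing in for LWL at level 1, and the synthetic data are all assumptions, not the paper's configuration.

```python
# Level-0 models feed their predictions to a level-1 (meta) regressor.
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.random((80, 4))                    # e.g. size and complexity metrics
y = X @ np.array([3.0, 1.0, 0.5, 2.0]) + rng.normal(0, 0.1, 80)

model = StackingRegressor(
    estimators=[("lr", LinearRegression()),
                ("tree", DecisionTreeRegressor(max_depth=3))],
    # distance-weighted k-NN as a locally-weighted-style level-1 model
    final_estimator=KNeighborsRegressor(n_neighbors=5, weights="distance"),
)
model.fit(X, y)
print(model.predict(X[:3]).round(1))
```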

  • A new method based on Finite State Machine for detecting misbehaving nodes in ad hoc networks

    Publication Year: 2011 , Page(s): 187 - 192

    In this paper we present a new intrusion detection system for ad hoc networks based on a finite state machine (FSM). Security is one of the most important issues in current networks. The most common attacks in mobile ad hoc networks are dropping routing packets and altering incoming packets, both of which aim to disrupt network routing and reduce overall network performance. The presented FSM-based approach focuses on recognizing malicious nodes within the network quickly and accurately, and then rapidly announces the malicious nodes to the other nodes in the network to prevent further packet dropping and alteration. Finally, we show a significant improvement in several metrics compared with previous work; our methods were simulated with the NS2 software.
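
    A toy per-node FSM of the kind described: repeated drops or packet alterations move a node from NORMAL through SUSPECTED to MALICIOUS, at which point it would be announced to the other nodes. The states and thresholds are invented for illustration.

```python
# Per-node behaviour monitor with three states driven by misbehaviour counts.
NORMAL, SUSPECTED, MALICIOUS = "NORMAL", "SUSPECTED", "MALICIOUS"
SUSPECT_AT, CONVICT_AT = 3, 6              # misbehaviour thresholds (assumed)

class NodeMonitor:
    def __init__(self):
        self.state, self.misbehaviours = NORMAL, 0

    def observe(self, event: str) -> str:
        if event in ("drop", "alter"):
            self.misbehaviours += 1
        if self.misbehaviours >= CONVICT_AT:
            self.state = MALICIOUS         # broadcast a warning to other nodes here
        elif self.misbehaviours >= SUSPECT_AT:
            self.state = SUSPECTED
        return self.state

m = NodeMonitor()
for ev in ["forward", "drop", "drop", "drop", "alter", "drop", "drop"]:
    m.observe(ev)
print(m.state)                             # MALICIOUS
```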

  • Web service with criteria: Extending WSDL

    Publication Year: 2011 , Page(s): 205 - 210
    Cited by:  Papers (1)

    WSDL (Web Service Definition Language) is used to describe the interface of a service in XML format. The interface describes functional as well as non-functional properties. We are concerned with specifying `criteria' as a non-functional property of a web service. For this purpose we extend WSDL to X-WSDL: the WSDL schema is extended with a new element, `criteriaservice', available in a new namespace, which makes it possible to specify criteria along with a service in an X-WSDL document. The WSDL document is also extended with the new attributes `criteria name' and `description' on the service element. The criteria are specified by the user when invoking a service. As a result, we provide support for discovering a service more appropriate to the user's requirements.
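
    A sketch of emitting the kind of extended service element the abstract describes; the namespace URI and the camel-cased attribute spelling are assumptions, since the paper's exact schema is not reproduced here.

```python
# Build an X-WSDL-style service element with a namespaced criteria child.
import xml.etree.ElementTree as ET

XWSDL_NS = "http://example.org/xwsdl"      # hypothetical extension namespace
ET.register_namespace("xw", XWSDL_NS)

service = ET.Element("service", {"name": "BookSearchService"})
ET.SubElement(service, f"{{{XWSDL_NS}}}criteriaservice",
              {"criteriaName": "responseTime",        # assumed spelling
               "description": "maximum acceptable response time"})
print(ET.tostring(service, encoding="unicode"))
```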

  • Strategic government initiatives to promote diffusion of online retailing in Saudi Arabia

    Publication Year: 2011 , Page(s): 217 - 222

    This paper presents findings from a study of factors affecting the rate of diffusion and adoption of online retailing in Saudi Arabia. In general, Saudi retailers have not responded actively to the global growth of online retailing, and this research was conducted to find the key factors behind this phenomenon. A major finding presented here is that both buyers and sellers emphasize the need for government involvement to support and promote the development of online commerce. In particular, it indicates the need for strategic government initiatives providing regulation, legislation, education, and a trusted infrastructure for secure payment and delivery. Saudi Arabia presents a unique cultural, technological, and political context for the development of e-commerce. We highlight the particular motivators and potential benefits of Saudi government involvement in e-commerce development. A new model for formulating roles and strategic government initiatives to support the successful diffusion of online retailing in KSA is presented and discussed. This will be of interest to anyone following the development of e-commerce and the information economy in the Arab nations.

  • Programming for evaluating strip layout of progressive dies

    Publication Year: 2011 , Page(s): 229 - 234

    A progressive die is an effective tool for the efficient and economical production of sheet metal parts in large quantities. Nowadays, progressive die designers still spend much of their time choosing better layouts from among the feasible ones. This study employs Pro/Web.Link, Hyper Text Markup Language (HTML), and JavaScript to develop an application that automatically evaluates strip layouts in the Pro/Engineer software environment. The paper proposes solutions for calculating the total evaluation score of a strip layout based on four factors: station number, moment balancing, strip stability, and feed height.
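
    A sketch of one plausible scoring scheme over the four factors named above; the equal weights and the example factor ratings are invented, not taken from the paper.

```python
# Weighted total evaluation score for a candidate strip layout.
WEIGHTS = {"station_number": 0.25, "moment_balance": 0.25,
           "strip_stability": 0.25, "feed_height": 0.25}

def total_score(factor_scores: dict) -> float:
    """factor_scores: each factor rated on a 0-100 scale."""
    return sum(WEIGHTS[f] * s for f, s in factor_scores.items())

layout_a = {"station_number": 80, "moment_balance": 65,
            "strip_stability": 90, "feed_height": 70}
print(total_score(layout_a))               # 76.25
```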
