2018 IEEE 34th International Conference on Data Engineering (ICDE)

16-19 April 2018

  • [Title page i]

    Publication Year: 2018, Page(s): 1
  • [Title page iii]

    Publication Year: 2018, Page(s): 3
  • [Copyright notice]

    Publication Year: 2018, Page(s): 4
  • Table of Contents

    Publication Year: 2018, Page(s):5 - 32
  • Message from the ICDE 2018 Chairs

    Publication Year: 2018, Page(s):33 - 35
  • ICDE 2018 Organizing Committee

    Publication Year: 2018, Page(s):36 - 37
  • ICDE 2018 Program Committees

    Publication Year: 2018, Page(s):38 - 47
  • ICDE 2018 Keynotes

    Publication Year: 2018, Page(s):48 - 49

    Provides an abstract for each of the keynote presentations and may include a brief professional biography of each presenter. The complete presentations were not made available for publication as part of the conference proceedings.
  • Human Factors in Data Science

    Publication Year: 2018, Page(s):1 - 12

    Data Science (DS) has been shifting from libraries and stacks to usage and impact. While "database thinking" is permeating all levels in a DS stack, the DS lifecycle can only be fully realized by looping in humans in a principled and safe fashion. This paper focuses on the role of humans and user data in DS. It starts with the impact of human factors on the design of sustainable and fair data gene...
  • Actor-Oriented Database Systems

    Publication Year: 2018, Page(s):13 - 14

    We present the vision of an actor-oriented database. Its goal is to integrate database abstractions into an actor-oriented programming language for interactive, stateful, scalable, distributed applications that use cloud storage.
  • An NVM Carol: Visions of NVM Past, Present, and Future

    Publication Year: 2018, Page(s):15 - 23

    Around 2010, we observed significant research activity around the development of non-volatile memory technologies. Shortly thereafter, other research communities began considering the implications of non-volatile memory on system design, from storage systems to data management solutions to entire systems. Finally, in July 2015, Intel and Micron Technology announced 3D XPoint. It's now 2018; Intel ...
  • My Top Ten Fears about the DBMS Field

    Publication Year: 2018, Page(s):24 - 28

    In this paper, I present my top ten fears about the future of the DBMS field, with apologies to David Letterman. There are three "big fears", which I discuss first. Five additional fears are a result of the "big three". I then conclude with "the big enchilada", which is a pair of fears. In each case, I indicate what I think is the best way to deal with the current situation.
  • GPH: Similarity Search in Hamming Space

    Publication Year: 2018, Page(s):29 - 40

    A similarity search in Hamming space finds binary vectors whose Hamming distances are no more than a threshold from a query vector. It is a fundamental problem in many applications, including image retrieval, near-duplicate Web page detection, and machine learning. State-of-the-art approaches to answering such queries are mainly based on the pigeonhole principle to generate a set of candidates and...
  • Extracting Syntactical Patterns from Databases

    Publication Year: 2018, Page(s):41 - 52

    Many database columns contain string or numerical data that conforms to a pattern, such as phone numbers, dates, addresses, product identifiers, and employee ids. These patterns are useful in a number of data processing applications, including understanding what a specific field represents when field names are ambiguous, identifying outlier values, and finding similar fields across data sets. One w...
  • Schema-Agnostic Progressive Entity Resolution

    Publication Year: 2018, Page(s):53 - 64

    Entity Resolution (ER) is the task of finding entity profiles that correspond to the same real-world entity. Progressive ER aims to efficiently resolve large datasets when limited time and/or computational resources are available. In practice, its goal is to provide the best possible partial solution by approximating the optimal comparison order of the entity profiles. So far, Progressive ER has o...
  • Ares: Automatic Disaggregation of Historical Data

    Publication Year: 2018, Page(s):65 - 76

    We address the challenge of reconstructing historical counts from aggregated, possibly overlapping historical reports. For example, given the monthly and weekly sums, how can we find the daily counts of people infected with flu? We propose an approach, called ARES (Automatic REStoration), that performs automatic data reconstruction in two phases: (1) first, it estimates the sequence of historical ...
  • Augmented Access for Querying and Exploring a Polystore

    Publication Year: 2018, Page(s):77 - 88

    The huge diversity of database technologies in use inside organizations today poses new challenges for data management and integration. Polystores provide a solution to this scenario based on a loosely coupled integration of data sources and direct access, in the local language, to each storage engine to exploit its distinctive features. However, given the absence of a global schema, it is...
  • What-If Analysis with Conflicting Goals: Recommending Data Ranges for Exploration

    Publication Year: 2018, Page(s):89 - 100

    What-if analysis is a data-intensive exploration to inspect how changes in a set of input parameters of a model influence some outcomes. It is motivated by a user trying to understand the sensitivity of a model to a certain parameter in order to reach a set of goals that are defined over the outcomes. To avoid an exploration of all possible combinations of parameter values, efficient what-if analy...
  • DeepEye: Towards Automatic Data Visualization

    Publication Year: 2018, Page(s):101 - 112

    Data visualization is invaluable for explaining the significance of data to people who are visually oriented. The central task of automatic data visualization is, given a dataset, to visualize its compelling stories by transforming the data (e.g., selecting attributes, grouping and binning values) and deciding the right type of visualization (e.g., bar or line charts). We present DEEPEYE, a novel ...
  • Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management

    Publication Year: 2018, Page(s):113 - 124

    Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread to holistically integrate spreadsheets as a front-end interface with ...
  • Rule Sharing for Fraud Detection via Adaptation

    Publication Year: 2018, Page(s):125 - 136

    Writing rules that precisely capture fraudulent transactions is a challenging task on which domain experts spend significant effort and time. A key observation is that much of this difficulty originates from the fact that such experts typically work as "lone rangers" or in isolated groups, or work on detecting frauds in one context in isolation from frauds that occur in another context. However, in pra...
  • DisTenC: A Distributed Algorithm for Scalable Tensor Completion on Spark

    Publication Year: 2018, Page(s):137 - 148

    How can we efficiently recover missing values for very large-scale, multi-dimensional real-world datasets, even when the auxiliary information is regularized at a certain mode? Tensor completion is a useful tool to recover a low-rank tensor that best approximates partially observed data and further predicts the unobserved data by this low-rank tensor, which has been successfully used for many...
  • A Generic Top-N Recommendation Framework for Trading-Off Accuracy, Novelty, and Coverage

    Publication Year: 2018, Page(s):149 - 160

    Standard collaborative filtering approaches for top-N recommendation are biased toward popular items. As a result, they recommend items that users are likely aware of and under-represent long-tail items. This is inadequate, both for consumers who prefer novel items and because concentrating on popular items poorly covers the item space, whereas high item space coverage increases providers' revenue...
  • Predicting Named Entity Location Using Twitter

    Publication Year: 2018, Page(s):161 - 172

    A knowledge base contains a set of concepts, entities, attributes, and relations. Knowledge bases are increasingly critical to a wide variety of applications in both industry and academia. Yet despite all that, knowledge bases are greatly incomplete. As the world evolves, new entities are generated. Enriching existing knowledge bases with new entities and new location attribute values for them bec...
  • CUB, a Consensus Unit-Based Storage Scheme for Blockchain System

    Publication Year: 2018, Page(s):173 - 184

    Recently, blockchain has become a hot research topic due to its success in many applications, such as cryptocurrency, smart contracts, digital assets, and distributed cloud storage. The power of blockchain is that it can achieve consensus on an ordered set of transactions among nodes that do not trust each other, even in the presence of malicious nodes. However, compared to...