Towards Agile Integration: Specification-based Data Alignment | IEEE Conference Publication | IEEE Xplore


Abstract:

Utilizing data sets from multiple domains is a common procedure in scientific research. For example, research on the performance of buildings may require data from multiple sources that lack a single standard for data reporting. The Building Management System might report data at regular 5-minute intervals, whereas an air-quality sensor might capture values only when there has been a significant change from the previous value. Many systems exist to help integrate multiple data sources into a single system or interface. However, such systems do not necessarily make it easy to modify an integration plan, for example, to accommodate data exploration, new and changing data sets, or shifts in the questions of interest. We propose an agile data-integration system to enable quick and adaptive analysis across many data sets, concentrating initially on the data alignment step: combining data values from multiple time-series-based data sets whose time schedules differ. To this end, we adopt a Domain-Specific Language approach in which we construct a domain model for alignment, provide a specification language for describing alignments in the model, and implement an interpreter for specifications in that language. Our implementation exploits a rank-based join in SQL that produces faster alignment times than the commonly suggested method of aligning data sets in a database. We present experiments to demonstrate the advantage of our method and exploit data properties for optimization.
Date of Conference: 11-13 August 2020
Date Added to IEEE Xplore: 10 September 2020
Conference Location: Las Vegas, NV, USA
