Abstract:
Over the last decade, reproducibility of experimental results has been a prime focus in database systems research, and many high-profile conferences award results that can be independently verified. Since database systems research involves complex software stacks that non-trivially interact with hardware, sharing experimental setups is anything but trivial: Building a working reproduction package goes far beyond providing a DOI to some repository hosting data, code, and setup instructions. This tutorial revisits reproducible engineering in the face of state-of-the-art technology, and best practices gained in other computer science research communities. In particular, in the hands-on part, we demonstrate how to package entire system software stacks for dissemination. To ensure long-term reproducibility over decades (or ideally, forever), we discuss why relying on open source technologies widely employed in industry has essential advantages over approaches crafted specifically for research. Supplementary material shows how version control systems that allow for non-linearly rewriting recorded history can document the structured genesis behind experimental setups in a way that is substantially easier to understand, without involvement of the original authors, compared to detour-ridden, strictly historic evolution.
Date of Conference: 19-22 April 2021
Date Added to IEEE Xplore: 22 June 2021