Skip to Main Content
We consider the problem of approximating sliding window joins over data streams in a data stream processing system with limited resources. In our model, we deal with resource constraints by shedding load in the form of dropping tuples from the data streams. We make two main contributions. First, we define the problem space by discussing architectural models for data stream join processing and surveying suitable measures for the quality of an approximation of a set-valued query result. Second, we examine in detail a large part of this problem space. More precisely, we consider the number of generated result tuples as the quality measure and we propose optimal offline and fast online algorithms for it. In a thorough experimental study with synthetic and real data, we show the efficacy of our solutions.