Skip to Main Content
The overwhelming flow of information in many data stream applications forces many companies to outsource to a third-party the deployment of a data stream management system (DSMS) for performing desired computations. Remote computations intrinsically raise issues of trust, making query execution assurance on data streams a problem with practical implications. Consider a client observing the same data stream as a remote server (e.g., network traffic), that registers a continuous query on the server's DSMS, and receives answers upon request. The client needs to verify the integrity of the results using significantly fewer resources than evaluating the query locally. Towards that goal, we propose a probabilistic algorithm for selection and aggregate/group-by queries, that uses constant space irrespective of the result-set size, has low update cost, and arbitrarily small probability of failure. We generalize this algorithm to allow some tolerance on the number of errors permitted (irrespective of error magnitude), and also discuss the hardness of permitting arbitrary errors of small magnitude. We also perform an empirical evaluation using live network traffic.