We consider the problem of computing functions of data streams with limited memory in a standard information-theoretic framework. A streaming system with a memory constraint observes a collection of sources X1, X2, ..., Xm sequentially, stores synopses of the sources in memory, and computes a function of the sources from the synopses. We establish a correspondence between this problem and a functional source coding problem in cascade (line) networks. For the general functional source coding problem in cascade networks, we derive inner and outer bounds, and for distributions satisfying certain properties, we characterize the achievable rate region exactly for the computation of any function. Through this correspondence, the result also characterizes the minimum amount of memory required to compute the function in a streaming system. We briefly discuss the implications of this result for the problem of distinct value computation.
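
The streaming model described above can be pictured with a minimal Python sketch. It is an illustration only, not the coding scheme analyzed in the paper: the names `stream_compute`, `distinct_update`, and `distinct_output` are hypothetical, each source is assumed to be a single small non-negative integer, and the synopsis is assumed to be a bitmask of the values seen so far. The point is that the system keeps only the synopsis between observations, and the final function value is computed from the synopsis alone.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")  # synopsis type (what is held in memory)
X = TypeVar("X")  # source symbol type
R = TypeVar("R")  # function output type


def stream_compute(
    sources: Iterable[X],
    init: S,
    update: Callable[[S, X], S],
    output: Callable[[S], R],
) -> R:
    """Observe sources one at a time, keeping only a synopsis in memory."""
    synopsis = init
    for x in sources:
        # The previous synopsis and the new source are all the system may use.
        synopsis = update(synopsis, x)
    # The function value is computed from the final synopsis only.
    return output(synopsis)


def distinct_update(seen_mask: int, x: int) -> int:
    """Assumed synopsis: bit x is set once the value x has been observed."""
    return seen_mask | (1 << x)


def distinct_output(seen_mask: int) -> int:
    """Number of distinct values equals the number of set bits."""
    return bin(seen_mask).count("1")


def count_distinct(sources: Iterable[int]) -> int:
    """Exact distinct value count over a small non-negative integer alphabet."""
    return stream_compute(sources, 0, distinct_update, distinct_output)


if __name__ == "__main__":
    print(count_distinct([3, 1, 3, 0, 1]))
```

In this toy version the synopsis needs one bit per alphabet symbol regardless of the number of sources m. The paper's question is how small such a memory can be made in an information-theoretic sense, which this sketch does not attempt to answer.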
Published in:
Information Theory Proceedings (ISIT), 2010 IEEE International Symposium on
Date of Conference: 13-18 June 2010