Streaming applications are often distributed, manage large quantities of data, and, as a result, have large memory requirements. Efficient garbage collection (GC) is therefore crucial for their performance. At the same time, not all data items affect the application output, because the various application threads process data at different rates. In this paper we propose extending the definition of the garbage identification problem for streaming applications to include not only data items that are not "reachable" but also data items that have no effect on the final outcome of the application. We present four optimizations to an existing GC algorithm in Stampede, a parallel programming system that supports interactive multimedia applications. We ask how far these algorithms are from an ideal garbage collector, one in which the memory usage exactly equals the amount required for buffering only the relevant data items. This oracle, while unimplementable, serves as an empirical lower bound for memory usage. We then propose optimizations that bring memory usage closer to this lower bound. Using an elaborate measurement and post-mortem analysis infrastructure, we simulate the performance potential of these optimizations and implement the most promising ones. A color-based people tracking application is used for the performance evaluation. Our results show that these optimizations reduce memory usage by up to 60%.
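
To make the notion of the ideal collector concrete, the following minimal sketch shows how a post-mortem pass over a recorded trace could compare the peak memory of a collector that frees each item at its last use against the oracle that buffers only relevant items. It is an illustration under stated assumptions, not Stampede's algorithm or API: the `Item` record, its fields, and both functions are hypothetical names, and the trace is assumed to already record, for each item, when it was produced, when it was last used, and whether it contributed to the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One buffered data item in a recorded trace (hypothetical record layout)."""
    timestamp: int     # identifier of the item in the stream
    size: int          # bytes occupied while the item is buffered
    produced_at: int   # time the item enters a buffer
    last_use_at: int   # time the last consumer is done with it
    relevant: bool     # True if the item contributed to the final output


def peak_memory(items):
    """Peak bytes buffered if each item is live during [produced_at, last_use_at)."""
    events = []
    for item in items:
        events.append((item.produced_at, item.size))    # allocation
        events.append((item.last_use_at, -item.size))   # release
    # At equal times, releases (negative deltas) sort before allocations,
    # which matches the half-open lifetime interval.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def ideal_peak_memory(items):
    """Peak bytes for the oracle: items that never affect the output are never buffered."""
    return peak_memory([item for item in items if item.relevant])
```

The oracle needs the `relevant` flag, which is only known after the run has finished; this is why it can serve as a lower bound for comparison but cannot be implemented as an online collector.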