Summary form only given. The past few years have seen the emergence of application domains that need to process data elements arriving as a continuous stream. Several architectures for processing database queries over such data streams have recently been proposed in the literature. Although these architectures may be suitable for general-purpose query processing in a centralized setting, they have serious limitations when it comes to supporting data mining queries in a distributed setting. Data mining is an interactive process, so it is crucial to provide the user with interactive response times. In addition, many data mining applications, such as network intrusion detection, need to process data streams arriving at distributed end-points. Centralized processing of data streams for network intrusion detection would be overwhelming. These are fundamental issues for data mining over data streams, and they have been addressed. Our schemes give controlled interactive response times when processing data streams in a distributed setting.
Published in: Proceedings of the 18th International Parallel and Distributed Processing Symposium, 2004
Date of Conference: 26-30 April 2004