Mining Top-k Co-Occurrence Patterns across Multiple Streams


Abstract:

The recent era of Big Data and the IoT has given rise to a number of applications that generate objects in a streaming fashion. It is well known that real-time mining of important patterns from data streams supports many domains. In retail markets and social network services, for example, such patterns are itemsets and words that frequently appear in many user accounts, i.e., co-occurrence patterns. To efficiently monitor co-occurrence patterns, we address the novel problem of mining top-k closed co-occurrence patterns across multiple streams. We employ a sliding-window setting in this problem, and each pattern is ranked by its count, which is the number of streams that have generated the pattern. Since objects are continuously generated and deleted, the count of a given pattern is dynamic, which may change the rank of the pattern. This makes it challenging to monitor the top-k answer in real time. We propose an index-based algorithm that addresses this challenge and provides the exact answer. Specifically, we propose the CP-Graph, a hybrid index of graph and inverted file structures. The CP-Graph can efficiently compute the count of a given pattern and update the answer while pruning unnecessary patterns. Our experimental study on real datasets demonstrates the efficiency and scalability of our solution.
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 29, Issue: 10, 01 October 2017)
Page(s): 2249 - 2262
Date of Publication: 18 July 2017
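
To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of count-based ranking over sliding windows. It is not the CP-Graph index, whose details the abstract does not give. Several points are assumptions made for illustration only: each stream keeps its most recent window_size objects, a stream counts as having "generated" a pattern when some object in its current window contains every item of the pattern, patterns have at least two items, and a pattern is treated as closed when no proper superset has the same count. The class and method names are hypothetical.

```python
from collections import deque
from itertools import combinations


class SlidingWindowStreams:
    """Brute-force baseline for illustration; NOT the paper's CP-Graph."""

    def __init__(self, window_size):
        # Assumption: a count-based window holding the last `window_size` objects.
        self.window_size = window_size
        self.windows = {}  # stream id -> deque of frozenset objects

    def add(self, stream_id, items):
        """Append an object to a stream; the oldest object expires automatically."""
        window = self.windows.setdefault(stream_id, deque(maxlen=self.window_size))
        window.append(frozenset(items))

    def counts(self, min_size=2):
        """Map each pattern to the number of streams whose window contains it."""
        result = {}
        for window in self.windows.values():
            seen = set()  # patterns generated by this stream, counted once
            for obj in window:
                for size in range(min_size, len(obj) + 1):
                    for combo in combinations(sorted(obj), size):
                        seen.add(frozenset(combo))
            for pattern in seen:
                result[pattern] = result.get(pattern, 0) + 1
        return result

    def top_k_closed(self, k, min_size=2):
        """Return the k closed patterns with the highest counts."""
        counts = self.counts(min_size)
        # Assumed closedness test: no proper superset shares the same count.
        closed = [
            (pattern, count)
            for pattern, count in counts.items()
            if not any(pattern < other and count == counts[other] for other in counts)
        ]
        # Sort by descending count, breaking ties by item names for determinism.
        closed.sort(key=lambda pc: (-pc[1], sorted(pc[0])))
        return closed[:k]


streams = SlidingWindowStreams(window_size=2)
streams.add("s1", {"a", "b", "c"})
streams.add("s2", {"a", "b"})
streams.add("s3", {"a", "b", "c"})
print(streams.top_k_closed(2))
```

This sketch enumerates every subset of every object, so its cost grows exponentially with object size, and it recomputes all counts on each query. It serves only to show how counts, closedness, and ranks shift as objects enter and leave the windows, which is the dynamic behavior an index would need to handle incrementally.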
