In networks carrying large volumes of traffic, accurate traffic characterization is necessary for understanding the dynamics and patterns of network resource usage. Previous approaches to flow characterization rely on random sampling of packets (e.g., Cisco's NetFlow), on inferring characteristics solely from long-lived flows (LLFs), or on lossy data structures (e.g., Bloom filters, hash tables). However, none of these approaches takes into account the heavy-tailed nature of Internet traffic, and all of them separate the estimation algorithm from the flow measurement architecture. In this paper, we propose an alternative approach to traffic characterization that closely links the flow measurement architecture with the estimation algorithm. Our measurement framework stores complete information about short-lived flows (SLFs) while collecting partial information about LLFs. For real-time separation of LLFs and SLFs, we propose a novel algorithm based on typical sequences from information theory. The distribution (pdf) and sample space of the underlying traffic are estimated using the non-parametric Parzen window technique and a likelihood function defined over the coupon collector problem. We validate the accuracy and performance of our estimation technique using traffic traces from the internal LAN in our laboratory and from the National Laboratory for Applied Network Research (NLANR).
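As a rough illustration of the Parzen window technique mentioned above, the following minimal sketch estimates a density from a handful of flow sizes. The abstract does not specify a kernel, a bandwidth, or a feature space, so the Gaussian kernel, the bandwidth `h`, the log-scaled flow sizes, and the name `parzen_pdf` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def parzen_pdf(samples, x, h):
    """Parzen window density estimate at x (Gaussian kernel, bandwidth h).

    The kernel choice and bandwidth are illustrative assumptions.
    """
    if not samples or h <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    # Each sample contributes one Gaussian bump of width h centred on itself.
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)


# Hypothetical flow sizes in packets: many short flows, a few long ones.
flow_sizes = [1, 2, 2, 3, 3, 3, 5, 8, 40, 120]

# Work on a log scale so the heavy tail does not dominate the bandwidth.
log_sizes = [math.log10(s) for s in flow_sizes]

density_small = parzen_pdf(log_sizes, math.log10(3), h=0.3)
density_large = parzen_pdf(log_sizes, math.log10(60), h=0.3)
print(f"density near 3 packets:  {density_small:.3f}")
print(f"density near 60 packets: {density_large:.3f}")
```

With this toy data the estimate is higher around small flow sizes than around large ones, which is the kind of shape a heavy-tailed flow-size distribution would be expected to show. The bandwidth controls how smooth the estimate is and would need tuning on real traces.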