Real-time classification of Internet traffic by application type is vital for network management and surveillance. Identifying emerging applications by well-known port numbers is no longer reliable. While deep packet inspection (DPI) solutions can be accurate, they require constant signature updates and become infeasible for encrypted payloads, especially in multimedia applications (e.g., Skype). Statistical approaches based on machine learning have thus been considered more promising and robust to encryption, privacy, protocol obfuscation, etc. However, the computational complexity of traffic classification using these statistical solutions is high, which prevents them from being deployed in systems that must manage Internet traffic in real time. This paper proposes an FPGA-based parallel architecture to accelerate the statistical identification of multimedia applications while maintaining high classification accuracy. Specifically, we base our design on the k-Nearest Neighbors (k-NN) algorithm, which has been shown to be one of the most accurate machine learning algorithms for Internet traffic classification. To enable high-rate data streaming for real-time classification, we adopt locality-sensitive hashing (LSH) for approximate k-NN. The LSH scheme is carefully designed to achieve high accuracy while remaining efficient to implement on FPGA. Processing components in the architecture are optimized for high throughput. Extensive experiments and FPGA implementation results show that our design achieves accuracy above 99% in classifying three main categories of multimedia applications from Internet traffic while sustaining 80 Gbps throughput for minimum-size (40-byte) packets.
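
To illustrate the general idea of LSH-based approximate k-NN classification, the following is a minimal software sketch, not the FPGA design described above. The abstract does not specify which LSH family is used, so this sketch assumes random-projection (p-stable) hashing for Euclidean distance. The class name `LshKnnClassifier`, its parameters (`num_tables`, `hashes_per_table`, `bucket_width`), the class labels, and the synthetic two-dimensional feature vectors are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter, defaultdict


class LshKnnClassifier:
    """Approximate k-NN via random-projection LSH (illustrative assumption)."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4,
                 bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.bucket_width = bucket_width
        self.points = []  # (vector, label) pairs, also used for exact fallback
        self.tables = []  # one (hash functions, buckets) pair per table
        for _ in range(num_tables):
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)],
                 rng.uniform(0.0, bucket_width))
                for _ in range(hashes_per_table)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, x):
        # h(x) = floor((a . x + b) / w), concatenated over the table's hashes.
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b)
                       / self.bucket_width)
            for a, b in funcs
        )

    def add(self, x, label):
        idx = len(self.points)
        self.points.append((tuple(x), label))
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, x)].append(idx)

    def predict(self, x, k=3):
        if not self.points:
            raise ValueError("classifier has no training points")
        # Candidates are points sharing a bucket with x in at least one table.
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, x), ()))
        if not candidates:
            candidates = range(len(self.points))  # fall back to exact search
        # Exact distances are computed only on the (usually small) candidate set.
        nearest = sorted(
            candidates, key=lambda i: math.dist(x, self.points[i][0])
        )[:k]
        votes = Counter(self.points[i][1] for i in nearest)
        return votes.most_common(1)[0][0]


# Toy usage with synthetic 2-D feature vectors (hypothetical data).
rng = random.Random(1)
clf = LshKnnClassifier(dim=2)
centers = {"class_a": (0.0, 0.0), "class_b": (100.0, 0.0), "class_c": (0.0, 100.0)}
for label, (cx, cy) in centers.items():
    for _ in range(30):
        clf.add((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)), label)
print(clf.predict((99.5, 0.5)))
```

The point of the sketch is that a query computes exact distances only against points that collide with it in some hash table, rather than against the whole training set. Using more tables raises the chance of finding true neighbors, while using more hash functions per table shrinks the candidate set. How the paper balances these choices for hardware is not stated in the abstract.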