Machine Learning (ML) for classifying IP traffic has relied on the analysis of statistics of full flows or their first few packets only. However, automated QoS management for interactive traffic flows requires quick and timely classification well before the flows finish. Also, interactive flows are often long-lived and should be continuously monitored during their lifetime. We propose to achieve this by using statistics derived from sub-flows—a small number of most recent packets taken at any point in a flow's lifetime. Then, the ML classifier must be trained on a set of sub-flows, and we investigate different sub-flow selection strategies. We also propose to augment training datasets so that classification accuracy is maintained even when a classifier mixes up client-to-server and server-to-client directions for applications exhibiting asymmetric traffic characteristics. We demonstrate the effectiveness of our approach with the Naive Bayes and C4.5 Decision Tree ML algorithms, for the identification of first-person-shooter online game and VoIP traffic. Our results show that we can classify both applications with up to 99% Precision and 95% Recall within less than 1 s. Stable results are achieved regardless of where within a flow the classifier captures the packets and the traffic direction.
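The two ideas above, sub-flow statistics and direction-augmented training data, can be illustrated with a minimal sketch. It is not the authors' implementation: the `Packet` type, the chosen features (packet count, mean and standard deviation of packet length, mean inter-arrival time per direction), and the function names are all assumptions made for illustration. A sub-flow is taken as the `n` most recent packets ending at an arbitrary point in the flow, and each training instance can be paired with a mirrored copy in which the two directions are swapped.

```python
from statistics import mean, pstdev
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical packet record; field names are illustrative assumptions.
    timestamp: float  # arrival time in seconds
    length: int       # packet length in bytes
    forward: bool     # True if seen as client-to-server by the classifier


def direction_stats(packets):
    """Summary statistics for the packets of one direction."""
    lengths = [p.length for p in packets]
    times = [p.timestamp for p in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "count": len(lengths),
        "mean_len": mean(lengths) if lengths else 0.0,
        "std_len": pstdev(lengths) if lengths else 0.0,
        "mean_iat": mean(gaps) if gaps else 0.0,
    }


def subflow_features(flow, end, n):
    """Features of the sub-flow made of the n packets before index `end`."""
    window = flow[max(0, end - n):end]
    fwd = direction_stats([p for p in window if p.forward])
    bwd = direction_stats([p for p in window if not p.forward])
    features = {f"fwd_{name}": value for name, value in fwd.items()}
    features.update({f"bwd_{name}": value for name, value in bwd.items()})
    return features


def mirrored(features):
    """Copy of a feature vector with the two directions swapped.

    Adding this copy to the training set is one way to keep a classifier
    usable when it cannot tell which endpoint is the client.
    """
    swapped = {}
    for name, value in features.items():
        if name.startswith("fwd_"):
            swapped["bwd_" + name[len("fwd_"):]] = value
        else:
            swapped["fwd_" + name[len("bwd_"):]] = value
    return swapped


def training_instances(flow, n, step):
    """Sub-flow instances sampled every `step` packets, plus mirrored copies."""
    instances = []
    for end in range(n, len(flow) + 1, step):
        features = subflow_features(flow, end, n)
        instances.append(features)
        instances.append(mirrored(features))
    return instances
```

Sampling sub-flows at a fixed `step` is only one possible selection strategy. The resulting feature dictionaries could be fed to any standard classifier, such as Naive Bayes or a decision tree.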