Network traffic characterisation and modelling using time series models has been studied extensively. Coarse-grained (aggregated traffic) time series analysis using a parametric approach, carried out primarily on backbone networks over long periods (of the order of days to months), shows strong deterministic cyclic trends, while its fine-grained (packet- or flow-level) counterpart, done mostly on edge networks over short periods (of the order of a few minutes), exhibits self-similar behaviour. This paper studies the fine-grained time series characteristics of network traffic at an edge network, observed over a long period (of the order of days and weeks), using a parametric approach. The analysis is carried out in the context of anomaly detection. Most earlier attempts in this direction followed a non-parametric approach, using either adaptive or non-adaptive (i.e., assuming stationarity) mechanisms, whose performance is extremely sensitive to empirically determined model parameters that are difficult to determine. In addition, the model parameters need to be recomputed at regular intervals (of the order of a few seconds to minutes). To some extent, this makes such algorithms less attractive in terms of generality and practical implementation. The first part of the paper discusses the statistical characteristics of such long-range network time series. These are found to exhibit structural breaks in addition to transient shocks, and can be approximated by a stationary AR model after an absolute first difference transformation (i.e., decoupling the stationary component from the non-stationary one). In the latter part of the paper, the efficacy of the proposed model is evaluated through extensive trace-driven simulations for the detection of low-intensity TCP SYN flood Denial of Service (DoS) attacks.
Performance is measured in terms of false positives, false alarm time, detection rate and detection delay. Experiments are performed on actual traffic traces collected from one of the edge networks over a period of three months, for various sampling intervals (10 s, 60 s, 120 s). Comparative studies with adaptive and non-adaptive methods are carried out to demonstrate the relevance of the proposed model. The proposed method is observed to perform better, achieving 100% detection accuracy at a false positive rate as low as 0.9%.
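
As a rough illustration of the pipeline summarised above (absolute first difference, a stationary AR fit, then flagging large deviations), the following Python sketch may help. It is not the authors' implementation: the function names, the AR order of 2, the least-squares fit, and the one-sided threshold of k standard deviations of the training residuals are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def abs_first_difference(x):
    """Return |x[t] - x[t-1]| for each consecutive pair of samples."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.diff(x))

def fit_ar(y, order):
    """Least-squares fit of y[t] = c + sum_i a[i] * y[t-1-i]; returns (c, a)."""
    y = np.asarray(y, dtype=float)
    # Each row holds the `order` previous values, most recent first.
    rows = [y[t - order:t][::-1] for t in range(order, len(y))]
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(X, y[order:], rcond=None)
    return coef[0], coef[1:]

def ar_residuals(y, c, a):
    """One-step-ahead prediction errors of the fitted AR model on y."""
    y = np.asarray(y, dtype=float)
    order = len(a)
    preds = np.array([c + a @ y[t - order:t][::-1]
                      for t in range(order, len(y))])
    return y[order:] - preds

def detect(train_counts, test_counts, order=2, k=4.0):
    """Return indices into test_counts whose residual exceeds the threshold.

    order and k are illustrative assumptions, not values from the paper.
    """
    train_d = abs_first_difference(train_counts)
    c, a = fit_ar(train_d, order)
    threshold = k * ar_residuals(train_d, c, a).std()
    res = ar_residuals(abs_first_difference(test_counts), c, a)
    # Residual j belongs to difference order + j, i.e. sample order + j + 1.
    return [order + j + 1 for j, r in enumerate(res) if r > threshold]
```

Here `train_counts` and `test_counts` stand for per-interval counts (for example, SYN packets per sampling interval) from attack-free and monitored traffic respectively. A sudden jump in the monitored series produces a large absolute difference that the AR model fitted on normal traffic fails to predict, which is what triggers the alarm in this sketch.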