Sequential anomaly detection in a batch with growing number of tests: Application to network intrusion detection

3 Author(s)
Miller, D.J. (Dept. of Electr. Eng., Pennsylvania State Univ., University Park, PA, USA); Kocak, F.; Kesidis, G.

For high (N)-dimensional feature spaces, we consider detection of an unknown, anomalous class of samples amongst a batch of collected samples (of size T), under the null hypothesis that all samples follow the same probability law. Since the features which will best identify the anomalies are a priori unknown, two common detection strategies are: 1) evaluating atypicality of a sample (its p-value) based on the null distribution defined on the full N-dimensional feature space; 2) considering a (combinatoric) set of low order distributions, e.g., all singletons and all feature pairs, with detections made based on the smallest p-value yielded over all such low order tests. The first approach relies on accurate estimation of the joint distribution, while the second may suffer from increased false alarm rates as N and T grow. Alternatively, inspired by greedy feature selection commonly used in supervised learning, we propose a novel sequential anomaly detection procedure with a growing number of tests. Here, new tests are (greedily) included only when they are needed, i.e., when their use (on currently undetected samples) will yield greater aggregate statistical significance of (multiple testing corrected) detections than obtainable using the existing test cadre. Our approach thus aims to maximize aggregate statistical significance of all detections made up until a finite horizon. Our method is evaluated, along with supervised methods, for a network intrusion domain, detecting Zeus bot (intrusion) packet flows embedded amongst (normal) Web flows. It is shown that judicious feature representation is essential for discriminating Zeus from Web.
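
As a rough illustration of strategy 2 restricted to singleton (single-feature) tests, the Python sketch below flags samples whose smallest per-feature p-value survives a multiple testing correction. The Gaussian null on each feature, the Bonferroni correction, and every name used (`two_sided_gaussian_pvalue`, `min_pvalue_detections`, `alpha`) are assumptions made for illustration only. The abstract does not state which null model or correction is used, and this sketch is not the proposed sequential procedure.

```python
import math


def two_sided_gaussian_pvalue(x, mean, std):
    """Two-sided p-value of x under an assumed N(mean, std^2) null."""
    z = abs(x - mean) / std
    return math.erfc(z / math.sqrt(2.0))


def min_pvalue_detections(batch, means, stds, alpha=0.05):
    """Flag samples via the smallest singleton p-value (hypothetical helper).

    batch: list of T samples, each a list of N feature values.
    means, stds: assumed per-feature null parameters (length N).
    Returns a list of (sample_index, corrected_p_value) for detections.
    """
    n_tests = len(means) * len(batch)  # N singleton tests on each of T samples
    detections = []
    for t, sample in enumerate(batch):
        # Smallest p-value over all single-feature tests for this sample.
        p_min = min(
            two_sided_gaussian_pvalue(x, m, s)
            for x, m, s in zip(sample, means, stds)
        )
        # Bonferroni correction (an assumption): scale by the number of tests.
        corrected = min(1.0, p_min * n_tests)
        if corrected <= alpha:
            detections.append((t, corrected))
    return detections
```

In this sketch the correction factor is N times T, so every added feature or sample makes each individual detection harder to declare significant. Under these assumptions, that is one way to see the cost of running all low order tests indiscriminately.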

Published in:

Machine Learning for Signal Processing (MLSP), 2012 IEEE International Workshop on

Date of Conference:

23-26 Sept. 2012