Using Multiclass Machine Learning Methods to Classify Malicious Behaviors Aimed at Web Systems

3 Author(s)
Katerina Goseva-Popstojanova ; Lane Dept. of Comput. Sci. & Electr. Eng., West Virginia Univ., Morgantown, WV, USA ; Goce Anastasovski ; Risto Pantev

The number of vulnerabilities and attacks on Web systems shows an increasing trend, and such attacks tend to dominate on the Internet. Furthermore, due to their popularity and users' ability to create content, Web 2.0 applications have become particularly attractive targets. These trends clearly illustrate the need for a better understanding of malicious cyber activities based on both qualitative and quantitative analysis. This paper focuses on multiclass classification of malicious Web activities using three supervised machine learning methods: J48, PART, and Support Vector Machines (SVM). The empirical analysis is based on data collected over nine months by a high-interaction honeypot consisting of a three-tier Web system, which included Web 2.0 applications (i.e., a blog and a wiki). Our results show that supervised learning methods can be used to efficiently distinguish among multiple vulnerability scan and attack classes, with high recall and precision values for all but several very small classes. For our dataset, the decision-tree-based methods J48 and PART perform slightly better than SVM in terms of overall accuracy and weighted recall. Additionally, J48 and PART require fewer than half of the features (i.e., session attributes) used by SVM, and they execute much faster. Therefore, they appear to be the methods of choice.
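
The evaluation measures named in the abstract (per-class recall and precision, overall accuracy, and weighted recall) can be illustrated with a minimal sketch. It is not the authors' code or data: the class labels and predictions below are hypothetical, and weighted recall is assumed here to mean per-class recall weighted by the number of true instances in each class, which may differ from the paper's definition.

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Return precision, recall, and support for every class label."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fn[t] += 1  # a true instance of class t was missed
            fp[p] += 1  # class p was predicted incorrectly
    metrics = {}
    for c in labels:
        support = tp[c] + fn[c]    # true instances of class c
        predicted = tp[c] + fp[c]  # instances predicted as class c
        metrics[c] = {
            "precision": tp[c] / predicted if predicted else 0.0,
            "recall": tp[c] / support if support else 0.0,
            "support": support,
        }
    return metrics

def overall_accuracy(y_true, y_pred):
    """Fraction of sessions whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_recall(metrics):
    """Per-class recall weighted by class support (assumed definition)."""
    total = sum(m["support"] for m in metrics.values())
    return sum(m["recall"] * m["support"] for m in metrics.values()) / total

# Hypothetical session labels: one scan class and two attack classes.
y_true = ["scan", "scan", "scan", "scan", "attack_a", "attack_a", "attack_b"]
y_pred = ["scan", "scan", "scan", "attack_a", "attack_a", "attack_a", "scan"]

metrics = per_class_metrics(y_true, y_pred)
for label, m in metrics.items():
    print(label, m)
print("accuracy:", overall_accuracy(y_true, y_pred))
print("weighted recall:", weighted_recall(metrics))
```

In this toy example the smallest class (`attack_b`, a single session) gets zero recall and precision, which mirrors the abstract's remark that very small classes are the hardest to classify well.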

Published in:

2012 IEEE 23rd International Symposium on Software Reliability Engineering

Date of Conference:

27-30 Nov. 2012