Web users are unknowingly becoming more susceptible to scams from cyber deviants and malicious websites. Much work has gone into identifying malicious websites using application-layer features based on content (HTML, images, links, etc.) and a plethora of classification techniques. However, little work has used features from the other layers of the Open Systems Interconnection (OSI) network stack. Capturing features from the transport and Internet layers of the network stack, based on responses to various Hypertext Transfer Protocol (HTTP) requests, may allow for increased classification accuracy. In this paper, we use learning techniques (Winnow, Logit Regression, Naïve Bayes, J48, and Bayesian) that utilize these new features to identify fake pharmacy websites. The results show that transport and Internet layer features yield an accuracy of 80% to 95% for detecting fake websites with standard machine learning algorithms. The results also suggest that many organizations may be hosting multiple websites using shared code and hosting services, enabling them to produce the maximum number of fraudulent websites.
Date of Conference: 11-14 June 2012