An Overview of Web Robots Detection Techniques | IEEE Conference Publication | IEEE Xplore


Abstract:

Web robots, or web crawlers, have become the major source of web traffic. While some robots are well behaved, such as those of search engines, others can perform DDoS attacks, which pose a serious threat to websites. Effective web robot detection benefits not only network traffic cleaning but also the cybersecurity of IoT-enabled systems and services. To capture the state of the art in web robot detection, this paper reviews the past decade of research on web robot/crawler detection techniques, compares their performance, and identifies the challenges each technique faces, giving researchers a reference for developing web robot detection in real applications. Researchers have investigated various approaches to protecting web content from malicious web robots, and these can be classified into three themes: offline web log analysis, honeypots, and online robot detection. We conclude that offline web log analysis methods achieve quite high accuracy but are time-consuming compared with online detection methods. Honeypots, a computer security mechanism, can be used to engage and deceive hackers and to identify malicious activity performed over the Internet, but they may block legitimate robots. The review shows that a hybrid method performs better than an individual classifier and that the performance of online web robot detection needs to be improved. In addition, different types of features can play different roles in different machine learning models, so feature selection is important for web robot/crawler detection.
Date of Conference: 15-19 June 2020
Date Added to IEEE Xplore: 13 July 2020
Conference Location: Dublin, Ireland
