The Session Initiation Protocol (SIP) is an important multimedia session establishment protocol used on the Internet. Because of the nature and deployment realities of the protocol (ASCII message representation, most deployments over UDP, limited use of message encryption), it is relatively easy to attack at the message level. To mitigate this, self-learning systems have been proposed to counteract new threats. However, the efficacy of existing machine learning algorithms must be studied on varied data sets before they can be used successfully. Existing literature indicates that Euclidean distance-based classifiers work well for detecting anomalous messages. Our work suggests that such classifiers do not produce adequate results for well-crafted malicious messages that differ only slightly from normal messages. To demonstrate this, we gather SIP traffic and minimally perturb it using 13 generic transforms to create malicious SIP messages. We use the Levenshtein distance, L, as a measure of similarity between normal and malicious SIP messages. We subject our dataset, consisting of malicious and normal SIP messages, to Euclidean distance-based classifiers as well as four standard classifiers. Our results for Euclidean distance-based classifiers on our dataset differ vastly from those reported in the current literature. We further see that the standard classifiers are better able to classify an anomalous message when L is small.
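As an illustration of the similarity measure L, the following is a minimal sketch of the standard dynamic-programming computation of the Levenshtein distance between two strings. The function name `levenshtein` and the two SIP request lines are hypothetical and chosen only for illustration; they are not taken from the dataset or the 13 transforms described above.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical SIP request lines (assumed for illustration only).
normal = "INVITE sip:bob@example.com SIP/2.0"
perturbed = "INVITE sip:bob@example.com SIP/2.1"
print(levenshtein(normal, perturbed))  # prints 1
```

A single-character change such as the one above gives L = 1, the kind of small perturbation for which a malicious message remains very close to a normal one.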