The Session Initiation Protocol (SIP) is an important multimedia session establishment protocol used on the Internet. It is a text-based protocol that is complex to parse because its information elements can be represented in many different ways. Building a parser for SIP may appear straightforward; however, writing an efficient, robust, and scalable parser that is immune to low-effort attacks using malformed messages is surprisingly difficult. To mitigate this difficulty, self-learning systems based on Euclidean distance classifiers have been proposed to determine whether a message is well-formed. The efficacy of such machine learning algorithms must be studied on varied data sets before they can be used successfully. Our previous work has shown that Euclidean distance-based classifiers and standard classifiers used for self-learning problems are unable to detect malformed self-similar SIP messages (i.e., invalid SIP messages that differ by only a few bytes from normal SIP messages). This paper proposes using multiple classifier systems to detect malformed self-similar SIP messages. Our results show that a judiciously constructed multiple classifier system classifies as many as 97.56% of the messages correctly. We further show that, for self-similar SIP messages, feature reduction measures based on the first moment are insufficient for improving classification accuracy.
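The difficulty with self-similar messages can be illustrated with a minimal sketch. The feature representation used here (a normalized byte-frequency histogram) and the combiner (simple majority voting) are assumptions chosen for illustration; the abstract does not state which features or combination rule the paper uses, and the function names are hypothetical.

```python
import math
from collections import Counter


def byte_histogram(message: str) -> list[float]:
    """Map a message to a normalized 256-bin byte-frequency vector.

    Assumption: this feature choice is illustrative only.
    """
    data = message.encode("utf-8")
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero for empty input
    return [counts.get(b, 0) / total for b in range(256)]


def euclidean(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def majority_vote(predictions: list[str]) -> str:
    """Combine the labels of several base classifiers by majority vote.

    Assumption: voting is one common combiner, not necessarily the paper's.
    """
    return Counter(predictions).most_common(1)[0][0]


# A well-formed request and a self-similar variant that differs by one byte
# (the digit "0" in the request-line version is replaced by the letter "O").
VALID = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP host.example.com\r\n\r\n"
)
MALFORMED = VALID.replace("SIP/2.0\r\n", "SIP/2.O\r\n", 1)

# Distance between the two messages in the assumed feature space.
DISTANCE = euclidean(byte_histogram(VALID), byte_histogram(MALFORMED))
```

Under these assumptions, a one-byte substitution changes only two histogram bins, so `DISTANCE` is very small and a single distance-based classifier has little to separate the two messages. A multiple classifier system would instead pass the labels of several differently constructed base classifiers to a combiner such as `majority_vote`.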