Message Level SIP Filtering (SIP_LEX)

The Session Initiation Protocol (SIP) is at the root of many session-based applications, such as VoIP and media streaming, that are used by a growing number of users and organizations. The increasing availability and use of such applications calls for careful attention to the possibility of transferring malformed, incorrect, or malicious SIP messages, as they can cause problems ranging from relatively innocuous disturbances to full-blown attacks and fraud.

To this end, SIP_LEX, a novel multi-stage filtering architecture, has been developed. It analyses SIP messages and classifies them as "good" or "bad" depending on whether their structure and content are deemed acceptable. The first stage is a lexical analyzer that weeds out SIP messages with syntax errors. The second stage is based on machine learning techniques, specifically a Support Vector Machine (SVM), and detects harmful SIP messages with semantic errors by learning from previous experience.

Download & Install SIP_LEX

Please download the latest version of 'SIP_LEX' from the GitHub directory and unzip the compressed file. The compressed file contains:

  • The jar file of the 'SIP_LEX' application, named 'SIP_LEX.jar'.
  • A configuration file named “input.properties” containing the list of parameters that should be configured before running SIP_LEX.
  • A sample input file named 'SIP_MSG.txt' containing a set of SIP messages. The set consists of 5000 SIP messages with a balanced mix of 'good' and 'bad' messages. This sample file is located under the folder named “Incoming SIP Messages”.
  • A sample training set used to build the model by training an SVM classifier. The training set contains the vectors of around 500 SIP messages, with a balanced mix of "good" and "bad" messages.

Configure SIP_LEX

The first step in using the SIP_LEX application is to configure the input parameters defined in the configuration file “input.properties”. The following parameters are mandatory:

  • INPUT_FILE_PATH - Path of the folder containing the input files with SIP messages
  • OUTPUT_SYNTAX_ERROR_LIST - Path of the output file containing the list of SIP messages with syntax errors
  • OUTPUT_SYNTAX_ERROR_LIST - Path of the output file containing the list of syntactically well-formed SIP messages
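
Before launching the application, it can help to check that the mandatory keys are actually set. The following is a minimal Python sketch, not part of SIP_LEX itself. It assumes that “input.properties” uses plain key=value lines, and the sample values are hypothetical. Only parameter names listed above are checked.

```python
import io

# Parameter names taken from the list above.
REQUIRED_KEYS = ("INPUT_FILE_PATH", "OUTPUT_SYNTAX_ERROR_LIST")


def load_properties(stream):
    """Parse simple key=value lines, skipping blank lines and # or ! comments.

    Assumption: only the key=value form is handled, not the full
    Java .properties syntax (escapes, ':' separators, line continuations).
    """
    props = {}
    for raw in stream:
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [k for k in required if not props.get(k)]


# Hypothetical file content; adjust the values to your own layout.
sample = io.StringIO(
    "# SIP_LEX settings\n"
    "INPUT_FILE_PATH=Incoming SIP Messages\n"
    "OUTPUT_SYNTAX_ERROR_LIST=syntax_errors.txt\n"
)
props = load_properties(sample)
print(missing_keys(props))
```

To check a real file, pass an open file object instead of the in-memory sample, for example `load_properties(open("input.properties"))`.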

Input of SIP_LEX

The input of SIP_LEX is a set of SIP messages, which can be stored in a single file or in multiple files. Users should store the input files in a folder and specify the folder name as the parameter “INPUT_FILE_PATH”. The end of each SIP message is indicated by the text “$ENDMSG$” followed by a newline.
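
To illustrate the input format, the following Python sketch splits the content of an input file into individual messages using the “$ENDMSG$” marker. It is only an illustration of the delimiter convention, not the parser used inside SIP_LEX, and the two sample messages are abbreviated placeholders rather than complete SIP messages.

```python
END_MARKER = "$ENDMSG$"


def split_messages(text):
    """Return the SIP messages found in text, one string per message.

    Simplifications: leading and trailing line breaks of each message are
    dropped, and any non-empty text after the last marker is still returned
    as a message.
    """
    messages = []
    for chunk in text.split(END_MARKER):
        msg = chunk.strip("\r\n")
        if msg.strip():
            messages.append(msg)
    return messages


# Abbreviated placeholder messages, each terminated by the marker and a newline.
sample = (
    "OPTIONS sip:user@example.com SIP/2.0\n"
    "Max-Forwards: 70\n"
    "$ENDMSG$\n"
    "SIP/2.0 200 OK\n"
    "$ENDMSG$\n"
)
messages = split_messages(sample)
print(len(messages))
```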

Output of SIP_LEX

SIP_LEX produces two output files:

  • List of messages with syntax errors
  • List of syntactically well-formed SIP messages

Run SIP_LEX

  • Set the PATH variable: export PATH=$PATH:"Location of the .jar file"
  • Browse to the SIP_LEX application folder
  • Run the jar file of the application: java -jar SIP_LEX.jar

Experiment – SIP message filtering with SIP_LEX and SVM classifiers

Stage 1 : Filtering messages with syntax error using SIP_LEX

  • Configure the input parameters of SIP_LEX
  • Run SIP_LEX
  • The outputs of SIP_LEX are two files: (i) messages with syntax errors, and (ii) the vectors of syntactically well-formed messages.

Stage 2 : Filtering messages with semantic error using SVM classifiers

SVM classifiers perform the second level of filtering on the syntactically well-formed messages to identify messages with semantic errors and harmful content. Here, we have used LibSVM, a widely used support vector machine library that provides efficient and fast classification of input data.

For the experiment

  • Test set - The output file of the SIP_LEX containing the vectors of syntactically well-formed SIP messages.
  • Training set - A set of SIP messages consisting of a balanced mix of “good” and “bad” messages. The training set used to train the SVM for SIP message classification can be found in the GitHub directory.
  • To build the model, the parameters of the SVM should be configured as follows:
    • Soft margin constant, C = 1
    • Kernel function, t = 1 (polynomial kernel)
    • Parameters of polynomial kernel :
      • degree, d = 2
      • gamma, g = 1
      • coefficient, r = 1
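
With these settings, LibSVM's polynomial kernel evaluates (g * u·v + r)^d for two feature vectors u and v. The Python sketch below computes that value by hand and shows how the same parameters map onto the options of LibSVM's svm-train command. The file names train.txt and sip.model are hypothetical placeholders, and the sketch does not run LibSVM itself.

```python
def poly_kernel(u, v, gamma=1.0, coef0=1.0, degree=2):
    """Polynomial kernel as defined by LibSVM: (gamma * u.v + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree


# Equivalent svm-train options: -t kernel type, -d degree, -g gamma,
# -r coef0, -c cost (soft margin constant C).
# "train.txt" and "sip.model" are hypothetical file names.
train_args = [
    "svm-train",
    "-t", "1",
    "-d", "2",
    "-g", "1",
    "-r", "1",
    "-c", "1",
    "train.txt",
    "sip.model",
]

# (1*3 + 2*4) = 11, so the kernel value is (1 * 11 + 1) ** 2 = 144.
print(poly_kernel([1, 2], [3, 4]))
print(" ".join(train_args))
```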

SIP Message Generator (SIP-Msg-Gen)

Performance evaluation of a SIP message filtering architecture relies on large-scale SIP traces. However, reliable real-world VoIP traces are not always available, because VoIP providers are unwilling to distribute their data due to user privacy agreements. Moreover, VoIP traces containing attack information are rare. For this reason, we have developed 'SIP-Msg-Gen', a synthetic SIP message generator for evaluating the performance of SIP parsers and malformed SIP message detection systems. The generator produces malformed SIP messages following the SIP torture test messages defined in RFC 4475, and well-formed SIP messages following the SIP grammar defined in RFC 3261.

Download & Install 'SIP-Msg-Gen'

Please download the latest version of 'SIP-Msg-Gen' from the GitHub directory and unzip the compressed file. The compressed file contains:

  • The jar file of the 'SIP-Msg-Gen' application, named 'SIP Malformed Msg Generator.jar'.
  • A configuration file named “input.properties” containing the list of parameters that should be configured before running SIP-Msg-Gen.

Configure SIP-Msg-Gen

The first step in using the SIP-Msg-Gen application is to configure the input parameters defined in the configuration file “input.properties”. The mandatory parameters are:

  • OUTPUT_FILE_NAME - Path of the output file to which the generated SIP messages are written
  • NUMBER_OF_MESSAGE_TYPE - Number of message types to generate. Value range: 1 to 14. The default value is 14, covering 14 types of messages, including both well-formed and malformed INVITE, REGISTER, CANCEL, BYE, ACK, and OPTIONS request messages as well as response messages.
  • NUMBER_OF_MESSAGE_SCENARIO - Number of implemented scenarios used for generating malformed messages. Value range: 1 to 19.
  • MESSAGE_NUMBER - Number of messages to be generated
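
The value ranges above can be checked before starting the generator. The following Python sketch is one possible way to do so and is not part of SIP-Msg-Gen. It assumes the settings have already been read into a dictionary of strings. The lower bound of 1 for MESSAGE_NUMBER is an assumption, since no range is documented for it.

```python
# Documented ranges; None means no documented upper bound.
RANGES = {
    "NUMBER_OF_MESSAGE_TYPE": (1, 14),
    "NUMBER_OF_MESSAGE_SCENARIO": (1, 19),
    "MESSAGE_NUMBER": (1, None),  # assumption: at least one message
}


def validate(cfg):
    """Return a list of problems found in cfg; an empty list means it looks usable."""
    problems = []
    if not cfg.get("OUTPUT_FILE_NAME"):
        problems.append("OUTPUT_FILE_NAME is missing")
    for key, (low, high) in RANGES.items():
        raw = cfg.get(key)
        try:
            value = int(raw)
        except (TypeError, ValueError):
            problems.append(f"{key} must be an integer, got {raw!r}")
            continue
        if value < low or (high is not None and value > high):
            upper = high if high is not None else "unbounded"
            problems.append(f"{key}={value} is outside {low}..{upper}")
    return problems


# Hypothetical settings for illustration.
good = {
    "OUTPUT_FILE_NAME": "SIP_MSG.txt",
    "NUMBER_OF_MESSAGE_TYPE": "14",
    "NUMBER_OF_MESSAGE_SCENARIO": "19",
    "MESSAGE_NUMBER": "5000",
}
bad = dict(good, NUMBER_OF_MESSAGE_TYPE="15")

print(validate(good))
print(validate(bad))
```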

Run SIP-Msg-Gen

After downloading 'SIP-Msg-Gen', users need to extract the compressed file.

  1. Set the PATH variable: export PATH=$PATH:"Location of the .jar file"
  2. Run the jar file: java -jar "SIP Malformed Msg Generator.jar"
  3. Output file: SIP_MSG.txt