Date of Award


Document Type


Degree Name

Master of Science (MS)


Information Security

First Advisor

Dr. Mohammad Mehedy Masud

Second Advisor

Dr. Zouheir Trabelsi

Third Advisor

Dr. Amjad Gawanmeh


Intrusion detection systems (IDSs) such as Snort apply deep packet inspection to detect intrusions. Usually, these are rule-based systems, where each incoming packet is matched with a set of rules. Each rule consists of two parts: the rule header and the rule options. The rule header is compared with the packet header. The rule options usually contain a signature string that is matched with packet content using an efficient string matching algorithm. The traditional approach to IDS packet inspection checks a packet against the detection rules by scanning from the first rule in the set and continuing to scan all the rules until a match is found. This approach becomes inefficient if the number of rules is too large and if the majority of the packets match with rules located at the end of the rule set. In this thesis, we propose an intelligent predictive technique for packet inspection based on data mining. We consider each rule in a rule set as a ‘class’. A classifier is first trained with labeled training data. Each such labeled data point contains packet header information, packet content summary information, and the corresponding class label (i.e. the rule number with which the packet matches). Then the classifier is used to classify new incoming packets. The predicted class, i.e. rule, is checked against the packet to see if this packet really matches the predicted rule. If it does, the corresponding action (i.e. alert) of the rule is taken. Otherwise, if the prediction of the classifier is wrong, we go back to the traditional way of matching rules. The advantage of this intelligent predictive packet matching is that it offers much faster rule matching. We have proved, both analytically and empirically, that even with millions of real network traffic packets and hundreds of rules, the classifier can achieve very high accuracy, thereby making the IDS several times faster in making matching decisions.