With the rapid development of the internet, information security threats have become one of the most critical problems. Network Intrusion Detection (NID) is an indispensable part of the overall information security protection system, so research on NID is important for advancing network techniques and further improving the efficiency of internet utilization. NID aims to distinguish normal behavior from attacks on the network and, furthermore, to distinguish different types of attacks from one another. In recent years, applying the theories and methods of machine learning to NID has become an active research area. In this thesis, motivated by the practical application of NID, we study several prevailing machine learning methods, including supervised learning, unsupervised learning, and active learning. The main contributions of this thesis are as follows:

1. We design and implement a Network Intrusion Detection System (NIDS) based on the Adaboost algorithm. With an improved setting of the initial weights and a simple strategy to avoid overfitting, our Adaboost-based NIDS achieves a very low false positive rate while keeping a relatively high detection rate. The NIDS also has low computational complexity, which makes it possible to retrain the system frequently in complicated and changing network environments. It is therefore promising for future practical use.

2. We design and implement NIDSs based on unsupervised learning, using the multiclass spectral clustering algorithm and the dominant set clustering algorithm respectively. A voting strategy is used to offset, to some extent, the low clustering accuracy caused by the size limit on the cases being clustered. However, the objective experimental results show that using either of these two algorithms on its own is not a good choice for NID.

3. We propose an active learning framework based on unsupervised learning and implement a hierarchical clustering active learning system. In our system, dominant set clustering and spectral clustering are designed to complement each other, leading to a large reduction in human labeling effort and competitive detection results. The advantages of our active learning framework include the following: 1) it can be easily extended; 2) it is highly flexible; 3) it can, to a large extent, solve the semantic gap problem; 4) it simulates the human learning process. We believe that, with these strengths, the proposed active learning framework merits further development.

In summary, this thesis makes a number of fruitful attempts and significant progress in research on machine learning and its application to network intrusion detection.
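
To make the first contribution more concrete, the following is a minimal sketch of AdaBoost over decision stumps in which the initial sample weights are set per class rather than uniformly. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not specify how the initial weights are improved, so the `normal_mass` parameter (the share of initial weight given to normal traffic) is a hypothetical stand-in, as are all function names and the toy data. The overfitting-avoidance strategy mentioned above is not reproduced here.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Predict +1 (attack) or -1 (normal) from a single feature threshold."""
    return polarity if row[feature] > threshold else -polarity


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if stump_predict(row, feature, threshold, polarity) != yi
                )
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds=10, normal_mass=0.5):
    """Train AdaBoost; labels are +1 (attack) and -1 (normal).

    normal_mass is a hypothetical knob: the fraction of the total initial
    weight assigned to normal samples (assumption, not from the thesis).
    """
    n_normal = sum(1 for yi in y if yi == -1)
    n_attack = len(y) - n_normal
    if n_normal == 0 or n_attack == 0:
        raise ValueError("both classes must be present")
    # Class-dependent initial weights; they sum to 1 overall.
    w = [
        normal_mass / n_normal if yi == -1 else (1.0 - normal_mass) / n_attack
        for yi in y
    ]
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = train_stump(X, y, w)
        if err >= 0.5:  # no stump is better than chance; stop early
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, feature, threshold, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(
        alpha * stump_predict(row, feature, threshold, polarity)
        for alpha, feature, threshold, polarity in ensemble
    )
    return 1 if score > 0 else -1


# Toy data (made up for illustration): feature 0 separates the two classes.
X = [[1.0, 0.2], [2.0, 0.1], [3.0, 0.9], [8.0, 0.3], [9.0, 0.8], [10.0, 0.4]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost(X, y, rounds=5, normal_mass=0.6)
```

In this sketch, raising `normal_mass` above 0.5 makes early rounds penalize errors on normal traffic more heavily, which is one plausible way to push the false positive rate down; whether this matches the weighting used in the thesis is an assumption.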