Distributed online averaged one dependence estimator algorithm for network anomaly detection systems
Abstract
Network anomaly detection systems (NADS) are widely used to monitor applications that involve streaming data, such as the Internet of Things (IoT), by distinguishing normal from anomalous traffic in the network. Network data are streaming data: they are updated frequently because traffic arrives quickly, so detecting patterns requires a fast-learning classification algorithm. Hence, this research used an online averaged one dependence estimator (AODE) algorithm for binary and multi-class classification of large streaming data, learning quickly so that the classifier is always up to date. Furthermore, to deal with the large amount of data and with centralization issues, the classifier needs to be developed as a distributed classification algorithm that detects patterns of network traffic using a network dataset. Therefore, this thesis developed a network anomaly detection system using a distributed online AODE (DOAODE) algorithm and the UNSW-NB15 dataset, addressing several issues: large-scale classification, frequently updated data, and centralization. The DOAODE algorithm detects network attacks through the collaboration of several stations. The local classifiers at the nodes are then combined by majority voting to form a global classifier that makes the final prediction on network traffic. The first finding from the experiments is that the online AODE algorithm achieves high accuracy: 97.26% for binary classification and 83.32% for multi-class classification. In addition, the online classifier learns faster than a batch classifier: online AODE took less than 10 seconds for both binary and multi-class classification of the network dataset.
Second, the proposed DOAODE algorithm obtained high accuracy that did not diverge much from that of the centralized algorithm: accuracy was in the range of 95% to 97% for binary classification and approximately 83% for multi-class classification. Although the accuracy of the DOAODE algorithm degraded, the results remain comparable to those of the centralized classifier, and the distributed architecture avoids a single point of failure in the network system because the nodes share their knowledge with one another and all operate at the same level, so each of them can make a prediction on the network traffic.
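
The majority-voting step that combines the local classifiers can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function name `majority_vote`, the label strings, and the tie-breaking rule (the label seen first wins) are assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def majority_vote(local_predictions):
    """Combine per-node predictions for one traffic record into a global label.

    local_predictions: list of labels, one per node (hypothetical input format).
    Ties are broken in favour of the label that appears first in the list;
    this rule is an assumption, not something stated in the abstract.
    """
    if not local_predictions:
        raise ValueError("need at least one local prediction")
    counts = Counter(local_predictions)
    # most_common(1) returns the label with the highest count; labels with
    # equal counts keep their first-seen order.
    return counts.most_common(1)[0][0]


# Three nodes classify the same record; two of them flag it as an attack.
global_label = majority_vote(["attack", "normal", "attack"])
```

Here `global_label` is `"attack"`, because two of the three nodes voted for it. Any node can run the same function once it has received its peers' predictions, which is consistent with the abstract's point that all nodes operate at the same level.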