Intrusion detection based on concept drift detection and online incremental learning

Bibliographic details
Published in:	International Journal of Pervasive Computing and Communications vol. 21, no. 1 (2025), p. 81-115
Main author:	Jemili, Farah
Other authors:	Jouini, Khaled; Korbaa, Ouajdi
Published:	Emerald Group Publishing Limited
Subjects:	Accuracy; Deep learning; System reliability; Performance evaluation; Classification; Threats; Adaptability; Artificial neural networks; Fault tolerance; Taxonomy; Availability; Batch processing; Machine learning; Distributed processing; Efficiency; Intrusion detection systems; Nonstationary environments; Adaptive algorithms; Design analysis; Datasets; Adaptive systems; Performance measurement; Methodology; Network reliability; IP (Internet Protocol); Neural networks; Empowerment; Effectiveness; Industrial Internet of Things; Methods; Algorithms; Resource utilization; Drift; Cybersecurity; Cloud computing
Online access:	Citation/Abstract; Full Text; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	3150563109
003	UK-CbPIL
022			\|a 1742-7371
022			\|a 1742-738X
024	7		\|a 10.1108/IJPCC-12-2023-0358 \|2 doi
035			\|a 3150563109
045	2		\|b d20250101 \|b d20250331
084			\|a 164430 \|2 nlm
100	1		\|a Jemili, Farah \|u ISITCom, Mars Research Laboratory, LR17ES05, University of Sousse, Sousse, Tunisia
245	1		\|a Intrusion detection based on concept drift detection and online incremental learning
260			\|b Emerald Group Publishing Limited \|c 2025
513			\|a Journal Article
520	3		\|a Purpose: The primary purpose of this paper is to introduce the drift detection method-online random forest (DDM-ORF) model for intrusion detection, combining DDM for detecting concept drift and ORF for incremental learning. The paper addresses the challenges of dynamic and nonstationary data, offering a solution that continuously adapts to changes in the data distribution. The goal is to provide effective intrusion detection in real-world scenarios, demonstrated through comprehensive experiments and evaluations using Apache Spark.
Design/methodology/approach: The paper uses an experimental approach to evaluate the DDM-ORF model. The design involves assessing classification performance metrics, including accuracy, precision, recall and F-measure. The methodology integrates Apache Spark for distributed computing, using metrics such as processed records per second and input rows per second. The evaluation extends to the analysis of IP addresses, ports and taxonomies in the MAWILab data set. This comprehensive design and methodology showcase the model’s effectiveness in detecting intrusions through concept drift detection and online incremental learning on large-scale, heterogeneous data.
Findings: The paper’s findings reveal that the DDM-ORF model achieves outstanding classification results with 99.96% accuracy, demonstrating its efficacy in intrusion detection. Comparative analysis against a convolutional neural network-based model indicates superior performance in anomalous and suspicious detection rates. The exploration of IP addresses, ports and taxonomies uncovers valuable insights into attack patterns. Apache Spark evaluation attests to the system’s high processing rates. The study emphasizes the scalability, availability and fault tolerance of DDM-ORF, making it suitable for real-world scenarios. Overall, the paper establishes the model’s proficiency in handling dynamic, nonstationary data for intrusion detection.
Research limitations/implications: The research acknowledges certain limitations, including the potential challenge of DDM detecting only frequency changes in class labels and not complex concept drifts. The incremental random forest’s reliance on memory may pose constraints as the forest size increases, potentially leading to overfitting. Addressing these limitations could involve exploring alternative concept drift detection algorithms and implementing ensemble pruning techniques for memory efficiency. Further research avenues may investigate algorithms balancing accuracy and memory usage, such as compressed random forests, to enhance the model’s effectiveness in evolving data environments.
Practical implications: The study’s practical implications are noteworthy. The proposed DDM-ORF model, designed for intrusion detection through concept drift detection and online incremental learning, offers a scalable, available and fault-tolerant solution. Leveraging Apache Spark and Microsoft Azure Cloud enhances processing capabilities for large data sets in dynamic, nonstationary scenarios. The model’s applicability to heterogeneous data sets and its achievement of high-accuracy multi-class classification make it suitable for real-world intrusion detection. Moreover, the auto-scaling features of Microsoft Azure Cloud contribute to adaptability, ensuring efficient resource utilization without downtime. These practical implications underscore the model’s relevance and effectiveness in diverse operational contexts.
Social implications: The DDM-ORF model’s social implications are significant, contributing to enhanced cybersecurity measures. By providing an effective intrusion detection system, it helps safeguard digital ecosystems, preserving user privacy and securing sensitive information. The model’s accuracy in identifying and classifying various intrusion attempts aids in mitigating potential cyber threats, thereby fostering a safer online environment for individuals and organizations. As cybersecurity is paramount in the digital age, the social impact lies in fortifying the resilience of networks, systems and data against malicious activities, ultimately promoting trust and reliability in online interactions.
Originality/value: The DDM-ORF model introduces a novel approach to intrusion detection by combining drift detection and online incremental learning. This originality lies in its utilization of the DDM-ORF algorithm, offering a dynamic and adaptive system for evolving data. The model’s contribution extends to its scalability, fault-tolerance and suitability for heterogeneous data sets, addressing challenges in dynamic, nonstationary environments. Its application on a large-scale data set and multi-class classification, along with integration with Apache Spark and Microsoft Azure Cloud, enhances the field’s understanding and application of intrusion detection, providing valuable insights for securing digital infrastructures.
653			\|a Accuracy
653			\|a Deep learning
653			\|a System reliability
653			\|a Performance evaluation
653			\|a Classification
653			\|a Threats
653			\|a Adaptability
653			\|a Artificial neural networks
653			\|a Fault tolerance
653			\|a Taxonomy
653			\|a Availability
653			\|a Batch processing
653			\|a Machine learning
653			\|a Distributed processing
653			\|a Efficiency
653			\|a Intrusion detection systems
653			\|a Nonstationary environments
653			\|a Adaptive algorithms
653			\|a Design analysis
653			\|a Datasets
653			\|a Adaptive systems
653			\|a Performance measurement
653			\|a Methodology
653			\|a Network reliability
653			\|a IP (Internet Protocol)
653			\|a Neural networks
653			\|a Empowerment
653			\|a Effectiveness
653			\|a Industrial Internet of Things
653			\|a Methods
653			\|a Algorithms
653			\|a Resource utilization
653			\|a Drift
653			\|a Cybersecurity
653			\|a Cloud computing
700	1		\|a Jouini, Khaled \|u ISITCom, Mars Research Laboratory, LR17ES05, University of Sousse, Sousse, Tunisia
700	1		\|a Korbaa, Ouajdi \|u ISITCom, Mars Research Laboratory, LR17ES05, University of Sousse, Sousse, Tunisia
773	0		\|t International Journal of Pervasive Computing and Communications \|g vol. 21, no. 1 (2025), p. 81-115
786	0		\|d ProQuest \|t Advanced Technologies & Aerospace Database
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/3150563109/abstract/embedded/J7RWLIQ9I3C9JK51?source=fedsrch
856	4	0	\|3 Full Text \|u https://www.proquest.com/docview/3150563109/fulltext/embedded/J7RWLIQ9I3C9JK51?source=fedsrch
856	4	0	\|3 Full Text - PDF \|u https://www.proquest.com/docview/3150563109/fulltextPDF/embedded/J7RWLIQ9I3C9JK51?source=fedsrch