Machine learning-based cyber threat detection: an approach to malware detection and security with explainable AI insights

Bibliographic details
Published in: Human-Intelligent Systems Integration vol. 6, no. 1 (Dec 2024), p. 61
Published:
Springer Nature B.V.
Description
Abstract: The growing prevalence of malware presents significant risks to the security and integrity of computer networks and devices. Malicious software can disrupt operations, compromise sensitive data, and undermine critical processes. Countering these threats requires detection systems that identify and mitigate emerging risks proactively. One promising approach applies Machine Learning (ML) techniques, which allow systems to detect patterns and make informed predictions. In this paper, we examine the effectiveness of ML in cyber threat detection, focusing on the classification of malicious and benign entities within digital ecosystems. We tested four ML algorithms: Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbors (KNN), and Random Forest (RF). The dataset, sourced from Kaggle, was pre-processed to support accurate classification of malware and benign samples. We used k-fold cross-validation to split the dataset and manually tuned hyper-parameters to refine model performance, reducing bias and variance. The models differed in performance, with RF performing best at an accuracy of 100%. To make the RF model's predictions more interpretable, we employed Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) as Explainable AI (XAI) techniques. These techniques identify static_prio, nivcsw, vm_truncate_count, shared_vm, and millisecond as the most significant features, which supports validating and trusting ML-based cybersecurity solutions. The interaction-aware technique, Partial Dependence Plot (PDP), is used in LIME to demonstrate the impact of individual features on model predictions. We assessed LIME and SHAP, applying optimization techniques to minimize performance overhead. This research also incorporated opinions from cybersecurity experts and employed the Chi-Squared test to validate the XAI explanations. These results reinforce the importance of ML in strengthening cyber defense systems against malware and other threats. Ultimately, our research aims to strengthen computer network resilience and protect digital assets, preserving the integrity and security of digital ecosystems amid evolving cyber threats.
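The k-fold cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It uses only the Python standard library, a hand-written KNN classifier, and synthetic two-class data standing in for the Kaggle dataset. The function names (`make_synthetic_data`, `knn_predict`, `k_fold_accuracy`), the fold count, and the value of `k` are all assumptions chosen for illustration.

```python
import math
import random


def make_synthetic_data(n_per_class=60, n_features=4, seed=0):
    """Two Gaussian blobs standing in for benign (0) and malware (1) samples.

    Assumption: this synthetic data only illustrates the procedure; it is
    unrelated to the dataset described in the abstract.
    """
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(center, 1.0) for _ in range(n_features)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes_for_malware = sum(label for _, label in nearest)
    return 1 if votes_for_malware * 2 > k else 0


def k_fold_accuracy(data, n_folds=5, k=5):
    """Return the per-fold accuracy of KNN under k-fold cross-validation."""
    scores = []
    for fold in range(n_folds):
        # Every n_folds-th sample (offset by the fold index) is held out.
        test = data[fold::n_folds]
        train = [item for i, item in enumerate(data) if i % n_folds != fold]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return scores


data = make_synthetic_data()
scores = k_fold_accuracy(data)
print("per-fold accuracy:", [round(s, 3) for s in scores])
print("mean accuracy:", round(sum(scores) / len(scores), 3))
```

Each sample is held out exactly once, so the mean of the per-fold scores estimates performance on unseen data. Comparing several classifiers, as the abstract describes, would amount to running the same loop with a different prediction function for each.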
ISSN: 2524-4876, 2524-4884
DOI: 10.1007/s42454-024-00055-7
Source: Advanced Technologies & Aerospace Database