A Machine Learning-Based Detection for Parameter Tampering Vulnerabilities in Web Applications Using BERT Embeddings

Guardado en:

Detalles Bibliográficos
Publicado en:	Symmetry vol. 17, no. 7 (2025), p. 985-1000
Autor principal:	Yun Sun Young
Otros Autores:	Nam-Wook, Cho
Publicado:	MDPI AG
Materias:	Forgery Machine learning Parameter identification Accuracy Datasets Security Artificial intelligence Applications programs Exploitation Open source software Fraud prevention Cybersecurity Automation Access control
Acceso en línea:	Citation/Abstract Full Text + Graphics Full Text - PDF
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

Descripción
Resumen:	The widespread adoption of web applications has led to a significant increase in the number of automated cyberattacks. Parameter tampering attacks pose a substantial security threat, enabling privilege escalation and unauthorized data exfiltration. Traditional pattern-based detection tools exhibit limited efficacy against such threats, as identical parameters may produce varying response patterns contingent on their processing context, including security filtering mechanisms. This study proposes a machine learning-based detection model to address these limitations by identifying parameter tampering vulnerabilities through a contextual analysis. The training dataset aggregates real-world vulnerability cases collected from web crawls, public vulnerability databases, and penetration testing reports. The Synthetic Minority Over-sampling Technique (SMOTE) was employed to address the data imbalance during training. Recall was adopted as the primary evaluation metric to prioritize the detection of true vulnerabilities. Comparative analysis showed that the XGBoost model demonstrated superior performance and was selected as the detection model. Validation was performed using web URLs with known parameter tampering vulnerabilities, achieving a detection rate of 73.3%, outperforming existing open-source automated tools. The proposed model enhances vulnerability detection by incorporating semantic representations of parameters and their values using BERT embeddings, enabling the system to learn contextual characteristics beyond the capabilities of pattern-based methods. These findings suggest the potential of the proposed method for scalable, efficient, and automated security diagnostics in large-scale web environments.
ISSN:	2073-8994
DOI:	10.3390/sym17070985
Fuente:	Engineering Database