Fine-Tuning Arabic and Multilingual BERT Models for Crime Classification to Support Law Enforcement and Crime Prevention

Bibliographic Details
Published in: International Journal of Advanced Computer Science and Applications vol. 16, no. 5 (2025)
Main Author: PDF
Publisher:
Science and Information (SAI) Organization Limited
Subjects:
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3222641132
003 UK-CbPIL
022 |a 2158-107X 
022 |a 2156-5570 
024 7 |a 10.14569/IJACSA.2025.0160544  |2 doi 
035 |a 3222641132 
045 2 |b d20250101  |b d20251231 
100 1 |a PDF 
245 1 |a Fine-Tuning Arabic and Multilingual BERT Models for Crime Classification to Support Law Enforcement and Crime Prevention 
260 |b Science and Information (SAI) Organization Limited  |c 2025 
513 |a Journal Article 
520 3 |a Safety and security are essential to social stability, since their absence disrupts economic, social, and political structures and undermines basic human needs. A secure environment promotes development, social cohesion, and well-being, making it crucial to national resilience and advancement. Law enforcement agencies struggle with rising crime, growing population density, and rapidly evolving technology, and analyzing the resulting data demands considerable time and effort. This study employs AI to classify Arabic text in order to detect criminal activity. Recent transformer models, such as Bidirectional Encoder Representations from Transformers (BERT), have shown promise in NLP applications, including text classification, and applying them to crime prevention can yield significant insights. They are effective because of their architecture, especially their capacity to attend to both left and right context after pre-training on massive corpora. The primary limitations of previous work are the small number of crime-domain studies that employ the BERT transformer and the scarcity of Arabic crime datasets. This study therefore builds its own dataset from X (formerly Twitter). The tweets are pre-processed, class imbalance is addressed, and BERT-based models are fine-tuned, using six Arabic BERT models and three multilingual models, to classify criminal tweets and identify the best-performing variant. Findings demonstrate that Arabic models are more effective than multilingual models: MARBERT, the best Arabic model, surpasses the outcomes of previous studies with an accuracy and F1-score of 93%, while mBERT, the best multilingual model, reaches an accuracy and F1-score of 89%. This underscores the efficacy of MARBERT in classifying Arabic criminal text and illustrates its potential to support crime prevention and national security. 
653 |a Accuracy 
653 |a Datasets 
653 |a Classification 
653 |a Population density 
653 |a Crime 
653 |a Law enforcement 
653 |a Effectiveness 
653 |a Arabic language 
653 |a Text categorization 
653 |a Computer science 
653 |a Artificial intelligence 
653 |a Sentiment analysis 
653 |a Social networks 
653 |a Natural language processing 
653 |a Multilingualism 
653 |a Crime prevention 
653 |a Large language models 
773 0 |t International Journal of Advanced Computer Science and Applications  |g vol. 16, no. 5 (2025) 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3222641132/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3222641132/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch
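
The pipeline the abstract (field 520) describes -- pre-process the tweets, address class imbalance, fine-tune a BERT variant for classification -- maps onto the standard Hugging Face fine-tuning recipe. Below is a minimal sketch of that recipe, assuming the transformers, datasets, and torch libraries and the public UBC-NLP/MARBERT checkpoint; the file names, label set, class weights, and hyperparameters are illustrative assumptions, not the paper's actual configuration.

# A minimal fine-tuning sketch, assuming the Hugging Face `transformers`,
# `datasets`, and `torch` libraries and the public UBC-NLP/MARBERT checkpoint.
# The CSV file names, label set, class weights, and hyperparameters are
# illustrative placeholders, not the paper's actual configuration.
import torch
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "UBC-NLP/MARBERT"  # best-performing Arabic model per the abstract
NUM_LABELS = 2                  # hypothetical label set: crime vs. non-crime

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_LABELS)

# Hypothetical CSVs with a "text" column (tweet) and an integer "label" column.
dataset = load_dataset("csv", data_files={"train": "train.csv",
                                          "test": "test.csv"})

def tokenize(batch):
    # Truncate/pad tweets to a fixed length so batches collate cleanly.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

class WeightedTrainer(Trainer):
    """Counter class imbalance by weighting the loss inversely to class
    frequency. This is one common technique; the abstract does not specify
    which method the authors used."""
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        weights = torch.tensor([1.0, 3.0],  # illustrative, not from the paper
                               device=outputs.logits.device)
        loss = torch.nn.functional.cross_entropy(
            outputs.logits, labels, weight=weights)
        return (loss, outputs) if return_outputs else loss

args = TrainingArguments(output_dir="marbert-crime",
                         num_train_epochs=3,
                         per_device_train_batch_size=16,
                         learning_rate=2e-5)

trainer = WeightedTrainer(model=model, args=args,
                          train_dataset=dataset["train"],
                          eval_dataset=dataset["test"])
trainer.train()

Class weighting is only one way to address imbalance; oversampling and data augmentation are common alternatives, and the abstract does not say which technique the study applied. Swapping MODEL_NAME for bert-base-multilingual-cased would reproduce the mBERT comparison the abstract reports.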