Bridging the Semantic Gap: An Ensemble Learning Framework With Textual Topic‐Raw Financial Feature Fusion to Enhance Fraud Detection in Chinese Markets

Bibliographic Details
Published in: Journal of Mathematics vol. 2025, no. 1 (2025)
Main Author: Wei, Congying
Other Authors: Qian, Xiyuan
Published:
John Wiley & Sons, Inc.
Subjects:
Online Access: Citation/Abstract
Full Text
Full Text - PDF
Description
Abstract: As financial statement manipulation grows more complex and class imbalance remains severe, conventional fraud detection approaches that rely exclusively on quantitative financial data and traditional machine learning algorithms have shown clear limitations. To overcome these constraints, we propose an enhanced financial fraud detection model that applies advanced ensemble learning classifiers to combined features: textual information extracted from annual reports through natural language processing techniques, and structured financial data from corporate statements. Using a dataset of Chinese manufacturing firms listed between 2010 and 2019, we integrate textual topic indicators derived from the latent Dirichlet allocation (LDA) model with raw financial items to construct a comprehensive fraud detection system. Empirical results demonstrate the superiority of the combined textual and financial indicators, which achieve significant improvements: AUC increases by 1.5% for RUSBoost and 1.6% for XGBoost, alongside NDCG@K gains of 4.5% and 3.8%, respectively (p < 0.01). Further evaluation using precision, recall, and F1‐score confirms the robustness and practical effectiveness of the proposed model under imbalanced class distributions.
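The abstract reports gains in NDCG@K, a ranking metric that rewards a model for placing true fraud cases near the top of its score-sorted list. As a minimal sketch only, the following Python function computes NDCG@K in its common binary-relevance form. The abstract does not state the paper's exact definition or its choice of K, so the function name `ndcg_at_k`, the gain formula, and the sample data are illustrative assumptions rather than the authors' implementation.

```python
import math


def ndcg_at_k(labels, scores, k):
    """NDCG@K with binary relevance (1 = fraud, 0 = non-fraud).

    Assumed textbook form: gain (2^rel - 1), discount log2(rank + 1).
    This is not taken from the paper itself.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Rank firms by predicted fraud score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_labels = [label for _, label in ranked[:k]]

    # Discounted cumulative gain of the model's top-k list.
    dcg = sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(top_labels))

    # Ideal DCG: all true fraud cases ranked first.
    ideal_labels = sorted(labels, reverse=True)[:k]
    idcg = sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(ideal_labels))

    # With no fraud cases at all, the metric is defined here as 0.
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: five firms, two of them fraudulent.
example_labels = [1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.1, 0.7, 0.2]
example_value = ndcg_at_k(example_labels, example_scores, k=2)
```

Under heavy class imbalance, a metric of this kind can be more informative than accuracy because it looks only at the highest-ranked firms, which are the ones an investigator would examine first.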
ISSN: 2314-4629
2314-4785
DOI: 10.1155/jom/6643152
Source: Engineering Database