CLFF-NER: A Cross-Lingual Feature Fusion Model for Named Entity Recognition in the Traditional Chinese Festival Culture Domain

Bibliographic Details
Published in: Informatics vol. 12, no. 4 (2025), p. 136-154
Main author: Yang Shenghe
Other authors: He, Kun; Li, Wei; He, Yingying
Published:
MDPI AG
Description
Abstract: With the rapid development of information technology, demand is growing for the digital preservation of traditional festival culture and for the extraction of related knowledge. However, existing research on Named Entity Recognition (NER) for traditional Chinese festival culture lacks high-quality corpora and dedicated models. To address this gap, this study proposes CLFF-NER, an NER model that integrates multi-source heterogeneous information. The model operates as follows. First, Multilingual BERT produces contextual semantic representations of the Chinese and English sentences. Next, a Multiconvolutional Kernel Network (MKN) extracts the local structural features of entities. A Transformer module then fuses Chinese and English semantics through cross-lingual cross-attention. A Graph Neural Network (GNN) selectively supplements useful English information, which reduces the interference caused by redundant information. Finally, a gating mechanism and a Conditional Random Field (CRF) jointly optimize the recognition results. In experiments on the public Chinese Festival Culture Dataset (CTFCDataSet), the model achieved 89.45% precision, 90.01% recall, and an F1 score of 89.73%, significantly outperforming a range of mainstream baseline models. The model also performed competitively on two other public datasets, Resume and Weibo, which indicates strong cross-domain generalization ability.
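The three CTFCDataSet figures in the abstract can be cross-checked against each other. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not state the averaging scheme, and the function name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract for CTFCDataSet, in percent.
precision, recall, reported_f1 = 89.45, 90.01, 89.73

# Assumption: F1 is the plain harmonic mean of the reported P and R.
computed_f1 = f1_score(precision, recall)
print(f"computed F1 = {computed_f1:.2f}, reported F1 = {reported_f1:.2f}")
# prints: computed F1 = 89.73, reported F1 = 89.73
```

Under that assumption, the reported precision and recall reproduce the reported F1 score to two decimal places.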
ISSN: 2227-9709
DOI: 10.3390/informatics12040136
Source: Advanced Technologies & Aerospace Database