Optimising Contextual Embeddings for Meaning Conflation Deficiency Resolution in Low-Resourced Languages

Bibliographic Details
Published in:	Computers vol. 14, no. 9 (2025), p. 402-431
Main author:	Masethe, Mosima A
Other authors:	Ojo, Sunday O; Masethe, Hlaudi D
Published:	MDPI AG
Subjects:	Sparsity; Language; Recall; Accuracy; Semantics; Word sense disambiguation; Case studies; Natural language processing; Optimization; Adaptation; Morphological complexity; Annotations; Polysemy; Morphology; Sotho languages; Meaning; Termination; Ambiguity; Languages
Online access:	Citation/Abstract; Full Text + Graphics; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	3254483396
003	UK-CbPIL
022			|a 2073-431X
024	7		|a 10.3390/computers14090402 |2 doi
035			|a 3254483396
045	2		|b d20250101 |b d20251231
084			|a 231447 |2 nlm
100	1		|a Masethe, Mosima A |u Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, Durban 4001, South Africa
245	1		|a Optimising Contextual Embeddings for Meaning Conflation Deficiency Resolution in Low-Resourced Languages
260			|b MDPI AG |c 2025
513			|a Journal Article
520	3		|a Meaning conflation deficiency (MCD) presents a continual obstacle in natural language processing (NLP), especially for low-resourced and morphologically complex languages, where polysemy and contextual ambiguity diminish model precision in word sense disambiguation (WSD) tasks. This paper examines the optimisation of contextual embedding models, namely XLNet, ELMo, BART, and their improved variations, to tackle MCD in linguistic settings. Utilising Sesotho sa Leboa as a case study, researchers devised an enhanced XLNet architecture with specific hyperparameter optimisation, dynamic padding, early termination, and class-balanced training. Comparative assessments reveal that the optimised XLNet attains an accuracy of 91% and exhibits balanced precision–recall metrics of 92% and 91%, respectively, surpassing both its baseline counterpart and competing models. Optimised ELMo attained the greatest overall metrics (accuracy: 92%, F1-score: 96%), whilst optimised BART demonstrated significant accuracy improvements (96%) despite a reduced recall. The results demonstrate that fine-tuning contextual embeddings using MCD-specific methodologies significantly improves semantic disambiguation for under-represented languages. This study offers a scalable and flexible optimisation approach suitable for additional low-resource language contexts.
653			|a Sparsity
653			|a Language
653			|a Recall
653			|a Accuracy
653			|a Semantics
653			|a Word sense disambiguation
653			|a Case studies
653			|a Natural language processing
653			|a Optimization
653			|a Adaptation
653			|a Morphological complexity
653			|a Annotations
653			|a Polysemy
653			|a Morphology
653			|a Sotho languages
653			|a Meaning
653			|a Termination
653			|a Ambiguity
653			|a Languages
700	1		|a Ojo, Sunday O |u Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, Durban 4001, South Africa
700	1		|a Masethe, Hlaudi D |u Department of Data Science, Faculty of Information Communication Technology, Tshwane University of Technology, Pretoria 0001, South Africa
773	0		|t Computers |g vol. 14, no. 9 (2025), p. 402-431
786	0		|d ProQuest |t Advanced Technologies & Aerospace Database
856	4	1	|3 Citation/Abstract |u https://www.proquest.com/docview/3254483396/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch
856	4	0	|3 Full Text + Graphics |u https://www.proquest.com/docview/3254483396/fulltextwithgraphics/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch
856	4	0	|3 Full Text - PDF |u https://www.proquest.com/docview/3254483396/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch