Resolving Linguistic Asymmetry: Forging Symmetric Multilingual Embeddings Through Asymmetric Contrastive and Curriculum Learning

Bibliographic Details
Published in:	Symmetry vol. 17, no. 9 (2025), p. 1386-1407
Main Author:	Meng Lei
Other Authors:	Li, Yinlin; Wei, Wei; Yang Caipei
Published:	MDPI AG
Subjects:	Natural language; Language; Dictionaries; Curricula; Equivalence; Task complexity; Symmetry; Lexical choice; Forging; Connotation; Robustness; Asymmetry; Linguistics; Semantics; Large language models; Syntax; Syntactic structures; Instructional scaffolding; Multilingualism; Language modeling; Bilingualism; Embedding; Sentences; Experiments; Classification; Retrieval; Learning; Frame analysis; Languages
Online Access:	Citation/Abstract; Full Text + Graphics; Full Text - PDF

MARC


LEADER	00000nab a2200000uu 4500
001	3254649201
003	UK-CbPIL
022			\|a 2073-8994
024	7		\|a 10.3390/sym17091386 \|2 doi
035			\|a 3254649201
045	2		\|b d20250101 \|b d20251231
084			\|a 231635 \|2 nlm
100	1		\|a Meng Lei \|u College of Information Engineering, Xuchang University, Xuchang 461000, China; mengl@xcu.edu.cn
245	1		\|a Resolving Linguistic Asymmetry: Forging Symmetric Multilingual Embeddings Through Asymmetric Contrastive and Curriculum Learning
260			\|b MDPI AG \|c 2025
513			\|a Journal Article
520	3		\|a The pursuit of universal, symmetric semantic representations within large language models (LLMs) faces a fundamental challenge: the inherent asymmetry of natural languages. Different languages exhibit vast disparities in syntactic structures, lexical choices, and cultural nuances, making the creation of a truly shared, symmetric embedding space a non-trivial task. This paper aims to address this critical problem by introducing a novel framework to forge robust and symmetric multilingual sentence embeddings. Our approach, named DACL (Dynamic Asymmetric Contrastive Learning), is anchored in two powerful asymmetric learning paradigms: Contrastive Learning and Dynamic Curriculum Learning (DCL). We extend Contrastive Learning to the multilingual context, where it asymmetrically treats semantically equivalent sentences from different languages (positive pairs) and sentences with distinct meanings (negative pairs) to enforce semantic symmetry in the target embedding space. To further refine this process, we incorporate Dynamic Curriculum Learning, which introduces a second layer of asymmetry by dynamically scheduling training instances from easy to hard. This dual-asymmetric strategy enables the model to progressively master complex cross-lingual relationships, starting with more obvious semantic equivalences and advancing to subtler ones. Our comprehensive experiments on benchmark cross-lingual tasks, including sentence retrieval and cross-lingual classification (XNLI, PAWS-X, MLDoc, MARC), demonstrate that DACL significantly outperforms a wide range of established baselines. The results validate our dual-asymmetric framework as a highly effective approach for forging robust multilingual embeddings, particularly excelling in tasks involving complex linguistic asymmetries. Ultimately, this work contributes a novel dual-asymmetric learning framework that effectively leverages linguistic asymmetry to achieve robust semantic symmetry across languages. It offers valuable insights for developing more capable, fair, and interpretable multilingual LLMs, emphasizing that deliberately leveraging asymmetry in the learning process is a highly effective strategy.
653			\|a Natural language
653			\|a Language
653			\|a Dictionaries
653			\|a Curricula
653			\|a Equivalence
653			\|a Task complexity
653			\|a Symmetry
653			\|a Lexical choice
653			\|a Forging
653			\|a Connotation
653			\|a Robustness
653			\|a Asymmetry
653			\|a Linguistics
653			\|a Semantics
653			\|a Large language models
653			\|a Syntax
653			\|a Syntactic structures
653			\|a Instructional scaffolding
653			\|a Multilingualism
653			\|a Language modeling
653			\|a Bilingualism
653			\|a Embedding
653			\|a Sentences
653			\|a Experiments
653			\|a Classification
653			\|a Retrieval
653			\|a Learning
653			\|a Frame analysis
653			\|a Languages
700	1		\|a Li, Yinlin \|u State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, No. 95 Zhongguancun East Road, Beijing 100100, China
700	1		\|a Wei, Wei \|u School of Science and Electrical Engineering, Beihang University, Beijing 100190, China
700	1		\|a Yang Caipei \|u Wuhan Second Ship Design and Research Institute, Wuhan 430000, China; ycp03042025@163.com
773	0		\|t Symmetry \|g vol. 17, no. 9 (2025), p. 1386-1407
786	0		\|d ProQuest \|t Engineering Database
856	4	1	\|3 Citation/Abstract \|u https://www.proquest.com/docview/3254649201/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch
856	4	0	\|3 Full Text + Graphics \|u https://www.proquest.com/docview/3254649201/fulltextwithgraphics/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch
856	4	0	\|3 Full Text - PDF \|u https://www.proquest.com/docview/3254649201/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch