Resolving Linguistic Asymmetry: Forging Symmetric Multilingual Embeddings Through Asymmetric Contrastive and Curriculum Learning
| Published in: | Symmetry vol. 17, no. 9 (2025), p. 1386-1407 |
|---|---|
| Main author: | Meng Lei |
| Other authors: | Li, Yinlin; Wei, Wei; Yang Caipei |
| Published: | MDPI AG |
| Subjects: | |
| Online access: | Citation/Abstract, Full Text + Graphics, Full Text - PDF |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3254649201 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2073-8994 | ||
| 024 | 7 | |a 10.3390/sym17091386 |2 doi | |
| 035 | |a 3254649201 | ||
| 045 | 2 | |b d20250101 |b d20251231 | |
| 084 | |a 231635 |2 nlm | ||
| 100 | 1 | |a Meng Lei |u College of Information Engineering, Xuchang University, Xuchang 461000, China; mengl@xcu.edu.cn | |
| 245 | 1 | |a Resolving Linguistic Asymmetry: Forging Symmetric Multilingual Embeddings Through Asymmetric Contrastive and Curriculum Learning | |
| 260 | |b MDPI AG |c 2025 | ||
| 513 | |a Journal Article | ||
| 520 | 3 | |a The pursuit of universal, symmetric semantic representations within large language models (LLMs) faces a fundamental challenge: the inherent asymmetry of natural languages. Different languages exhibit vast disparities in syntactic structures, lexical choices, and cultural nuances, making the creation of a truly shared, symmetric embedding space a non-trivial task. This paper aims to address this critical problem by introducing a novel framework to forge robust and symmetric multilingual sentence embeddings. Our approach, named DACL (Dynamic Asymmetric Contrastive Learning), is anchored in two powerful asymmetric learning paradigms: Contrastive Learning and Dynamic Curriculum Learning (DCL). We extend Contrastive Learning to the multilingual context, where it asymmetrically treats semantically equivalent sentences from different languages (positive pairs) and sentences with distinct meanings (negative pairs) to enforce semantic symmetry in the target embedding space. To further refine this process, we incorporate Dynamic Curriculum Learning, which introduces a second layer of asymmetry by dynamically scheduling training instances from easy to hard. This dual-asymmetric strategy enables the model to progressively master complex cross-lingual relationships, starting with more obvious semantic equivalences and advancing to subtler ones. Our comprehensive experiments on benchmark cross-lingual tasks, including sentence retrieval and cross-lingual classification (XNLI, PAWS-X, MLDoc, MARC), demonstrate that DACL significantly outperforms a wide range of established baselines. The results validate our dual-asymmetric framework as a highly effective approach for forging robust multilingual embeddings, particularly excelling in tasks involving complex linguistic asymmetries. Ultimately, this work contributes a novel dual-asymmetric learning framework that effectively leverages linguistic asymmetry to achieve robust semantic symmetry across languages. It offers valuable insights for developing more capable, fair, and interpretable multilingual LLMs, emphasizing that deliberately leveraging asymmetry in the learning process is a highly effective strategy. | |
| 653 | |a Natural language | ||
| 653 | |a Language | ||
| 653 | |a Dictionaries | ||
| 653 | |a Curricula | ||
| 653 | |a Equivalence | ||
| 653 | |a Task complexity | ||
| 653 | |a Symmetry | ||
| 653 | |a Lexical choice | ||
| 653 | |a Forging | ||
| 653 | |a Connotation | ||
| 653 | |a Robustness | ||
| 653 | |a Asymmetry | ||
| 653 | |a Linguistics | ||
| 653 | |a Semantics | ||
| 653 | |a Large language models | ||
| 653 | |a Syntax | ||
| 653 | |a Syntactic structures | ||
| 653 | |a Instructional scaffolding | ||
| 653 | |a Multilingualism | ||
| 653 | |a Language modeling | ||
| 653 | |a Bilingualism | ||
| 653 | |a Embedding | ||
| 653 | |a Sentences | ||
| 653 | |a Experiments | ||
| 653 | |a Classification | ||
| 653 | |a Retrieval | ||
| 653 | |a Learning | ||
| 653 | |a Frame analysis | ||
| 653 | |a Languages | ||
| 700 | 1 | |a Li, Yinlin |u State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, No. 95 Zhongguancun East Road, Beijing 100100, China | |
| 700 | 1 | |a Wei, Wei |u School of Science and Electrical Engineering, Beihang University, Beijing 100190, China | |
| 700 | 1 | |a Yang Caipei |u Wuhan Second Ship Design and Research Institute, Wuhan 430000, China; ycp03042025@163.com | |
| 773 | 0 | |t Symmetry |g vol. 17, no. 9 (2025), p. 1386-1407 | |
| 786 | 0 | |d ProQuest |t Engineering Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3254649201/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text + Graphics |u https://www.proquest.com/docview/3254649201/fulltextwithgraphics/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3254649201/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch |
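
Illustrative note: the abstract in field 520 describes two ideas, a contrastive objective over translation pairs and an easy-to-hard training schedule, but this record does not give the loss or the schedule DACL actually uses. The sketch below is therefore only a generic illustration under stated assumptions: an InfoNCE-style loss with in-batch negatives stands in for the contrastive part, and a simple rule that keeps the currently easiest fraction of pairs stands in for the dynamic curriculum. The names `cosine`, `info_nce`, and `curriculum_subset`, the temperature of 0.1, and the 50%-to-100% schedule are all hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(src, tgt, temperature=0.1):
    """Mean InfoNCE-style loss over a batch (assumed stand-in objective).

    src[i] and tgt[i] are embeddings of a translation pair (the positive);
    every other tgt[j] in the batch serves as a negative for src[i].
    """
    losses = []
    for i, s in enumerate(src):
        logits = [cosine(s, t) / temperature for t in tgt]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


def curriculum_subset(src, tgt, progress):
    """Keep the easiest pairs first (assumed schedule, not the paper's).

    Difficulty is 1 - cosine of the positive pair under the current
    embeddings, so the ordering changes as the model trains. `progress`
    runs from 0.0 (start) to 1.0 (end); the kept fraction grows linearly
    from 50% to 100% of the batch.
    """
    order = sorted(range(len(src)), key=lambda i: 1.0 - cosine(src[i], tgt[i]))
    keep = max(1, math.ceil(len(order) * (0.5 + 0.5 * progress)))
    chosen = order[:keep]
    return [src[i] for i in chosen], [tgt[i] for i in chosen]


if __name__ == "__main__":
    # Toy 2-D "embeddings": four source sentences and their translations.
    src = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    tgt = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
    for progress in (0.0, 0.5, 1.0):
        s, t = curriculum_subset(src, tgt, progress)
        print(progress, len(s), round(info_nce(s, t), 4))
```

In a real training loop the loss would be computed on encoder outputs and backpropagated; here the two functions only show how the positive/negative treatment and the easy-to-hard selection could fit together.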