In-Context Learning for Low-Resource Machine Translation: A Study on Tarifit with Large Language Models
Guardado en:
| Publicado en: | Algorithms vol. 18, no. 8 (2025), p. 489-508 |
|---|---|
| Autor principal: | |
| Otros Autores: | |
| Publicado: |
MDPI AG
|
| Materias: | |
| Acceso en línea: | Citation/Abstract Full Text + Graphics Full Text - PDF |
| Etiquetas: |
Sin Etiquetas, Sea el primero en etiquetar este registro!
|
| Resumen: | This study presents the first systematic evaluation of in-context learning for Tarifit machine translation, a low-resource Amazigh language spoken by 5 million people in Morocco and Europe. We assess three large language models (GPT-4, Claude-3.5, PaLM-2) across Tarifit–Arabic, Tarifit–French, and Tarifit–English translation using 1000 sentence pairs and 5-fold cross-validation. Results show that 8-shot similarity-based demonstration selection achieves optimal performance. GPT-4 achieved 20.2 BLEU for Tarifit–Arabic, 14.8 for Tarifit–French, and 10.9 for Tarifit–English. Linguistic proximity significantly impacts translation quality, with Tarifit–Arabic substantially outperforming other language pairs by 8.4 BLEU points due to shared vocabulary and morphological patterns. Error analysis reveals systematic issues with morphological complexity (42% of errors) and cultural terminology preservation (18% of errors). This work establishes baseline benchmarks for Tarifit translation and demonstrates the viability of in-context learning for morphologically complex low-resource languages, contributing to linguistic equity in AI systems. |
|---|---|
| ISSN: | 1999-4893 |
| DOI: | 10.3390/a18080489 |
| Fuente: | Engineering Database |