Integrating Hybrid AI Approaches for Enhanced Translation in Minority Languages
| Published in: | Applied Sciences vol. 15, no. 16 (2025), p. 9039-9055 |
|---|---|
| Main author: | |
| Other authors: | , , |
| Published: | MDPI AG |
| Subjects: | |
| Online access: | Citation/Abstract Full Text + Graphics Full Text - PDF |
| Abstract: | The proposed hybrid AI-driven translation system integrates phrase-based machine translation (PBMT) and neural machine translation (NMT) within a recursive learning framework. It provides a blueprint for institutions that digitize, translate, or teach under-resourced languages. Because it adapts to multilingual inputs and preserves cultural expressions, it is well suited to applications in education, community media, cultural preservation, and government-supported language revitalization initiatives. This study presents a hybrid artificial intelligence model designed to enhance translation quality for low-resource languages, specifically targeting the Hakka language. The methodology consists of three key stages: (1) initial translation using PBMT, which structures Hakka corpus data into a parallel dataset; (2) NMT training with Transformers, which uses the generated parallel corpus to train deep learning models; and (3) recursive translation refinement, in which iterative translations expand the training dataset and further improve model accuracy. Preprocessing techniques clean and optimize the dataset, reducing noise and improving sentence segmentation. A BLEU score evaluation compares the effectiveness of PBMT and NMT across various corpus sizes, demonstrating that while PBMT performs well with limited data, the Transformer-based NMT achieves superior results as training data increases. The findings highlight the advantages of a hybrid approach in overcoming data scarcity challenges for minority languages. This research contributes to machine translation methodologies by proposing a scalable framework for improving linguistic accessibility in under-resourced languages. |
| ISSN: | 2076-3417 |
| DOI: | 10.3390/app15169039 |
| Source: | Publicly Available Content Database |
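
The abstract compares PBMT and NMT output by BLEU score. As an illustration of what that metric measures, the following is a minimal sketch of corpus-level BLEU, assuming one reference per hypothesis, pre-tokenized input, n-grams up to order 4, and no smoothing. The function names `ngrams` and `corpus_bleu` and the English toy sentences are assumptions made for this example; the record does not state which BLEU implementation or tokenization the authors used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per hypothesis, no smoothing.

    Illustrative only: not the evaluation code used in the article.
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Counter intersection clips each count by the reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output shorter than the reference.
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


# Toy data (hypothetical): one reference and the output of two systems.
reference = [["the", "cat", "sat", "on", "the", "mat"]]
system_a = [["the", "cat", "sat", "on", "the", "mat"]]
system_b = [["the", "cat", "sat", "on", "a", "mat"]]

print("system A BLEU:", corpus_bleu(system_a, reference))
print("system B BLEU:", corpus_bleu(system_b, reference))
```

Scoring two systems against the same references at each corpus size, as in the toy comparison above, is the general shape of the evaluation the abstract describes. In practice an established implementation with standardized tokenization is usually preferable to a hand-written one.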