Integrating Hybrid AI Approaches for Enhanced Translation in Minority Languages

Bibliographic Details
Published in: Applied Sciences vol. 15, no. 16 (2025), p. 9039-9055
Main Author: Chen-Chi, Chang
Other Authors: Yu-Hsun, Lin; Yun-Hsiang, Hsu; I-Hsin, Fan
Published: MDPI AG
Description
Abstract: This study presents a hybrid artificial intelligence model designed to improve translation quality for low-resource languages, specifically the Hakka language. The model integrates phrase-based machine translation (PBMT) and neural machine translation (NMT) within a recursive learning framework. The methodology consists of three stages: (1) initial translation using PBMT, where Hakka corpus data is structured into a parallel dataset; (2) NMT training with Transformers, using the generated parallel corpus to train deep learning models; and (3) recursive translation refinement, where iterative translations expand the training dataset and thereby improve model accuracy. Preprocessing techniques clean the dataset, reducing noise and improving sentence segmentation. A BLEU score evaluation compares PBMT and NMT across various corpus sizes, showing that PBMT performs well with limited data, while the Transformer-based NMT achieves better results as training data increases. The findings highlight the advantages of a hybrid approach in overcoming data scarcity for minority languages, and the research contributes a scalable framework for improving linguistic accessibility in under-resourced languages. The architecture also provides a blueprint for institutions that digitize, translate, or teach under-resourced languages. Because it adapts to multilingual inputs and preserves cultural expressions, it is suited to applications in education, community media, cultural preservation, and government-supported language revitalization initiatives.
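The three-stage loop described in the abstract can be illustrated with a minimal Python sketch. This is one reading of the abstract, not the authors' implementation: the function names (pbmt_translate, train_nmt, recursive_refine), the batch_size parameter, and the placeholder tokens are all hypothetical, and both translators are toy lexicon lookups standing in for a real PBMT system and a real Transformer.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def pbmt_translate(sentence: str, phrase_table: Dict[str, str]) -> str:
    # Toy stand-in for PBMT (assumption): token-by-token lookup,
    # unknown tokens pass through unchanged.
    return " ".join(phrase_table.get(tok, tok) for tok in sentence.split())


def train_nmt(corpus: List[Pair]) -> Callable[[str], str]:
    # Toy stand-in for Transformer training (assumption): learn a token
    # lexicon from pairs whose token counts match. A real system would
    # train a neural model on the parallel corpus here.
    lexicon: Dict[str, str] = {}
    for src, tgt in corpus:
        src_toks, tgt_toks = src.split(), tgt.split()
        if len(src_toks) == len(tgt_toks):
            lexicon.update(zip(src_toks, tgt_toks))
    return lambda s: " ".join(lexicon.get(tok, tok) for tok in s.split())


def recursive_refine(
    seed: List[str],
    monolingual: List[str],
    phrase_table: Dict[str, str],
    rounds: int,
    batch_size: int,
) -> Tuple[Callable[[str], str], List[Pair]]:
    # Stage 1: build an initial parallel corpus with the PBMT stand-in.
    corpus: List[Pair] = [(s, pbmt_translate(s, phrase_table)) for s in seed]
    # Stage 2: train the NMT stand-in on that corpus.
    model = train_nmt(corpus)
    # Stage 3: translate new sentences, add them to the corpus, retrain.
    for r in range(rounds):
        batch = monolingual[r * batch_size:(r + 1) * batch_size]
        if not batch:
            break
        corpus.extend((s, model(s)) for s in batch)
        model = train_nmt(corpus)
    return model, corpus


# Usage with placeholder tokens (not real Hakka data).
model, corpus = recursive_refine(
    seed=["a b"],
    monolingual=["b a", "a c"],
    phrase_table={"a": "x", "b": "y"},
    rounds=2,
    batch_size=1,
)
print(len(corpus), model("b a"))
```

In a real pipeline, each retraining round would likely filter the machine-generated pairs for quality before adding them to the corpus, since errors in early translations can otherwise be reinforced.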
ISSN:2076-3417
DOI:10.3390/app15169039
Source: Publicly Available Content Database