Automatic In-Place Text Detection and Translation in Video Games

Bibliographic Details
Published in: PQDT - Global (2025)
Main author: Shchegolkov, Mikhail
Published: ProQuest Dissertations & Theses
Online access: Citation/Abstract; Full Text - PDF
Description
Abstract: This project addresses the problem of translating in-game text in video games, where full localization is often lacking, especially for interface elements such as menus and system messages. The main goal was to develop a program that recognizes text on the screen during gameplay, translates it from English into the player's chosen language, and displays the translation directly in the game window. The program has several key components. Text detection uses CRAFT (Character Region Awareness for Text detection), which identifies individual character regions and the relationships between them to form words. Words are then combined into blocks based on the width of the letters. Text recognition uses MATRN, which applies a bidirectional enhancement strategy between visual and semantic features and performs robustly on irregular text shapes; it integrates multi-modal refinement modules and spatial-aware semantic encodings to capture complex text variations dynamically. Translation relies on free machine translation models from the HuggingFace platform, which avoids setting up paid APIs. Because of performance limitations, the translated text is drawn on a light background on top of the original text rather than cutting out the original text and replacing it. An important feature is the ability to pass user clicks on the translated text through to the game, so that menu elements remain interactive. Usability testing with four participants demonstrated the program's effectiveness: all tasks, such as translating game menus and interacting with the game settings, were completed successfully without prompts, confirming its practical usefulness. Although initial performance showed a translation time of about 30 seconds for a screen with 10 words, the prototype demonstrates a new end-to-end solution for real-time text translation in games that significantly increases accessibility for players.
ISBN:9798265489906
Source: ProQuest Dissertations & Theses Global
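
The abstract states that detected words are combined into blocks based on the width of the letters, but it does not give the exact rule. The following is a minimal sketch of one way such grouping could work, not the author's implementation. The `WordBox` structure, the `group_words` and `block_text` helpers, the vertical-overlap test, and the `gap_factor` threshold of 1.5 are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordBox:
    """Axis-aligned box of one detected word (hypothetical structure)."""
    text: str
    x: float  # left edge, pixels
    y: float  # top edge, pixels
    w: float  # width, pixels
    h: float  # height, pixels

    @property
    def char_width(self) -> float:
        # Average letter width, used to scale the allowed gap between words.
        return self.w / max(len(self.text), 1)


def same_row(a: WordBox, b: WordBox) -> bool:
    # Boxes share a row when they overlap vertically by more than half
    # the height of the shorter box.
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return overlap > 0.5 * min(a.h, b.h)


def group_words(words: List[WordBox], gap_factor: float = 1.5) -> List[List[WordBox]]:
    """Greedily merge words on the same row into blocks.

    A word joins a block when it sits on the same row as the block's last
    word and the horizontal gap is at most gap_factor times the mean letter
    width of the two words (an assumed threshold, not from the thesis).
    """
    blocks: List[List[WordBox]] = []
    # Scan left to right so each block always grows at its right end.
    for word in sorted(words, key=lambda b: b.x):
        for block in blocks:
            last = block[-1]
            gap = word.x - (last.x + last.w)
            limit = gap_factor * (last.char_width + word.char_width) / 2
            if same_row(last, word) and gap <= limit:
                block.append(word)
                break
        else:
            # No existing block accepted the word, so it starts a new one.
            blocks.append([word])
    return blocks


def block_text(block: List[WordBox]) -> str:
    # Join the words of one block into the string sent to translation.
    return " ".join(w.text for w in block)
```

Scaling the gap limit by letter width means large menu headings and small system messages can be grouped with the same parameter, since the tolerated spacing grows with the font size.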