Automatic In-Place Text Detection and Translation in Video Games

Bibliographic details
Published in: PQDT - Global (2025)
Main author: Shchegolkov, Mikhail
Published:
ProQuest Dissertations & Theses
Subjects:
Online access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3283380150
003 UK-CbPIL
020 |a 9798265489906 
035 |a 3283380150 
045 2 |b d20250101  |b d20251231 
084 |a 189128  |2 nlm 
100 1 |a Shchegolkov, Mikhail 
245 1 |a Automatic In-Place Text Detection and Translation in Video Games 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a This project addresses the problem of translating in-game text in video games, where full localization is often lacking, especially for interface elements such as menus and system messages. The main goal was to develop a program that recognizes text on the screen during gameplay, translates it from English to the player's chosen language, and displays the translation directly in the game window. The program has several key components. Text detection uses the CRAFT (Character Region Awareness for Text detection) method, which identifies individual character regions and the relationships between them to form words. Words are then combined into blocks based on the width of the letters. Text recognition uses MATRN, which applies a bi-directional enhancement strategy between visual and semantic features and offers robust performance on irregular text shapes. MATRN integrates multi-modal refinement modules and spatial-aware semantic encodings to dynamically capture complex text variations. Translation uses free machine translation models from the HuggingFace platform, which avoids the need to set up paid APIs. Due to performance limitations, the translated text is displayed on top of the original text on a light background, rather than by cutting out the original text and replacing it. An important feature is the ability to pass user clicks on the translated text through to the game, allowing interaction with menu elements. Usability testing with four participants demonstrated the effectiveness of the program: all tasks, such as translating game menus and interacting with the game settings, were completed successfully without prompts, confirming its practical usefulness. Although the initial prototype took about 30 seconds to translate a screen with 10 words, it demonstrates a new end-to-end solution for real-time text translation in games, significantly increasing accessibility for players. 
653 |a User interface 
653 |a Accuracy 
653 |a Computer & video games 
653 |a Datasets 
653 |a Multilingual systems 
653 |a Neural networks 
653 |a Diffusion models 
653 |a Access to information 
653 |a Methods 
653 |a Machine translation 
653 |a Literature reviews 
653 |a Text editing 
653 |a Libraries 
653 |a Usability testing 
653 |a Python 
653 |a Research & development--R&D 
653 |a Chatbots 
653 |a Semantics 
653 |a Artificial intelligence 
773 0 |t PQDT - Global  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3283380150/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3283380150/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch
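
The abstract (field 520) states that detected words are combined into blocks based on the width of the letters, but this record does not give the exact rule. The following is a minimal sketch of one plausible reading, not the author's implementation: two words on the same line are joined when the horizontal gap between them is small relative to the estimated letter width. The names `WordBox`, `same_line`, `group_into_blocks`, the `n_chars` field, and the `gap_factor` threshold are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordBox:
    """Axis-aligned bounding box of one detected word (hypothetical structure)."""
    x0: float
    y0: float
    x1: float
    y1: float
    n_chars: int  # assumption: number of character regions found in the word

    @property
    def letter_width(self) -> float:
        # Average width of one letter, estimated from the box width.
        return (self.x1 - self.x0) / max(self.n_chars, 1)


def same_line(a: WordBox, b: WordBox) -> bool:
    """True if the boxes overlap vertically by more than half the shorter height."""
    overlap = min(a.y1, b.y1) - max(a.y0, b.y0)
    return overlap > 0.5 * min(a.y1 - a.y0, b.y1 - b.y0)


def group_into_blocks(words: List[WordBox], gap_factor: float = 1.5) -> List[List[WordBox]]:
    """Greedily merge words into left-to-right blocks.

    A word joins an existing block when it sits on the same line as the
    block's last word and the gap between them is at most `gap_factor`
    letter widths (assumed threshold, not taken from the source).
    """
    blocks: List[List[WordBox]] = []
    for word in sorted(words, key=lambda w: w.x0):
        for block in blocks:
            last = block[-1]
            gap = word.x0 - last.x1
            limit = gap_factor * max(last.letter_width, word.letter_width)
            if same_line(last, word) and gap <= limit:
                block.append(word)
                break
        else:
            # No suitable block found: start a new one.
            blocks.append([word])
    return blocks
```

With this rule, two adjacent menu words such as "New" and "Game" would fall into one block, while a distant label on the same line or a label on another line would start its own block. Scaling the threshold by letter width, rather than using a fixed pixel distance, keeps the grouping consistent across font sizes.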