Syllable-, Bigram-, and Morphology-Driven Pseudoword Generation in Greek

Bibliographic Details
Published in: Applied Sciences vol. 15, no. 12 (2025), p. 6582
Main Author: Kosmidis Kosmas
Other Authors: Apostolouda Vassiliki, Revithiadou Anthi
Published: MDPI AG
Description
Abstract: SyBig-r-Morph is a versatile tool for generating pseudowords designed for Greek, but it can be easily modified to work with any language. By allowing researchers to produce phonotactically and morphologically well-formed pseudowords that are specifically tailored to particular morphosyntactic categories, such as nouns or verbs, it overcomes the shortcomings of current multilingual generators. This tool is especially valuable for designing controlled linguistic experiments, including studies on stress assignment, lexical access, and morphophonological and lexical processing. By serving as a link between orthographic representation and phonological realization, an important step in the text-to-speech pipeline, SyBig-r-Morph supports psycholinguistic research, computational phonology, and speech synthesis applications that require linguistically authentic pseudoword stimuli. Pseudowords are essential in (psycho)linguistic research, offering a way to study language without meaning interference. Various methods for creating pseudowords exist, but each has its limitations. Traditional approaches modify existing words, risking unintended recognition. Modern algorithmic methods use high-frequency n-grams or syllable deconstruction but often require specialized expertise. Currently, no automatic process for pseudoword generation is designed explicitly for Greek, which is our primary focus. Therefore, we developed SyBig-r-Morph, a novel application that constructs pseudowords using syllables as the main building block, replicating Greek phonotactic patterns. SyBig-r-Morph draws input from word lists and databases that include syllabification, word length, part of speech, and frequency information. It categorizes syllables by position to ensure phonotactic consistency with user-selected morphosyntactic categories and can optionally assign stress to generated words. Additionally, the tool uses multiple lexicons to eliminate phonologically invalid combinations. Its modular architecture allows easy adaptation to other languages. To further evaluate its output, we conducted a manual assessment using a tool that verifies phonotactic well-formedness based on phonological parameters derived from a corpus. Most SyBig-r-Morph words passed the stricter phonotactic criteria, confirming the tool’s sound design and linguistic adequacy.
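Illustrative note: the position-based syllable recombination described in the abstract can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation. The function names (`position_label`, `build_pools`, `generate`), the three-way initial/medial/final split, and the toy unaccented word list are all assumptions made for illustration. Part-of-speech filtering, frequency weighting, bigram checks, and stress assignment are omitted.

```python
import random
from collections import defaultdict

# Toy, unaccented, pre-syllabified input (hypothetical data, not from the paper).
TOY_WORDS = [
    ["νε", "ρο"],
    ["κα", "λος"],
    ["πα", "τε", "ρας"],
    ["μη", "τε", "ρα"],
]


def position_label(index, length):
    """Classify a syllable slot as initial, medial, or final (assumed scheme)."""
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"


def build_pools(syllabified_words):
    """Collect the distinct syllables observed in each position class."""
    pools = defaultdict(set)
    for syllables in syllabified_words:
        for i, syllable in enumerate(syllables):
            pools[position_label(i, len(syllables))].add(syllable)
    # Sorted lists keep sampling reproducible for a given seed.
    return {label: sorted(items) for label, items in pools.items()}


def generate(pools, n_syllables, lexicon, rng, max_tries=1000):
    """Draw one syllable per slot; reject candidates that are real words.

    Returns None if a needed pool is empty or no candidate survives.
    """
    if n_syllables < 2:
        raise ValueError("this sketch only handles words of 2+ syllables")
    labels = [position_label(i, n_syllables) for i in range(n_syllables)]
    if any(not pools.get(label) for label in labels):
        return None
    for _ in range(max_tries):
        candidate = "".join(rng.choice(pools[label]) for label in labels)
        if candidate not in lexicon:
            return candidate
    return None


pools = build_pools(TOY_WORDS)
lexicon = {"".join(word) for word in TOY_WORDS}
rng = random.Random(0)  # fixed seed so runs are repeatable
print(generate(pools, 2, lexicon, rng))
print(generate(pools, 3, lexicon, rng))
```

Keeping a separate pool per position means a syllable is only placed in a slot where it was attested in the input, which is one simple way to approximate the positional consistency the abstract describes. The lexicon check here only rejects existing words; the abstract's filtering of phonologically invalid combinations would need additional rules.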
ISSN:2076-3417
DOI:10.3390/app15126582
Source: Publicly Available Content Database