Syllable-, Bigram-, and Morphology-Driven Pseudoword Generation in Greek
| Published in: | Applied Sciences vol. 15, no. 12 (2025), p. 6582 |
|---|---|
| Published: | MDPI AG |
| Abstract: | SyBig-r-Morph is a versatile pseudoword generator designed for Greek that can be easily modified to work with any language. It lets researchers produce phonotactically and morphologically well-formed pseudowords tailored to particular morphosyntactic categories, such as nouns or verbs, which overcomes a shortcoming of current multilingual generators. The tool is especially valuable for designing controlled linguistic experiments, including studies on stress assignment, lexical access, and morphophonological and lexical processing. Because it links orthographic representation to phonological realization, a key step in the text-to-speech pipeline, SyBig-r-Morph is also useful for psycholinguistic research, computational phonology, and speech synthesis applications that require linguistically authentic pseudoword stimuli. Pseudowords are essential in (psycho)linguistic research because they offer a way to study language without interference from meaning. Various methods for creating pseudowords exist, but each has limitations. Traditional approaches modify existing words, which risks unintended recognition of the source word. Modern algorithmic methods use high-frequency n-grams or syllable deconstruction but often require specialized expertise. Currently, no automatic pseudoword generation process is designed explicitly for Greek, which is our primary focus. We therefore developed SyBig-r-Morph, a novel application that constructs pseudowords using syllables as the main building block, replicating Greek phonotactic patterns. SyBig-r-Morph draws input from word lists and databases that include syllabification, word length, part of speech, and frequency information. It categorizes syllables by position to ensure phonotactic consistency with user-selected morphosyntactic categories and can optionally assign stress to generated words. Additionally, the tool uses multiple lexicons to eliminate phonologically invalid combinations. Its modular architecture allows easy adaptation to other languages. To further evaluate its output, we conducted a manual assessment using a tool that verifies phonotactic well-formedness based on phonological parameters derived from a corpus. Most SyBig-r-Morph words passed the stricter phonotactic criteria, confirming the tool’s sound design and linguistic adequacy. |
| ISSN: | 2076-3417 |
| DOI: | 10.3390/app15126582 |
| Source: | Publicly Available Content Database |
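
The abstract describes building pseudowords from syllables grouped by their position in the word, then filtering out unwanted candidates. The following is a minimal Python sketch of that general idea, not the SyBig-r-Morph implementation. The toy noun list, the function names (`strip_stress`, `build_pools`, `generate`), and the bigram check across syllable boundaries are all assumptions made for illustration. Stress assignment, frequency information, and the multiple lexicons mentioned in the abstract are not modeled.

```python
import random
import unicodedata
from collections import defaultdict

# Hypothetical toy input: syllabified Greek nouns. A real run would read a
# lexicon with syllabification, part of speech, and frequency columns.
NOUNS = ["θά-λασ-σα", "πα-τέ-ρας", "κα-ρέ-κλα", "μη-τέ-ρα", "πο-τά-μι"]


def strip_stress(text):
    """Remove the Greek stress mark (combining acute accent, U+0301)."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace("\u0301", ""))


def position_label(index, length):
    """Label a syllable as initial, medial, or final within its word."""
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"


def build_pools(syllabified_words):
    """Group unstressed syllables by position and collect attested letter bigrams."""
    pools = defaultdict(set)
    bigrams = set()
    for entry in syllabified_words:
        syllables = strip_stress(entry).split("-")
        word = "".join(syllables)
        bigrams.update(word[i:i + 2] for i in range(len(word) - 1))
        for i, syllable in enumerate(syllables):
            pools[position_label(i, len(syllables))].add(syllable)
    return pools, bigrams


def generate(pools, bigrams, lexicon, n_syllables, count, seed=0, max_tries=10_000):
    """Sample one syllable per position; keep candidates that are not in the
    lexicon and whose letter bigrams are all attested in the source list."""
    rng = random.Random(seed)
    positions = [position_label(i, n_syllables) for i in range(n_syllables)]
    results = []
    for _ in range(max_tries):
        if len(results) == count:
            break
        # Sort each pool so that a given seed always yields the same output.
        candidate = "".join(rng.choice(sorted(pools[p])) for p in positions)
        attested = all(candidate[i:i + 2] in bigrams for i in range(len(candidate) - 1))
        if attested and candidate not in lexicon and candidate not in results:
            results.append(candidate)
    return results


if __name__ == "__main__":
    pools, bigrams = build_pools(NOUNS)
    lexicon = {strip_stress(word.replace("-", "")) for word in NOUNS}
    print(generate(pools, bigrams, lexicon, n_syllables=3, count=5))
```

With a lexicon this small, some outputs may coincide with real Greek word forms that the toy list does not contain, which is why a realistic setup would filter against much larger word lists.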