Enhancing Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance

Bibliographic details
Published in: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Conference Proceedings (2025), p. 1-5
Author: Amooie, Reihaneh
Other authors: De Vries, Wietse; Yun Hao; Dijkstra, Jelske; Coler, Matt; Wieling, Martijn
Published:
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
Online access: Citation/Abstract

MARC

LEADER 00000nab a2200000uu 4500
001 3268870229
003 UK-CbPIL
024 7 |a 10.1109/ICASSP49660.2025.10889692  |2 doi 
035 |a 3268870229 
045 2 |b d20250101  |b d20251231 
084 |a 228229  |2 nlm 
100 1 |a Amooie, Reihaneh  |u Center for Language and Cognition University of Groningen,Groningen,The Netherlands 
245 1 |a Enhancing Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance 
260 |b The Institute of Electrical and Electronics Engineers, Inc. (IEEE)  |c 2025 
513 |a Conference Proceedings 
520 3 |a Conference Title: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Conference Start Date: 2025 April 6. Conference End Date: 2025 April 11. Conference Location: Hyderabad, India. Automatic Speech Recognition (ASR) performance for low-resource languages is still far behind that of higher-resource languages such as English, due to a lack of sufficient labeled data. State-of-the-art methods deploy self-supervised transfer learning where a model pre-trained on large amounts of data is fine-tuned using little labeled data in a target low-resource language. In this paper, we present and examine a method for fine-tuning an SSL-based model in order to improve the performance for Frisian and its regional dialects (Clay Frisian, Wood Frisian, and South Frisian). We show that Frisian ASR performance can be improved by using multilingual (Frisian, Dutch, English and German) fine-tuning data and an auxiliary language identification task. In addition, our findings show that performance on dialectal speech suffers substantially, and, importantly, that this effect is moderated by the elicitation approach used to collect the dialectal data. Our findings also particularly suggest that relying solely on standard language data for ASR evaluation may underestimate real-world performance, particularly in languages with substantial dialectal variation. 
653 |a Performance enhancement 
653 |a Multilingualism 
653 |a Automatic speech recognition 
653 |a English language 
653 |a Languages 
653 |a Acoustics 
653 |a Economic 
700 1 |a De Vries, Wietse  |u Center for Language and Cognition University of Groningen,Groningen,The Netherlands 
700 1 |a Yun Hao  |u Center for Language and Cognition University of Groningen,Groningen,The Netherlands 
700 1 |a Dijkstra, Jelske  |u Mercator European Research Centre Fryske Akademy,Friesland,The Netherlands 
700 1 |a Coler, Matt  |u Speech Technology Lab University of Groningen,Friesland,The Netherlands 
700 1 |a Wieling, Martijn  |u Center for Language and Cognition University of Groningen,Groningen,The Netherlands 
773 0 |t The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Conference Proceedings  |g (2025), p. 1-5 
786 0 |d ProQuest  |t Science Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3268870229/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch