Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages
| Published in: | arXiv.org (May 30, 2023), p. n/a |
|---|---|
| Main Author: | Do, Phat |
| Other Authors: | Coler, Matt; Dijkstra, Jelske; Klabbers, Esther |
| Published: | Cornell University Library, arXiv.org |
| Subjects: | Prediction models; Speech recognition |
| Links: | Citation/Abstract; Full text outside of ProQuest |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 2821493327 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2331-8422 | ||
| 035 | |a 2821493327 | ||
| 045 | 0 | |b d20230530 | |
| 100 | 1 | |a Do, Phat | |
| 245 | 1 | |a Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages | |
| 260 | |b Cornell University Library, arXiv.org |c May 30, 2023 | ||
| 513 | |a Working Paper | ||
| 520 | 3 | |a We train a MOS prediction model based on wav2vec 2.0 using the open-access data sets BVCC and SOMOS. Our test with neural TTS data in the low-resource language (LRL) West Frisian shows that pre-training on BVCC before fine-tuning on SOMOS leads to the best accuracy for both fine-tuned and zero-shot prediction. Further fine-tuning experiments show that using more than 30 percent of the total data does not lead to significant improvements. In addition, fine-tuning with data from a single listener shows promising system-level accuracy, supporting the viability of one-participant pilot tests. These findings can all assist the resource-conscious development of TTS for LRLs by progressing towards better zero-shot MOS prediction and informing the design of listening tests, especially in early-stage evaluation. | |
| 653 | |a Prediction models | ||
| 653 | |a Speech recognition | ||
| 700 | 1 | |a Coler, Matt | |
| 700 | 1 | |a Dijkstra, Jelske | |
| 700 | 1 | |a Klabbers, Esther | |
| 773 | 0 | |t arXiv.org |g (May 30, 2023), p. n/a | |
| 786 | 0 | |d ProQuest |t Engineering Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/2821493327/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch |
| 856 | 4 | 0 | |3 Full text outside of ProQuest |u http://arxiv.org/abs/2305.19396 |