Generalizing Low-Resource Morphology: Cognitive and Neural Perspectives on Inflection
| Published in: | ProQuest Dissertations and Theses (2025) |
|---|---|
| Main author: | |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | |
| Online access: | Citation/Abstract Full Text - PDF |
| Abstract: | State-of-the-art NLP methods that leverage enormous amounts of digital text are transforming the experience of working with computers and accessing the internet for many people. However, for most of the world’s languages, there is insufficient digital data to make recently popular technology like large language models (LLMs) possible. New technologies like LLMs are typically not well suited to underrepresented languages without sufficient digital data, which are often referred to as low-resource languages in NLP. In this case, simpler language technologies like dictionaries, morphological analyzers, and text normalizers are useful. This is especially apparent in language documentary life cycles, the building of educational tools, and the development of language typology databases. With this in mind, we propose techniques for automatically expanding the coverage of morphological databases and develop methods for building morphological tools for the large set of languages with few available resources. We then study the generation capabilities of neural network models that learn from these resources. Finally, we propose methods for training neural networks when only small amounts of data are available, taking inspiration from the recent successes of self-supervised pretraining in high-resource NLP. |
| ISBN: | 9798315702948 |
| Source: | ProQuest Dissertations & Theses Global |