Few-Shot Learning With Domain Spanning for Supradiegetic and Semantic Language Tasks
| Published in: | ProQuest Dissertations and Theses (2025) |
|---|---|
| Published: | ProQuest Dissertations & Theses |
| Online access: | Citation/Abstract; Full Text - PDF |
| ISBN: | 9798291588031 |
| Source: | ProQuest Dissertations & Theses Global |
Abstract:

Our work sets out to rigorously study a large language model's ability to utilize few-shot demonstrations at inference time. To begin, we benchmark the effect of few-shot learning on a language model's ability to model supradiegetic linguistic tasks. These tasks differ from semantic language tasks in that they require information not encoded within the ordinal structure of language found in training data or within any few-shot references; such information includes a human's deciphering of non-verbal body language or tonal inflections, features of language that large language models (LLMs) traditionally lack. We apply few-shot learning to supradiegetic linguistic tasks such as encoding shift ciphers, gematria, and Morse code, decoding so-called "leet"-speak, and matching scrambled letters to their respective words. In this setting we propose a few-shot learning technique we call Domain Spanning, in which we ensure that the input and output domains of our few-shot demonstrations map one-to-one and onto the input and output domains of any supradiegetic task modeled by our large language models. Domain Spanning helps our language models overcome the innate hurdles of working with supradiegetic linguistic information. In the case of GPT-4-turbo, there is little precedent for our result on at least one supradiegetic encoding task, where our few-shot learning framework increases the percentage of correct answers from 0% to nearly 100%. A sketch of such a prompt appears below.
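As a concrete illustration, and not the dissertation's exact implementation, the sketch below shows how a Domain-Spanning few-shot prompt for a shift-cipher encoding task might be assembled: the demonstrations are pangrams, so their characters jointly cover the entire input alphabet, and every symbol the test query can contain appears, with its exact output mapping, in at least one demonstration. The helper names and prompt wording are illustrative assumptions.

```python
import string

def shift_encode(text: str, k: int = 3) -> str:
    """Encode lowercase letters with a shift (Caesar) cipher; leave other characters as-is."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + k) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

def domain_spanning_demos(k: int = 3) -> list[tuple[str, str]]:
    # Demonstrations chosen so their characters jointly cover ("span")
    # the task's input domain: every lowercase letter appears in at least
    # one demonstration, paired with its exact ciphered output.
    pangrams = [
        "the quick brown fox jumps over the lazy dog",
        "pack my box with five dozen liquor jugs",
    ]
    return [(p, shift_encode(p, k)) for p in pangrams]

def build_prompt(query: str, k: int = 3) -> str:
    lines = [f"Encode each line with a shift cipher of +{k}.", ""]
    for src, tgt in domain_spanning_demos(k):
        lines += [f"Input: {src}", f"Output: {tgt}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

print(build_prompt("supradiegetic tasks resist semantics"))
```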
In semantic language tasks, inspired by the principles of Domain Spanning, we propose a novel role-playing framework for language models as well as a novel LLM-as-a-judge framework. We deploy retrieval-augmented generation to create our Domain Spanning references for both applications. When we treat these semantic tasks as supradiegetic tasks that require precise demonstrations without any expectation of generalization, we obtain state-of-the-art results across both applications. Our role-playing models, compared with in-context-learning role-playing models, utilize an average of 35% more tokens found within the few-shot demonstrations at inference, which preserves character alignment when role-playing models are challenged by hostile users; a sketch of how such overlap can be measured follows.
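The 35% figure is a measure of how much generated text is drawn from the demonstrations. As a minimal sketch, assuming simple whitespace tokenization (the dissertation's actual tokenizer and counting rule are not specified here), the overlap fraction could be computed as:

```python
def demo_token_fraction(generation: str, demonstrations: list[str]) -> float:
    """Fraction of generated tokens that also appear in the few-shot demonstrations.

    Whitespace tokenization is an assumption for illustration; a real
    measurement would use the model's own tokenizer.
    """
    demo_vocab = {tok for demo in demonstrations for tok in demo.lower().split()}
    gen_tokens = generation.lower().split()
    if not gen_tokens:
        return 0.0
    hits = sum(tok in demo_vocab for tok in gen_tokens)
    return hits / len(gen_tokens)

# Example: 4 of 5 generated tokens occur in the demonstrations -> 0.8
print(demo_token_fraction(
    "ye olde tavern welcomes thee kindly",
    ["ye olde shoppe", "the tavern welcomes thee"],
))
```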
Our LLM-as-a-judge framework, Crowd Vote, utilizes an ensemble of judges that consistently judge and self-audit, each conditioned on a personality built from few-shot demonstrations using retrieval-augmented generation and Domain Spanning. Crowd Vote differentiates LLM-generated content from human-generated content more accurately than zero-shot baselines, fact-checks answers to college-level exams more accurately, and mitigates task-agnostic biases that arise from flaws in the training data.
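As a minimal sketch of the ensemble-judging idea (the personas, prompt format, and the `llm` and `retrieve_demos` callables are assumptions, not Crowd Vote's published interface):

```python
from collections import Counter
from typing import Callable

def crowd_vote(
    candidate: str,
    personas: list[str],
    retrieve_demos: Callable[[str], list[str]],
    llm: Callable[[str], str],
) -> str:
    """Ensemble LLM-as-a-judge vote in the spirit of Crowd Vote.

    Each judge is conditioned on a persona plus retrieved few-shot
    demonstrations (the Domain Spanning references), then casts a vote;
    the majority label wins.
    """
    votes = []
    for persona in personas:
        demos = "\n".join(retrieve_demos(candidate))  # RAG-built references
        prompt = (
            f"{persona}\n\n"
            f"Reference demonstrations:\n{demos}\n\n"
            f"Text to judge:\n{candidate}\n\n"
            "Answer with exactly one word, HUMAN or LLM:"
        )
        votes.append(llm(prompt).strip().upper())
    return Counter(votes).most_common(1)[0][0]
```

Majority voting over differently conditioned judges is what allows the ensemble to self-audit: a biased individual vote is outvoted when the other personas, grounded in their own retrieved references, disagree.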