Building the Learning-From-Interaction Pipeline for Large Language Models

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Murty, Shikhar
Published: ProQuest Dissertations & Theses
Subjects: Artificial intelligence; Error analysis; Large language models; Computer engineering
Online Access: Citation/Abstract; Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3238016459
003 UK-CbPIL
020 |a 9798288815157 
035 |a 3238016459 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Murty, Shikhar 
245 1 |a Building the Learning-From-Interaction Pipeline for Large Language Models 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a LLMs have demonstrated remarkable capabilities, and there is growing interest in using them as agents: systems that can translate complex human goals, expressed in natural language, into sequences of actions within digital environments like web browsers. Achieving this requires two core competencies: first, the ability to understand arbitrary and compositional language inputs; and second, the capacity to learn about unfamiliar environments so that language goals can be grounded in effective, multi-step decision-making. This thesis addresses both of these challenges. In the first part, I introduce Tree Projections, a framework for understanding how transformers build compositional structure. I then present a series of results based on Tree Projections that illuminate the mechanisms behind compositional generalization, grokking, and sample-efficient learning in transformers. While Tree Projections help explain successful generalization, prior work has shown that standard transformers struggle with deep recursion due to a lack of mechanisms for unbounded hierarchical depth. To address this, I propose Pushdown Layers, an architectural augmentation that adds a stack-based memory to transformers. Pushdown Layers improve sample efficiency and generalization on tasks requiring nested or recursive reasoning. In the second part, I introduce NNetNav and BAGEL, methods for unsupervised, open-ended exploration in web environments that enable models to automatically collect training data for new websites, without human supervision. Our best results come from fine-tuning LLMs with demonstrations collected via NNetNav, which uses the hierarchical structure of language to guide exploration policies. Using NNetNav, we collect 10,000 demonstrations from 20 real-world websites and fine-tune an 8B model, setting a new state-of-the-art among unsupervised methods and outperforming zero-shot GPT-4 on multiple browser benchmarks. Taken together, these contributions bring us closer to digital language agents that can both handle the complexity of language instructions and autonomously learn from interacting with their environments. 
653 |a Artificial intelligence 
653 |a Error analysis 
653 |a Large language models 
653 |a Computer engineering 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3238016459/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3238016459/fulltextPDF/embedded/75I98GEZK8WCJMPQ?source=fedsrch