Building the Learning-From-Interaction Pipeline for Large Language Models

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Murty, Shikhar
Published: ProQuest Dissertations & Theses
Subjects: Artificial intelligence; Error analysis; Large language models; Computer engineering
Online Access: Citation/Abstract; Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3238016459
003 UK-CbPIL
020 |a 9798288815157 
035 |a 3238016459 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Murty, Shikhar 
245 1 |a Building the Learning-From-Interaction Pipeline for Large Language Models 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a LLMs have demonstrated remarkable capabilities, and there is growing interest in using them as agents: systems that can translate complex human goals, expressed in natural language, into sequences of actions within digital environments like web browsers. Achieving this requires two core competencies: first, the ability to understand arbitrary and compositional language inputs; and second, the capacity to learn about unfamiliar environments so that language goals can be grounded in effective, multi-step decision-making. This thesis addresses both of these challenges. In the first part, I introduce Tree Projections, a framework for understanding how transformers build compositional structure. I then present a series of results based on Tree Projections that illuminate the mechanisms behind compositional generalization, grokking, and sample-efficient learning in transformers. While Tree Projections help explain successful generalization, prior work has shown that standard transformers struggle with deep recursion due to a lack of mechanisms for unbounded hierarchical depth. To address this, I propose Pushdown Layers, an architectural augmentation that adds a stack-based memory to transformers. Pushdown Layers improve sample efficiency and generalization on tasks requiring nested or recursive reasoning. In the second part, I introduce NNetNav and BAGEL, methods for unsupervised, open-ended exploration in web environments that enable models to automatically collect training data for new websites, without human supervision. Our best results come from fine-tuning LLMs with demonstrations collected via NNetNav, which uses the hierarchical structure of language to guide exploration policies. Using NNetNav, we collect 10,000 demonstrations from 20 real-world websites and fine-tune an 8B model, setting a new state-of-the-art among unsupervised methods and outperforming zero-shot GPT-4 on multiple browser benchmarks. Taken together, these contributions bring us closer to digital language agents that can both handle the complexity of language instructions and autonomously learn from interacting with their environments. 
653 |a Artificial intelligence 
653 |a Error analysis 
653 |a Large language models 
653 |a Computer engineering 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3238016459/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3238016459/fulltextPDF/embedded/75I98GEZK8WCJMPQ?source=fedsrch