Bridging Language Models and Structured Knowledge: Extraction, Representation, and Reasoning

Bibliographic Details
Container / Database: ProQuest Dissertations and Theses (2025)
Main Author: Wang, Zilong
Published: ProQuest Dissertations & Theses
Subjects: Computer science; Linguistics; Artificial intelligence; Information technology
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3192449374
003 UK-CbPIL
020 |a 9798310393288 
035 |a 3192449374 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Wang, Zilong 
245 1 |a Bridging Language Models and Structured Knowledge: Extraction, Representation, and Reasoning 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Structured knowledge (diversely embedded in document images, web pages, and tabular data) presents distinct challenges for language models. Unlike free-form text, structured data encodes meaning through spatial arrangements, hierarchical structures, and relational dependencies, requiring models to extract, interpret, and reason beyond linguistic signals. This dissertation advances the integration of structured knowledge with language models, introducing novel methodologies for document understanding, web mining, and table-based reasoning. We first introduce VRDU, a benchmark for Visually-Rich Document Understanding, designed to evaluate how models extract structured information from business documents with complex layouts and hierarchical entities. By identifying key challenges in template generalization and few-shot adaptation, VRDU provides a more realistic assessment of multimodal language models. Next, we present LASER, a label-aware sequence-to-sequence framework for few-shot entity recognition in document images. By embedding label semantics and spatial relationships directly into the decoding process, LASER enables models to recognize entities with minimal supervision, outperforming traditional sequence-labeling approaches in low-resource scenarios. For web mining, we propose ReXMiner, a zero-shot relation extraction framework that captures structural dependencies within semi-structured web pages. By encoding relative XML paths in the Document Object Model (DOM) tree, ReXMiner improves the generalization of relation extraction across diverse and unseen web templates, demonstrating that structural signals enhance information retrieval from the web. Finally, we introduce Chain-of-Table, a framework for table-based reasoning that evolves tabular data iteratively. Unlike previous approaches that treat tables as static inputs, Chain-of-Table dynamically applies structured transformations, enabling models to reason step by step over tabular data. This approach achieves state-of-the-art performance across multiple benchmarks in table-based question answering and fact verification. Together, these contributions redefine how language models interact with structured knowledge, bridging the gap between unstructured text processing and structured data reasoning. By integrating multimodal signals, relational structures, and iterative reasoning mechanisms, this dissertation lays the foundation for more robust and generalizable models in structured knowledge understanding. 
653 |a Computer science 
653 |a Linguistics 
653 |a Artificial intelligence 
653 |a Information technology 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3192449374/abstract/embedded/BH75TPHOCCPB476R?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3192449374/fulltextPDF/embedded/BH75TPHOCCPB476R?source=fedsrch