Large Language Model-Driven Structured Output: A Comprehensive Benchmark and Spatial Data Generation Framework

Bibliographic Details
Published in:	ISPRS International Journal of Geo-Information vol. 13, no. 11 (2024), p. 405
Main Author:	Li, Diya
Other Authors:	Zhao, Yue; Wang, Zhifang; Jung, Calvin; Zhang, Zhe
Published:	MDPI AG
Subjects:	Language; Data analysis; Research methodology; Accuracy; Data processing; Spatial data; Large language models; Human performance; Knowledge discovery; Workflow; Geographic information systems; Information processing; Prompt engineering; Automation; Benchmarks; Efficiency; Natural language

Description
Abstract:	Large language models (LLMs) have demonstrated remarkable capabilities in document processing, data analysis, and code generation. However, the generation of spatial information in a structured and unified format remains a challenge, limiting their integration into production environments. In this paper, we introduce a benchmark for generating structured and formatted spatial outputs from LLMs with a focus on enhancing spatial information generation. We present a multi-step workflow designed to improve the accuracy and efficiency of spatial data generation. The steps include generating spatial data (e.g., GeoJSON) and implementing a novel method for indexing R-tree structures. In addition, we explore and compare a series of methods commonly used by developers and researchers to enable LLMs to produce structured outputs, including fine-tuning, prompt engineering, and retrieval-augmented generation (RAG). We propose new metrics and datasets along with a new method for evaluating the quality and consistency of these outputs. Our findings offer valuable insights into the strengths and limitations of each approach, guiding practitioners in selecting the most suitable method for their specific use cases. This work advances the field of LLM-based structured spatial data output generation and supports the seamless integration of LLMs into real-world applications.
ISSN:	2220-9964
DOI:	10.3390/ijgi13110405
Source:	Advanced Technologies & Aerospace Database
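
Illustrative note: the abstract mentions generating structured spatial data such as GeoJSON and indexing it with R-tree structures. The Python sketch below is not the authors' method or benchmark. It only shows, under stated assumptions, one way a model's text output could be checked for basic GeoJSON structure and reduced to per-feature bounding boxes, which are the rectangles an R-tree typically stores. The function names `iter_positions` and `feature_bboxes` and the sample string are hypothetical, and the check is deliberately minimal (it rejects features without a `coordinates` array, such as a `GeometryCollection`).

```python
import json


def iter_positions(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        # A single position: [x, y] or [x, y, z].
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def feature_bboxes(llm_text):
    """Parse model output as a GeoJSON FeatureCollection (assumed format).

    Returns one (min_x, min_y, max_x, max_y) tuple per feature and raises
    ValueError when the text is not usable.
    """
    try:
        data = json.loads(llm_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or data.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")

    boxes = []
    for feature in data.get("features", []):
        geometry = feature.get("geometry") or {}
        points = list(iter_positions(geometry.get("coordinates", [])))
        if not points:
            raise ValueError("feature has no coordinates")
        xs, ys = zip(*points)
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Hypothetical model output used only for illustration.
sample = (
    '{"type": "FeatureCollection", "features": [{"type": "Feature", '
    '"properties": {}, "geometry": {"type": "LineString", '
    '"coordinates": [[-96.3, 30.6], [-95.4, 29.8]]}}]}'
)
print(feature_bboxes(sample))
```

The resulting tuples could then be inserted into any R-tree implementation. How the paper itself builds or evaluates its index is not stated in this record.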