Large Language Model-Driven Structured Output: A Comprehensive Benchmark and Spatial Data Generation Framework

Bibliographic Details
Published in: ISPRS International Journal of Geo-Information vol. 13, no. 11 (2024), p. 405
Main Author: Li, Diya
Other Authors: Zhao, Yue; Wang, Zhifang; Jung, Calvin; Zhang, Zhe
Published: MDPI AG
Online Access: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3133059795
003 UK-CbPIL
022 |a 2220-9964 
024 7 |a 10.3390/ijgi13110405  |2 doi 
035 |a 3133059795 
045 2 |b d20240101  |b d20241231 
084 |a 231472  |2 nlm 
100 1 |a Li, Diya  |u Survey123, Esri, Redlands, CA 92374, USA; <email>diya.li@tamu.edu</email> (D.L.); Department of Geography, Texas A&M University, College Station, TX 77840, USA 
245 1 |a Large Language Model-Driven Structured Output: A Comprehensive Benchmark and Spatial Data Generation Framework 
260 |b MDPI AG  |c 2024 
513 |a Journal Article 
520 3 |a Large language models (LLMs) have demonstrated remarkable capabilities in document processing, data analysis, and code generation. However, the generation of spatial information in a structured and unified format remains a challenge, limiting their integration into production environments. In this paper, we introduce a benchmark for generating structured and formatted spatial outputs from LLMs with a focus on enhancing spatial information generation. We present a multi-step workflow designed to improve the accuracy and efficiency of spatial data generation. The steps include generating spatial data (e.g., GeoJSON) and implementing a novel method for indexing R-tree structures. In addition, we explore and compare a series of methods commonly used by developers and researchers to enable LLMs to produce structured outputs, including fine-tuning, prompt engineering, and retrieval-augmented generation (RAG). We propose new metrics and datasets along with a new method for evaluating the quality and consistency of these outputs. Our findings offer valuable insights into the strengths and limitations of each approach, guiding practitioners in selecting the most suitable method for their specific use cases. This work advances the field of LLM-based structured spatial data output generation and supports the seamless integration of LLMs into real-world applications. 
653 |a Language 
653 |a Data analysis 
653 |a Research methodology 
653 |a Accuracy 
653 |a Data processing 
653 |a Spatial data 
653 |a Large language models 
653 |a Human performance 
653 |a Knowledge discovery 
653 |a Workflow 
653 |a Geographic information systems 
653 |a Information processing 
653 |a Prompt engineering 
653 |a Automation 
653 |a Benchmarks 
653 |a Efficiency 
653 |a Natural language 
700 1 |a Zhao, Yue  |u Survey123, Esri, Redlands, CA 92374, USA; <email>diya.li@tamu.edu</email> (D.L.); 
700 1 |a Wang, Zhifang  |u Survey123, Esri, Redlands, CA 92374, USA; <email>diya.li@tamu.edu</email> (D.L.); 
700 1 |a Jung, Calvin  |u Survey123, Esri, Redlands, CA 92374, USA; <email>diya.li@tamu.edu</email> (D.L.); 
700 1 |a Zhang, Zhe  |u Department of Geography, Texas A&M University, College Station, TX 77840, USA 
773 0 |t ISPRS International Journal of Geo-Information  |g vol. 13, no. 11 (2024), p. 405 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3133059795/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3133059795/fulltextwithgraphics/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3133059795/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch