HierLabelNet: A Two-Stage LLMs Framework with Data Augmentation and Label Selection for Geographic Text Classification

Bibliographic Details
Published in: ISPRS International Journal of Geo-Information vol. 14, no. 7 (2025), p. 268-286
Main Author: Chen Zugang
Other Authors: Zhao, Le
Published: MDPI AG
Online Access: Citation/Abstract
Full Text + Graphics
Full Text - PDF
Description
Abstract: Earth observation data serve as a fundamental resource in Earth system science. The rapid advancement of remote sensing and in situ measurement technologies has led to the generation of massive volumes of data, accompanied by a growing body of geographic textual information. Efficient and accurate classification and management of these geographic texts has become a critical challenge in the field. However, the effectiveness of traditional classification approaches is hindered by several issues, including data sparsity, class imbalance, semantic ambiguity, and the prevalence of domain-specific terminology. To address these limitations and enable the intelligent management of geographic information, this study proposes an efficient geographic text classification framework based on large language models (LLMs), tailored to the unique semantic and structural characteristics of geographic data. Specifically, LLM-based data augmentation strategies are employed to mitigate the scarcity of labeled data and class imbalance. A semantic vector database is utilized to filter the label space prior to inference, enhancing the model’s adaptability to diverse geographic terms. Furthermore, few-shot prompt learning guides LLMs in understanding domain-specific language, while an output alignment mechanism improves classification stability for complex descriptions. This approach offers a scalable solution for the automated semantic classification of geographic text for unlocking the potential of ever-expanding geospatial big data, thereby advancing intelligent information processing and knowledge discovery in the geospatial domain.
ISSN: 2220-9964
Digital Object Identifier: 10.3390/ijgi14070268
Source: Advanced Technologies & Aerospace Database