Resource Selection in Federated Search

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main author: Ergashev, Ulugbek
Published: ProQuest Dissertations & Theses
Online access: Citation/Abstract
Full Text - PDF
Description
Abstract: In today's digital landscape, users often struggle to access relevant information scattered across multiple disparate sources. For example, a healthcare professional might need patient records, medical research papers, and pharmaceutical data stored in separate databases. Such fragmentation causes inefficiencies and delays in critical decisions, highlighting the need for systems that seamlessly aggregate information from diverse repositories. Federated Search (Distributed Information Retrieval) addresses this by allowing users to submit one query and receive integrated results from multiple heterogeneous resources without central indexing. These systems distribute search processes across various nodes, improving scalability and efficiency, which is particularly valuable for academic journals, enterprise systems, and segments of the deep web that are inaccessible to standard search engines. Existing federated search techniques primarily employ term-based statistical methods, such as query-based sampling and resource descriptions based on term distributions. However, these approaches often fail to capture complex semantic relationships and the internal diversity of resources. Representing each resource as a single-point embedding further limits retrieval precision, because one point cannot adequately represent a resource's semantics. This dissertation investigates advanced representation learning methods to enhance resource selection in federated search. Specifically, it addresses the limitations of current approaches by modeling the semantic diversity within resources and intricate query-resource relationships. Building upon previous studies, we integrate sophisticated resource modeling with advanced neural architectures, including Graph Neural Networks (GNNs) and pre-trained language models, to improve retrieval accuracy.
We further propose representing resources as hyperrectangular boxes in latent space, offering richer semantic depiction than single-point embeddings, effectively capturing the internal diversity and varying relevance of resource content. Recognizing the evolving nature of resources and user interests, we incorporate temporal dynamics and user interaction patterns, enabling adaptive and contextually relevant retrieval. Finally, we introduce FedRAGraph, a novel federated search method combining Retrieval-Augmented Generation (RAG) with graph-based indexing and hierarchical community detection. FedRAGraph constructs detailed knowledge graphs, segments them into semantically coherent communities, and generates hierarchical summaries via large language models, achieving superior precision, context-awareness, and iterative reasoning capabilities compared to state-of-the-art methods.
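To make the box representation mentioned in the abstract more concrete, the following is a minimal sketch and not the dissertation's method. It assumes that each resource is summarized by an axis-aligned box fitted to a few sampled document vectors, and that a query is a single point scored by its Euclidean distance to each box. The names `Box`, `fit_box`, `distance_to_box`, and `rank_resources` are hypothetical, and the min/max fitting stands in for whatever learned representation the dissertation actually uses.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned hyperrectangle given by its lower and upper corners."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]


def fit_box(doc_vectors: Sequence[Sequence[float]]) -> Box:
    """Smallest box enclosing the sampled vectors (an illustrative assumption,
    not a learned embedding)."""
    dims = list(zip(*doc_vectors))  # regroup coordinates by dimension
    return Box(tuple(min(d) for d in dims), tuple(max(d) for d in dims))


def distance_to_box(query: Sequence[float], box: Box) -> float:
    """Euclidean distance from the query point to the box; 0.0 if inside."""
    total = 0.0
    for q, lo, hi in zip(query, box.lower, box.upper):
        gap = max(lo - q, 0.0, q - hi)  # how far q lies outside [lo, hi]
        total += gap * gap
    return total ** 0.5


def rank_resources(query: Sequence[float], boxes: Dict[str, Box]) -> List[str]:
    """Resource names ordered from closest box to farthest."""
    return sorted(boxes, key=lambda name: distance_to_box(query, boxes[name]))


# Toy 2-D example with made-up vectors for two hypothetical resources.
samples = {
    "clinical_records": [(0.0, 0.0), (1.0, 2.0)],
    "research_papers": [(4.0, 4.0), (6.0, 5.0)],
}
boxes = {name: fit_box(vectors) for name, vectors in samples.items()}
ranking = rank_resources((0.5, 1.0), boxes)
```

Unlike a single-point embedding, a box has an extent in every dimension, so a resource covering varied content can contain many different query points at distance zero. A learned model would typically replace the hard min/max fit and the plain distance with trainable, differentiable counterparts.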
ISBN: 9798280753938
Source: ProQuest Dissertations & Theses Global