From Latent Knowledge Gathering to Side Information Injection in Discrete Sequential Models

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2024)
Main author: Rezaee Taghiabadi, Mohammad Mehdi
Published: ProQuest Dissertations & Theses
Online access: Citation/Abstract
Full Text - PDF
Description
Abstract: Representation learning is crucial for processing sequential and discrete data, such as text in natural language processing (NLP). From classical methods like topic modeling to modern transformer-based architectures, the goal is to use data to learn richer representations. This thesis focuses on two primary strategies: Latent Knowledge Gathering, which uses clustering techniques to extract semantic knowledge from training data, and Injecting Background Information, where structural priors such as pretrained models are employed to enhance learning.

The encoding process transforms high-dimensional documents into compact, low-dimensional representations optimized to capture the information vital for various language tasks. In document classification, for instance, both the encoder and the decoder play critical roles, especially when data is limited. Our experiments assess model capabilities across different data regimes, emphasizing efficient representation in the situation entity classification task.

Thematic analysis has seen advances; however, the extraction of word-level thematic topics and the use of auxiliary knowledge are often overlooked. We propose a novel approach that combines topic models with recurrent neural networks (RNNs) to maintain and exploit lower-level representations, enhancing natural language generation. Comparative experiments show that this method achieves state-of-the-art performance by effectively retaining and using word-level topic assignments.

Additionally, we explore structured, discrete, semi-supervised variational autoencoders that leverage incomplete and noisy side knowledge to guide text representation. This method robustly handles varying levels of observed side knowledge and consistently improves performance on language modeling and classification metrics.

Finally, we introduce a universal framework for integrating discrete side information based on the information bottleneck principle. Through extensive theoretical and empirical studies, including a case study on event modeling, we show that this framework significantly improves performance and provides a robust foundation for future research on integrating noisy and incomplete side knowledge.
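For orientation, the information bottleneck principle mentioned above learns a representation T of an input X that is maximally compressed while remaining predictive of a target Y, by minimizing I(X; T) - beta * I(T; Y) over the encoder p(t|x) (Tishby et al., 1999). The sketch below is a minimal, hypothetical illustration of the common variational relaxation of this objective with a continuous Gaussian latent (Alemi et al., 2017); the thesis works with discrete latents and side information, so this is not its implementation, and the function name, tensor shapes, and beta value are assumptions.

    import torch
    import torch.nn.functional as F

    def vib_loss(mu, logvar, logits, labels, beta=1e-3):
        """Variational information-bottleneck loss (hypothetical Gaussian-latent sketch).

        mu, logvar parameterize the encoder posterior q(t|x) = N(mu, exp(logvar));
        logits are the decoder's predictions for the target y. Minimizing
        ce + beta * kl matches the objective above up to a rescaling of beta.
        """
        # Upper bound on I(X; T): KL(q(t|x) || N(0, I)), averaged over the batch.
        kl = -0.5 * torch.mean(
            torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
        # Lower bound on I(T; Y), up to a constant: decoder cross-entropy.
        ce = F.cross_entropy(logits, labels)
        return ce + beta * kl

The beta knob directly controls the compression-prediction trade-off: larger values force T to discard more of X, which is what makes the framework a natural fit for filtering noisy or incomplete side information.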
ISBN:9798382744445
Source: ProQuest Dissertations & Theses Global