Efficient and Faithful Algorithms for Interpretable Machine Learning

Saved in:
Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Wang, Guanchu
Published:
ProQuest Dissertations & Theses
Subjects:
Online Access: Citation/Abstract
Full Text - PDF
Description
Abstract: As deep learning models continue to grow in complexity and scale, the demand for interpretable machine learning (ML) methods becomes increasingly critical across a wide range of applications. This thesis addresses the challenges of interpreting deep neural networks (DNNs) by designing efficient and faithful algorithms tailored to existing mainstream models: multilayer perceptrons (MLPs), graph neural networks (GNNs), vision transformers (ViTs), and large language models (LLMs). My ultimate goal is to develop frameworks that are not only theoretically grounded but also computationally efficient for interpreting DNN models.

For tabular data modeled by MLPs, we focus on accelerating Shapley value computation, a widely used attribution method rooted in cooperative game theory. While Shapley values provide theoretically sound feature attributions, their exact computation is NP-hard due to the exponential number of input coalitions. To address this, we propose SHEAR (Shapley Explanation Acceleration Rule), a novel approach that leverages a theoretical chain rule to identify a small set of contributive cooperators that preserve attribution accuracy while significantly reducing computation. SHEAR achieves substantial speed-ups without degrading fidelity across several benchmark datasets.

For graph-structured data and GNNs, we propose LARA (Local Attribution via Removal-based Amortization), a fidelity-oriented framework for node attribution. Traditional GNN explanation methods often struggle with high computational cost and low faithfulness, especially on large-scale graphs. LARA addresses these limitations by introducing a bidirectional attribution mechanism that produces explanation-oriented node embeddings. It also incorporates subgraph sampling to enhance scalability and an amortized training mechanism to generalize explanations to unseen nodes.

In the domain of vision models, particularly ViTs, we present TVE (Transferable Vision Explainer) to enable efficient and reusable explanations. Whereas existing vision explainers require retraining for each model and task, TVE introduces the concept of meta-attribution: a generalized, pre-trained attribution representation that can be adapted to diverse downstream tasks without further training. By pretraining TVE on large-scale image datasets, we demonstrate that it generates faithful and transferable explanations for multiple vision architectures, such as ViT, Swin, and DeiT, across several datasets. This pretrain-once, explain-everywhere mechanism offers a scalable solution for vision interpretability in real-world deployments.

Finally, for interpreting LLMs, we focus on improving the faithfulness of natural language explanations. Existing approaches frequently yield inconsistent or unfaithful outputs due to the intrinsic complexity of LLMs and their one-pass generation style. To address this, we propose a novel fidelity metric based on contrary explanations and introduce FaithLM, a self-consistency-based framework that iteratively refines natural language explanations using in-context learning. FaithLM leverages feedback from fidelity evaluations to optimize explanation prompts, yielding explanations that align more closely with model decisions.

Together, these contributions constitute a comprehensive study advancing the interpretability of modern AI systems across multiple data modalities. This thesis can not only improve the transparency and trustworthiness of ML models but also lay the groundwork for safe and responsible AI deployment in high-stakes domains.
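
For context on the complexity claim in the abstract: the Shapley value of a feature i under a coalition value function v over the feature set N (with |N| = n) is the standard cooperative-game definition below. The sum ranges over all 2^(n-1) coalitions excluding i, which is why exact computation scales exponentially. This is the textbook formula, not the SHEAR-specific chain rule, whose details are not given in this record.

    \phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!} \bigl[ v(S \cup \{i\}) - v(S) \bigr]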
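
To illustrate why approximation or cooperator selection is needed in practice, the following is a minimal Python sketch of a generic permutation-sampling Shapley estimator. It is a standard Monte Carlo baseline, not the SHEAR algorithm described in the thesis, and the value function here is a toy placeholder.

    import numpy as np

    def shapley_permutation_estimate(value_fn, n_features, n_samples=200, seed=0):
        """Estimate Shapley values by averaging marginal contributions
        over random feature orderings (a generic Monte Carlo baseline,
        not the SHEAR method described in the thesis)."""
        rng = np.random.default_rng(seed)
        phi = np.zeros(n_features)
        for _ in range(n_samples):
            order = rng.permutation(n_features)
            coalition = []
            prev = value_fn(frozenset(coalition))
            for i in order:
                coalition.append(i)
                cur = value_fn(frozenset(coalition))
                phi[i] += cur - prev  # marginal contribution of feature i in this ordering
                prev = cur
        return phi / n_samples

    # Toy usage: the value of a coalition is the sum of its members' weights,
    # so the estimated Shapley values converge to the weights themselves.
    weights = np.array([0.5, 1.0, 2.0])
    phi = shapley_permutation_estimate(lambda s: sum(weights[j] for j in s), 3)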
ISBN: 9798297610019
Source: ProQuest Dissertations & Theses Global