Understanding and Enhancing AI Model Robustness and Reliability Through Input Perturbation

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Zhang, Chi
Publication: ProQuest Dissertations & Theses
Subjects: Computer science; Computer engineering; Electrical engineering
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3201334715
003 UK-CbPIL
020 |a 9798314867679 
035 |a 3201334715 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Zhang, Chi 
245 1 |a Understanding and Enhancing AI Model Robustness and Reliability Through Input Perturbation 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a AI models, including machine learning models and large language models (LLMs), have demonstrated remarkable capabilities across tasks such as image recognition, coding, and natural language processing (NLP). However, these models are vulnerable to input perturbations (small yet meaningful modifications to input data), which can compromise their performance. Such perturbations fall into two broad categories: natural perturbations (e.g., noise or typos) and adversarial perturbations (intentionally crafted by malicious actors). Despite changing the input’s semantics only minimally, both types can cause severe model misjudgments. At the same time, these perturbations offer a valuable lens through which model reliability and robustness can be studied and improved systematically. The central goal of this thesis is to investigate the effects of input perturbations, develop effective defenses, and leverage perturbations as tools to improve the reliability of AI models. These findings could enhance the robustness and reliability of AI models across domains such as image recognition, natural language processing, and coding tasks. To this end, the research focuses on two complementary directions: defending against perturbations to ensure robustness, and using them as a diagnostic tool to assess model reliability. First, the thesis builds on recent advances in adversarial attacks and defenses to explore robustness failures in AI systems across domains. It examines how perturbations undermine certifiably robust neural networks and cascading classifiers, and how LLMs are affected by both natural and adversarial perturbations. The thesis also proposes a range of defense strategies, including a prompt-based method tailored to LLMs that offers a scalable and cost-effective way to mitigate adversarial inputs without requiring expensive retraining. These contributions support the broader goal of building more robust and reliable AI systems. Second, the thesis goes beyond defense to introduce perturbation-based methods for evaluating internal model reliability. It proposes a method for carefully selecting few-shot examples for LLMs in the context of code vulnerability detection, aiming to improve model accuracy by choosing examples that are relevant and informative. In addition, it presents an approach to estimate the correctness of retrieval-augmented generation (RAG) outputs by analyzing model uncertainty under perturbed inputs, without requiring ground truth. Together, these contributions offer new ways to enhance both the performance and trustworthiness of AI models in real-world scenarios. 
653 |a Computer science 
653 |a Computer engineering 
653 |a Electrical engineering 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3201334715/abstract/embedded/H09TXR3UUZB2ISDL?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3201334715/fulltextPDF/embedded/H09TXR3UUZB2ISDL?source=fedsrch
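
As an illustration of the perturbation-based reliability idea summarized in the abstract, the following minimal Python sketch applies natural perturbations (random typos) to an input and measures how often a toy classifier's prediction stays unchanged, using agreement across perturbed copies as a ground-truth-free proxy for reliability. The classifier, the perturbation rate, and all function names here are hypothetical illustrations, not the methods developed in the thesis.

    # Illustrative sketch: perturbation-based consistency check.
    # The "model" below is a trivial stand-in, used only to show the mechanics
    # of measuring prediction stability under natural (typo) perturbations.
    import random
    from collections import Counter

    def perturb_typo(text, rate=0.05, seed=0):
        """Inject character-level natural perturbations (random typos)."""
        rng = random.Random(seed)
        chars = list(text)
        for i in range(len(chars)):
            if chars[i].isalpha() and rng.random() < rate:
                chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
        return "".join(chars)

    def toy_model(text):
        """Hypothetical stand-in for an AI model's prediction."""
        return "positive" if text.lower().count("good") > text.lower().count("bad") else "negative"

    def consistency_score(text, model, n_samples=20, rate=0.05):
        """Fraction of perturbed copies on which the model agrees with its
        original prediction; low agreement flags an unreliable prediction
        without requiring ground truth."""
        original = model(text)
        votes = Counter(model(perturb_typo(text, rate, seed=i)) for i in range(n_samples))
        return votes[original] / n_samples

    if __name__ == "__main__":
        sample = "The movie was good, really good, though the ending was a bit bad."
        print("prediction:", toy_model(sample))
        print("consistency under typo perturbation:", consistency_score(sample, toy_model))

In the setting described by the abstract, the stand-in classifier would be replaced by an actual model such as an LLM or a RAG pipeline, with agreement under perturbation serving as an uncertainty signal for estimating output correctness.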