Statistical and Computational Analysis of Adversarial Training

Bibliographic record details
Published in: ProQuest Dissertations and Theses (2025)
Main author: Xie, Yiling
Published: ProQuest Dissertations & Theses
Subjects: Sparsity; Assignment problem; Breast cancer; Simplex method; Oncology; Operations research
Available online: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3275477141
003 UK-CbPIL
020 |a 9798263326234 
035 |a 3275477141 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Xie, Yiling 
245 1 |a Statistical and Computational Analysis of Adversarial Training 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Adversarial training is a powerful tool for hedging against data perturbations and distributional shifts, and it has been widely used in large language models [1, 2], computer vision [3], cybersecurity [4], and other areas. While the empirical risk minimization (ERM) procedure optimizes the empirical loss, the adversarial training procedure seeks conservative solutions that optimize the worst-case loss. In general, there are two ways to define the worst-case loss: Wasserstein-distance-based [5] and perturbation-based [6, 7]. In this thesis, we present a statistical and computational analysis of adversarial training. For the Wasserstein-distance-based adversarial training problem, also known as Wasserstein distributionally robust optimization, we explore both the computational aspects of the Wasserstein distance and the statistical properties of the framework. In the case of perturbation-based adversarial training, our focus is primarily on its statistical properties. We establish computational and statistical foundations of adversarial training, including computational complexity, convergence rates, asymptotic distributions, and minimax optimality. Building on these insights, we propose improvements with provable theoretical guarantees. The two worst-case formulations are sketched below, followed by more detailed descriptions of each chapter.
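The following is a notational sketch only, using generic symbols (θ for the parameter, ℓ for the loss, ρ and ε for the robustness radii) that may differ from the thesis's own notation; it contrasts the ERM objective with the two worst-case formulations named above.

```latex
% Notational sketch with generic symbols; not the thesis's own notation.
\begin{align*}
\text{ERM:}\quad
  & \min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n} \ell(\theta; z_i), \\
\text{Wasserstein-distance-based:}\quad
  & \min_{\theta}\ \sup_{Q\,:\,W(Q,\widehat{P}_n)\le\rho}\
    \mathbb{E}_{Z\sim Q}\big[\ell(\theta; Z)\big], \\
\text{perturbation-based:}\quad
  & \min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n}\
    \max_{\|\delta_i\|\le\varepsilon}\ \ell(\theta; z_i+\delta_i).
\end{align*}
```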
In Chapter 1, we focus on the Wasserstein distance. Computing the empirical Wasserstein distance in the Wasserstein-distance-based independence test can be shown to be an optimal transport (OT) problem with a special structure. This observation motivates us to study this special type of OT problem and to propose a modified Hungarian algorithm that solves it exactly. For an OT problem whose two marginals have m and n atoms (m ≥ n), respectively, the computational complexity of the proposed algorithm is O(m²n). Experimental results demonstrate that the proposed modified Hungarian algorithm compares favorably with the Hungarian algorithm, the well-known Sinkhorn algorithm, and the network simplex algorithm.

In Chapter 2, we focus on Wasserstein-distance-based adversarial training. We propose an adjusted Wasserstein distributionally robust estimator, based on a nonlinear transformation of the classic Wasserstein distributionally robust optimization (WDRO) estimator in statistical learning. The classic WDRO estimator is asymptotically biased, while our adjusted WDRO estimator is asymptotically unbiased, resulting in a smaller asymptotic mean squared error. Further, under certain conditions, the proposed adjustment technique provides a general principle for de-biasing asymptotically biased estimators. We investigate how the adjusted WDRO estimator is developed in generalized linear models, including logistic regression, linear regression, and Poisson regression. Numerical experiments demonstrate the favorable practical performance of the adjusted estimator over the classic one.

In Chapters 3 and 4, we focus on perturbation-based adversarial training. Chapter 3 studies adversarial training under ℓ∞-perturbation, which has recently attracted much research attention. The asymptotic behavior of the adversarial training estimator is investigated in generalized linear models. The results imply that the asymptotic distribution of the adversarial training estimator under ℓ∞-perturbation can place positive probability mass at 0 when the true parameter is 0, providing a theoretical guarantee of the associated sparsity-recovery ability. Alternatively, a two-step procedure, adaptive adversarial training, is proposed, which can further improve the performance of adversarial training under ℓ∞-perturbation; specifically, the proposed procedure achieves asymptotic variable-selection consistency and unbiasedness. Numerical experiments illustrate the sparsity-recovery ability of adversarial training under ℓ∞-perturbation and compare the empirical performance of classic and adaptive adversarial training.

In Chapter 4, we deliver a non-asymptotic consistency analysis of the adversarial training procedure under ℓ∞-perturbation in high-dimensional linear regression. We show that, under the restricted eigenvalue condition, the convergence rate of the prediction error achieves the minimax rate, up to a logarithmic factor, over the class of sparse parameters. Additionally, the group adversarial training procedure is analyzed: compared with classic adversarial training, it is proved that the group adversarial training procedure enjoys a better prediction error upper bound under certain group-sparsity patterns.
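As an illustration of the discrete OT problem underlying Chapter 1: the thesis's modified Hungarian algorithm is not reproduced here, but the same problem can be posed as a linear program, min_P ⟨C, P⟩ subject to P1 = a, Pᵀ1 = b, P ≥ 0, and handed to a generic LP solver. A minimal sketch, with toy marginals and costs chosen arbitrarily:

```python
# Sketch only: discrete OT as a linear program, solved with a generic LP
# solver (not the modified Hungarian algorithm proposed in Chapter 1).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 6, 4                            # marginals with m >= n atoms
a = np.full(m, 1.0 / m)                # source masses
b = np.full(n, 1.0 / n)                # target masses
xs = rng.normal(size=(m, 2))           # source atoms in R^2
ys = rng.normal(size=(n, 2))           # target atoms in R^2
C = ((xs[:, None, :] - ys[None, :, :]) ** 2).sum(-1)  # squared-distance costs

# Flatten the plan P row-major; build row-sum and column-sum constraints.
A_eq = np.zeros((m + n, m * n))
for i in range(m):
    A_eq[i, i * n:(i + 1) * n] = 1.0   # sum_j P[i, j] = a[i]
for j in range(n):
    A_eq[m + j, j::n] = 1.0            # sum_i P[i, j] = b[j]
b_eq = np.concatenate([a, b])

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq,
              bounds=(0, None), method="highs")
P = res.x.reshape(m, n)                # optimal transport plan
print("OT cost:", res.fun)
```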
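Similarly, for the ℓ∞-perturbation setting of Chapters 3 and 4, in linear regression the inner maximization admits the standard closed form max_{‖δ‖∞≤ε} (y − (x+δ)ᵀβ)² = (|y − xᵀβ| + ε‖β‖₁)², which makes the ℓ₁-type shrinkage behind the sparsity-recovery results visible. A minimal sketch under that assumption, with synthetic data and a plain subgradient method (not the thesis's estimators or the adaptive two-step procedure):

```python
# Sketch only: adversarial training for linear regression under an l_inf
# perturbation of radius eps, via the standard closed-form reduction
#   max_{||delta||_inf <= eps} (y - (x + delta)' b)^2
#     = (|y - x' b| + eps * ||b||_1)^2.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 200, 10, 0.1
beta_true = np.zeros(d)
beta_true[:2] = [2.0, -1.5]            # sparse ground truth
X = rng.normal(size=(n, d))
y = X @ beta_true + 0.1 * rng.normal(size=n)

def adv_loss_grad(beta):
    """Value and a subgradient of the closed-form adversarial loss."""
    r = y - X @ beta                           # residuals
    w = np.abs(r) + eps * np.abs(beta).sum()   # worst-case |residual|
    loss = np.mean(w ** 2)
    grad = 2.0 * np.mean(
        w[:, None] * (-np.sign(r)[:, None] * X + eps * np.sign(beta)),
        axis=0)
    return loss, grad

beta = np.zeros(d)
for t in range(2000):                  # subgradient descent, diminishing steps
    _, g = adv_loss_grad(beta)
    beta -= 0.05 / np.sqrt(t + 1) * g

print(np.round(beta, 3))  # inactive coordinates shrink toward 0
```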
653 |a Sparsity 
653 |a Assignment problem 
653 |a Breast cancer 
653 |a Simplex method 
653 |a Oncology 
653 |a Operations research 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3275477141/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3275477141/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch