Theoretical Study of Neural Network Models: Error Bounds on Approximation, Generalization, and Optimization With Applications to Regression and Partial Differential Equations
| Published in: | PQDT - Global (2025) |
|---|---|
| Main author: | Lai, Yanming |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | |
| Online access: | Citation/Abstract; Full Text - PDF; Full text outside of ProQuest |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3273657646 | ||
| 003 | UK-CbPIL | ||
| 020 | |a 9798263313067 | ||
| 035 | |a 3273657646 | ||
| 045 | 2 | |b d20250101 |b d20251231 | |
| 084 | |a 189128 |2 nlm | ||
| 100 | 1 | |a Lai, Yanming | |
| 245 | 1 | |a Theoretical Study of Neural Network Models: Error Bounds on Approximation, Generalization, and Optimization With Applications to Regression and Partial Differential Equations | |
| 260 | |b ProQuest Dissertations & Theses |c 2025 | ||
| 513 | |a Dissertation/Thesis | ||
| 520 | 3 | |a The remarkable success of machine learning in computer vision, natural language processing, and related domains has spurred vigorous development of its theoretical foundations. Neural networks are a core model of machine learning, achieving complex function approximation and automatic feature learning by simulating the connections between neurons in the human brain. When machine learning is viewed as a nonparametric estimation method, its error analysis typically comprises three fundamental components: approximation error, generalization error, and optimization error. This thesis presents a comprehensive investigation of these three error types for neural networks, along with convergence rate analysis when applied to solving both regression problems and partial differential equations (PDEs). In the first part of this thesis, we derive the convergence rate for solving regression problems using three-layer logistic overparameterized feedforward neural networks (FNNs) trained with gradient descent (GD). In the second part, within the framework of the Deep Ritz Method (DRM), we derive the convergence rate for solving second-order elliptic equations with three different types of boundary conditions using three-layer overparameterized tanh FNNs trained with projected gradient descent (PGD). In both parts, following the tradition of nonparametric estimation, our error bounds are expressed in terms of the sample size n. Our results also provide a quantitative description of various parameters, including the depth and width of the neural network and the training step size and number of iterations of the optimization algorithm. In the third part of this thesis, we focus on the expressive power (approximation capability) of the Transformer model, which has recently demonstrated strong performance in large language models but remains theoretically underexplored compared with classical FNNs, which have an extensive literature. Specifically, we investigate the approximation of the Hölder continuous function class by Transformers and construct several Transformers that can overcome the curse of dimensionality. These results demonstrate that Transformers possess super expressive power. | |
| 653 | |a Language | ||
| 653 | |a Machine learning | ||
| 653 | |a Sample size | ||
| 653 | |a Neurons | ||
| 653 | |a Partial differential equations | ||
| 653 | |a Deep learning | ||
| 653 | |a Investigations | ||
| 653 | |a Artificial intelligence | ||
| 653 | |a Network management systems | ||
| 653 | |a Computer vision | ||
| 653 | |a Power | ||
| 653 | |a Neural networks | ||
| 653 | |a Brain | ||
| 653 | |a Numerical analysis | ||
| 653 | |a Eigenvalues | ||
| 653 | |a Natural language processing | ||
| 653 | |a Error analysis | ||
| 653 | |a Boundary conditions | ||
| 653 | |a Optimization algorithms | ||
| 653 | |a Data compression | ||
| 653 | |a Mathematics | ||
| 773 | 0 | |t PQDT - Global |g (2025) | |
| 786 | 0 | |d ProQuest |t ProQuest Dissertations & Theses Global | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3273657646/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3273657646/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full text outside of ProQuest |u https://doi.org/10.14711/thesis-hdl152378 |