Data-Driven Cloud Cover Parameterizations for the ICON Earth System Model Using Deep Learning and Symbolic Regression
| Published in: | PQDT - Global (2024) |
|---|---|
| Main author: | |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | |
| Online access: | Citation/Abstract; Full Text - PDF; Full text outside of ProQuest |
| Abstract: | A promising approach to improving cloud parameterizations in climate models, and thus climate projections, is to train machine learning algorithms on the coarse-grained output of high-resolution storm-resolving model (SRM) simulations. The ICOsahedral Non-hydrostatic (ICON) modeling framework enables simulations ranging from numerical weather prediction to climate projections, making it an ideal target for developing machine-learning-based parameterizations. The main focus of this thesis is improving the semi-empirical cloud cover parameterization used in the ICON Earth System Model, which diagnoses subgrid-scale fractional cloud cover in every grid cell from large-scale variables based on very simple assumptions. To instead parameterize cloud cover with greater complexity, we first develop three types of neural networks (NNs) that differ in the degree of vertical locality they assume for diagnosing cloud cover. The NNs accurately estimate cloud cover in their training domain, and globally trained NNs can even estimate it for a distinct regional SRM. Using the game-theory-based interpretability library SHapley Additive exPlanations, we analyze our most non-local NN and identify an overemphasis on specific humidity and cloud ice as the reason why it cannot perfectly generalize from global to regional coarse-grained SRM data. The interpretability tool also helps visualize similarities and differences in feature importance between regionally and globally trained NNs, and reveals a local relationship between their cloud cover predictions and the thermodynamic environment. However, while our NNs already achieve excellent predictive performance (R² > 0.9) with as few as three features, they are climate-model-specific and require additional tools for post-hoc interpretation. To avoid these limitations, we combine symbolic regression, sequential feature selection, and physical constraints in a hierarchical modeling framework. Analytical equations derived from this framework are interpretable by construction and easily transferable to other grids or climate models. Our best equation balances performance and complexity, achieving a performance comparable to that of the NNs (R² = 0.94) while remaining simple (only 11 trainable parameters) and physically consistent. It learns to use the vertical relative humidity gradient to detect elusive marine stratocumulus clouds. Furthermore, it reproduces cloud cover distributions more accurately than the Xu-Randall scheme across all cloud regimes (Hellinger distances < 0.09) and matches the NNs in condensate-rich regimes. When applied and fine-tuned to ERA5 reanalysis, the equation exhibits superior transferability compared to all other Pareto-optimal cloud cover schemes. Overall, this thesis shows the potential of deep learning to derive accurate cloud cover parameterizations from global SRMs. It also demonstrates the effectiveness of symbolic regression in discovering interpretable, physically consistent, and nonlinear equations for cloud cover. |
| ISBN: | 9798315722939 |
| Source: | ProQuest Dissertations & Theses Global |
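
The abstract compares cloud cover distributions using Hellinger distances. As a rough illustration of that metric only, the sketch below computes the Hellinger distance between two binned cloud cover distributions. The function names, the 20-bin histogram, and the synthetic data are assumptions made for this example; the record does not state how the thesis bins its data or computes the metric.

```python
import numpy as np


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalize so that raw histogram counts can be passed in directly.
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def cloud_cover_histogram(cloud_cover, n_bins=20):
    """Bin cloud cover fractions in [0, 1]; the bin count is an arbitrary choice for this sketch."""
    counts, _ = np.histogram(cloud_cover, bins=n_bins, range=(0.0, 1.0))
    return counts


# Synthetic stand-ins for reference and predicted cloud cover (not data from the thesis).
rng = np.random.default_rng(0)
reference = rng.beta(0.5, 0.5, size=10_000)
predicted = np.clip(reference + rng.normal(0.0, 0.05, size=10_000), 0.0, 1.0)

distance = hellinger_distance(
    cloud_cover_histogram(reference), cloud_cover_histogram(predicted)
)
print(f"Hellinger distance: {distance:.3f}")
```

A value near 0 means the two histograms are nearly identical, and a value of 1 means they share no populated bins. The result depends on the chosen binning, so distances are only comparable when computed with the same bins.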