Advancing Parallel Programming Through Program Graph Representation and Unsupervised Code Translation

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: TehraniJamsaz, Ali
Published: ProQuest Dissertations & Theses
Online Access: Citation/Abstract; Full Text - PDF
Description
Abstract: With the advent of multi-core and many-core systems, developers have increasingly focused on creating parallel programs to harness the potential of this hardware. However, developing parallel programs and high-performance kernels presents a unique set of challenges.

Simultaneously, advancements in deep learning (DL) and machine learning (ML) have transformed numerous fields, including software engineering and HPC kernel development. Yet, unlike other domains, applying deep learning models to the HPC field poses distinct difficulties. For example, source code typically exhibits a specific structure, syntax, and semantics, making it challenging to train deep learning models to comprehend these characteristics effectively.

Moreover, beyond the general challenges of applying deep learning to understand applications, comprehending parallel applications is even harder. These applications have unique errors and data-sharing complexities that deep learning models must learn to address.

This dissertation presents four studies aimed at enabling deep learning models to better understand parallel and HPC applications. Each study introduces a novel technique to enhance the ability of DL models to comprehend parallel programs.

The early chapters focus mostly on graph representations of parallel programs, with particular attention to OpenMP applications. The first study examines how to model OpenMP programs in order to predict configurations for non-uniform memory access (NUMA) systems and prefetchers. The second study addresses limitations of the first by identifying flaws in its program representation and improving it. The third study focuses on predicting the runtime of OpenMP applications; it proposes an augmented graph representation based on the Abstract Syntax Tree (AST) to predict the runtime of OpenMP kernels. The last study leverages Transformers and approaches parallelization from a different angle: it casts parallelization as a translation task and develops an encoder-decoder transformer model that learns to perform this translation in an unsupervised way.

The techniques developed in these studies aim to address various challenges of applying deep learning models in the HPC domain. They focus on effectively modeling parallel programs and enabling translation between serial and parallel code. I hope these techniques will inspire further research in this field and help mitigate the challenges inherent to the HPC domain.
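As an illustration of the kind of AST-based program graph the graph-representation chapters describe, the following is a minimal sketch, not the dissertation's implementation: it parses a C/OpenMP file with libclang (clang.cindex) and records parent-child AST edges in a networkx graph. The file name, node attributes, and edge label are illustrative assumptions.

# Minimal sketch (assumption: libclang and networkx are installed); it builds
# a simple AST graph for a C/OpenMP source file, the kind of structure a
# graph neural network could consume. Not the thesis's actual representation.
import clang.cindex
import networkx as nx

def ast_to_graph(source_path: str) -> nx.DiGraph:
    index = clang.cindex.Index.create()
    # Parse with '-fopenmp' so OpenMP pragmas appear as AST nodes
    # (e.g. OMP_PARALLEL_FOR_DIRECTIVE) rather than being ignored.
    tu = index.parse(source_path, args=["-fopenmp"])
    graph = nx.DiGraph()

    def visit(cursor, parent_id=None):
        node_id = graph.number_of_nodes()
        # Use the AST node kind and spelling as node attributes; a real
        # system would attach richer features (types, data-flow edges, ...).
        graph.add_node(node_id, kind=cursor.kind.name, spelling=cursor.spelling)
        if parent_id is not None:
            graph.add_edge(parent_id, node_id, label="ast-child")
        for child in cursor.get_children():
            visit(child, node_id)

    visit(tu.cursor)
    return graph

if __name__ == "__main__":
    g = ast_to_graph("kernel.c")  # hypothetical OpenMP kernel file
    print(g.number_of_nodes(), "AST nodes,", g.number_of_edges(), "edges")

Augmented representations of the kind the abstract mentions typically add further edge types (control flow, data flow) on top of such a plain AST skeleton.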
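For the final study's framing of parallelization as translation, here is a minimal sketch, again not the dissertation's model: it shows how a generic pretrained encoder-decoder code model from Hugging Face exposes the sequence-to-sequence interface such a task relies on. The checkpoint name is a placeholder assumption, and the unsupervised training objectives (for example denoising and back-translation) are not shown, only the inference interface.

# Minimal sketch (assumption: the 'transformers' library and the placeholder
# checkpoint below are available); frames "serial C -> parallel C" as
# sequence-to-sequence generation with an encoder-decoder model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "Salesforce/codet5-small"  # placeholder pretrained code model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

serial_code = """
for (int i = 0; i < n; i++) {
    c[i] = a[i] + b[i];
}
"""

inputs = tokenizer(serial_code, return_tensors="pt")
# Without task-specific training this generates no meaningful OpenMP code;
# an unsupervised setup would train the encoder-decoder with objectives such
# as denoising and back-translation instead of paired serial/parallel data.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))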
ISBN: 9798286446667
Source: ProQuest Dissertations & Theses Global