The Leaky Abstraction that Should Be: A Framework for Cross-Layer Information Sharing in Network Stacks

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main author: Taraz, Tooraj
Published: ProQuest Dissertations & Theses
Description
Abstract: High Performance Computing (HPC) is the answer to computational needs beyond what a single computer can deliver. The demand for massive computational power sparked the development of large clusters across the world, each with its own unique features and characteristics.
To use these clusters and develop software for them, paradigms such as message passing were adopted, in which nodes or processes share their state by sending messages to each other. As a result, the libraries implementing these paradigms became abstractions over the actual system running the software. These abstractions are expected to run on vastly different systems: clusters built from diverse sets of technologies. To make maintenance of these libraries feasible, avoid duplicated effort across implementations, and deliver portable performance, another layer of abstraction was needed within the communication abstractions to interact with the wide variety of underlying hardware.
Abstractions are necessary to make HPC a reality; however, they do not come without a cost. As the name suggests, an abstraction hides aspects of each side from the other side interacting through it. In this thesis, we ask whether it is possible to have abstractions that serve their main purpose while allowing a controlled flow of information between communicating layers, enabling fine-tuned optimizations and improvements without losing any of the benefits. This thesis introduces a framework for cross-abstraction information sharing and analyzes its implications by applying it to Open MPI, one of the most widely deployed communication abstractions. Owing to its modular design and highly optimized implementation, this library is used by thousands of researchers and scientists across the world in fields such as climate modeling, computational fluid dynamics, molecular dynamics, cosmology, bioinformatics, artificial intelligence, machine learning, and more.
In addition, this dissertation augments Unified Communication X (UCX) with support for a unique networking interface. UCX is heavily used by Open MPI, MPICH (an alternative to Open MPI), the NVIDIA Collective Communications Library (NCCL), and, by extension, many scientific applications. Finally, it demonstrates how the introduced framework, when applied to both of these libraries, enables optimizations and performance gains that would not otherwise have been possible.
ISBN: 9798270207304
Source: ProQuest Dissertations & Theses Global