Dynamic GPU Kernel Scaling in Service-Oriented Architecture

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Collins, Zakery
Published: ProQuest Dissertations & Theses
Online Access: Citation/Abstract; Full Text - PDF
Description
Abstract: Using NVIDIA's Compute Unified Device Architecture (CUDA) C/C++ as well as Python libraries accelerated with a Graphics Processing Unit (GPU), GPU-accelerated computing was tested against traditional Central Processing Unit (CPU) computing to determine whether it is feasible for GPU computing to replace traditional CPU computing. Three experiments were designed to weigh the GPU's computational speedup against the overhead of the memory transfers required to and from the GPU, spanning purely computational tasks, machine learning model training, large-scale data processing, and cloud service creation. The goal was to use these experiments to optimize the GPU kernels to fit the GPU architecture, minimizing execution time and yielding a fair comparison against the CPU counterparts. In cloud services specifically, autoscaling is a major feature for handling varying workloads without wasting resources. The novel contribution of this thesis is an intelligent autoscaling feature that schedules multiple kernels on a single GPU to maximize the resources already available before autoscaling to additional GPU resources.
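The abstract's core idea, packing multiple kernels onto one GPU before provisioning more, can be illustrated with a toy first-fit scheduler. This is a hypothetical sketch, not the thesis's actual scheduler: the function name, the capacity model (each kernel's demand as a fraction of one GPU), and the first-fit policy are all assumptions introduced here for illustration.

```python
# Illustrative sketch only: a first-fit scheduler that packs kernel
# workloads onto already-provisioned GPUs and "autoscales" (adds a GPU)
# only when no existing GPU can host the next kernel. All names and
# numbers are hypothetical, not taken from the thesis.

def schedule(kernels, gpu_capacity):
    """Assign each kernel (resource demand as a fraction of one GPU)
    to the first GPU with enough free capacity; provision a new GPU
    only when no existing one fits. Returns (placements, gpu_count)."""
    free = []        # remaining free capacity on each provisioned GPU
    placement = []   # index of the GPU each kernel was assigned to
    for demand in kernels:
        for i, cap in enumerate(free):
            if demand <= cap:
                free[i] -= demand          # pack onto an existing GPU
                placement.append(i)
                break
        else:
            free.append(gpu_capacity - demand)  # autoscale: new GPU
            placement.append(len(free) - 1)
    return placement, len(free)

placement, n_gpus = schedule([0.4, 0.3, 0.5, 0.2], gpu_capacity=1.0)
# First-fit keeps the fourth kernel on GPU 0 rather than adding a third GPU.
```

Under this toy policy, the four demands above fit on two GPUs (placements [0, 0, 1, 0]); a naive one-kernel-per-GPU autoscaler would have provisioned four.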
ISBN:9798314880050
Source: ProQuest Dissertations & Theses Global