ZCCL: Advancing Exascale Collective Communications With Co-Designed Compression

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Huang, Jiajun
Published:
ProQuest Dissertations & Theses
Subjects: Computer science; Computer engineering; Communication
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3231995875
003 UK-CbPIL
020 |a 9798288854569 
035 |a 3231995875 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Huang, Jiajun 
245 1 |a ZCCL: Advancing Exascale Collective Communications With Co-Designed Compression 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Scaling massively parallel computing tasks, such as scientific applications and LLMs, is critically constrained by the efficiency of collective communications. This efficiency is increasingly bottlenecked by network bandwidth, which struggles to keep pace with the rapid growth in computational power and communication data volumes. Furthermore, the heterogeneous architectures of modern supercomputers and the complexity of communication pipelines further exacerbate these efficiency challenges. To address these challenges, I pioneered a new research direction: advancing exascale collective communications with co-designed compression techniques. Under this direction, I developed ZCCL, a family of four novel frameworks that significantly improve communication efficiency across CPU and GPU clusters. The first framework, C-Coll, leverages error-bounded lossy compression to substantially reduce message sizes, thus improving communication performance. The second, gZCCL, presents GPU-aware, compression-enabled collectives that are optimized to achieve both high performance and data accuracy on GPU clusters. The third framework, hZCCL, introduces the first homomorphic compression-communication co-design. It enables direct computation and communication on compressed data, thereby removing the costly decompression and recompression steps required by both C-Coll and gZCCL. Finally, ghZCCL proposes the first GPU-based homomorphic compressor and GPU-aware homomorphic compression-accelerated collectives, offering substantial improvements in both GPU compression and communication efficiency. Together, the ZCCL family significantly outperforms state-of-the-art communication libraries, including the NVIDIA Collective Communications Library (NCCL), Cray-MPI, and MPICH, while maintaining high data accuracy. 
653 |a Computer science 
653 |a Computer engineering 
653 |a Communication 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t Publicly Available Content Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3231995875/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3231995875/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch