A High-Precision Hybrid Floating-Point Compute-in-Memory Architecture for Complex Deep Learning

Bibliographic Data
Published in: Electronics vol. 14, no. 22 (2025), p. 4414-4436
Main author: Ma, Zizhao
Other authors: Wang, Chunshan; Chen, Qi; Wang, Yifan; Xie, Yufeng
Published: MDPI AG
Online access: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3275511600
003 UK-CbPIL
022 |a 2079-9292 
024 7 |a 10.3390/electronics14224414  |2 doi 
035 |a 3275511600 
045 2 |b d20250101  |b d20251231 
084 |a 231458  |2 nlm 
100 1 |a Ma, Zizhao 
245 1 |a A High-Precision Hybrid Floating-Point Compute-in-Memory Architecture for Complex Deep Learning 
260 |b MDPI AG  |c 2025 
513 |a Journal Article 
520 3 |a As artificial intelligence (AI) advances, deep learning models are shifting from convolutional architectures to transformer-based structures, which raises the importance of accurate floating-point (FP) computation. Compute-in-memory (CIM) improves matrix-multiplication performance by breaking the von Neumann bottleneck; however, many FP CIM designs struggle to maintain high precision while achieving efficiency. This work proposes a high-precision hybrid floating-point compute-in-memory (Hy-FPCIM) architecture for the Vision Transformer (ViT), built on post-alignment with two different CIM macros: a Bit-wise Exponent Macro (BEM) and a Booth Mantissa Macro (BMM). The high-parallelism BEM implements exponent calculations in-memory efficiently with the Bit-Separated Exponent Summation Unit (BSESU) and the routing-efficient Bit-wise Max Finder (BMF). The high-precision BMM achieves nearly lossless in-memory mantissa computation with efficient Booth-4 encoding and a sense-amplifier-free Flying Mantissa Lookup Table based on 12T triple-port SRAM. The proposed Hy-FPCIM architecture achieves 23.7 TFLOPS/W energy efficiency and 0.754 TFLOPS/mm² area efficiency, with 617 Kb/mm² memory density in 28 nm technology. With these nearly lossless macros, the proposed Hy-FPCIM reaches an accuracy of 81.04% on ImageNet recognition tasks using ViT, a 0.03% decrease from the software baseline. This research presents significant advantages in both accuracy and energy efficiency, providing key technology for complex deep learning applications. 
653 |a Accuracy 
653 |a Computer memory 
653 |a Deep learning 
653 |a Computer architecture 
653 |a Artificial intelligence 
653 |a Lookup tables 
653 |a Multiplication 
653 |a Design 
653 |a Architecture 
653 |a Energy efficiency 
653 |a Arrays 
653 |a Algorithms 
653 |a Machine learning 
653 |a Workloads 
653 |a Floating point arithmetic 
700 1 |a Wang, Chunshan 
700 1 |a Chen, Qi 
700 1 |a Wang, Yifan 
700 1 |a Xie, Yufeng 
773 0 |t Electronics  |g vol. 14, no. 22 (2025), p. 4414-4436 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3275511600/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3275511600/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3275511600/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
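The abstract describes a "post-alignment" hybrid FP scheme: exponents and mantissas take separate paths (the BEM and BMM macros), a max-finder selects the largest product exponent, and mantissa products are shifted into that common scale only after multiplication. As a rough software illustration of that idea only, here is a minimal sketch of a post-aligned floating-point dot product; all function names, bit widths, and details here are illustrative assumptions, not the paper's actual circuit design.

```python
import math

# Hypothetical software model of a post-alignment FP dot product.
# This is NOT the Hy-FPCIM hardware: it only mimics the dataflow the
# abstract describes (separate exponent/mantissa handling, max-exponent
# search, alignment after multiplication). Bit widths are assumptions.

MANT_BITS = 23  # assumed mantissa width, roughly FP32-like

def split_fp(x: float):
    """Decompose x into (exponent, signed integer mantissa) so that
    x == mantissa * 2**(exponent - MANT_BITS). Specials ignored."""
    if x == 0.0:
        return 0, 0
    m, e = math.frexp(x)  # x = m * 2**e with 0.5 <= |m| < 1
    return e, int(m * (1 << MANT_BITS))

def post_aligned_dot(xs, ws):
    """Dot product computed the 'post-alignment' way:
    1) per element, add exponents and multiply mantissas
       (the in-memory step, BEM + BMM in the paper's terms),
    2) find the maximum product exponent (the max-finder's role),
    3) shift every mantissa product into that common scale and sum."""
    prods = []
    for x, w in zip(xs, ws):
        ex, mx = split_fp(x)
        ew, mw = split_fp(w)
        prods.append((ex + ew, mx * mw))
    if not prods:
        return 0.0
    e_max = max(e for e, _ in prods)
    acc = 0
    for e, m in prods:
        acc += m >> (e_max - e)  # align to the max exponent, then accumulate
    # each mantissa product carries 2*MANT_BITS fractional bits
    return acc * 2.0 ** (e_max - 2 * MANT_BITS)

xs = [0.5, -1.25, 3.0]
ws = [2.0, 0.75, -0.125]
print(post_aligned_dot(xs, ws))  # closely tracks sum(x*w) = -0.3125
```

Aligning after multiplication, as sketched here, avoids pre-shifting operands before the array computes, which is one plausible reading of why the paper's scheme keeps mantissa computation nearly lossless; the shifts discard only low-order bits of already-formed products.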