A High-Precision Hybrid Floating-Point Compute-in-Memory Architecture for Complex Deep Learning
Saved in:
| Published in: | Electronics vol. 14, no. 22 (2025), p. 4414-4436 |
|---|---|
| Main Author: | Ma Zizhao |
| Other Authors: | Wang, Chunshan; Chen, Qi; Wang, Yifan; Xie Yufeng |
| Published: | MDPI AG |
| Subjects: | Accuracy; Computer memory; Deep learning; Computer architecture; Artificial intelligence; Lookup tables; Multiplication; Design; Architecture; Energy efficiency; Arrays; Algorithms; Machine learning; Workloads; Floating point arithmetic |
| Online Access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3275511600 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2079-9292 | ||
| 024 | 7 | |a 10.3390/electronics14224414 |2 doi | |
| 035 | |a 3275511600 | ||
| 045 | 2 | |b d20250101 |b d20251231 | |
| 084 | |a 231458 |2 nlm | ||
| 100 | 1 | |a Ma Zizhao | |
| 245 | 1 | |a A High-Precision Hybrid Floating-Point Compute-in-Memory Architecture for Complex Deep Learning | |
| 260 | |b MDPI AG |c 2025 | ||
| 513 | |a Journal Article | ||
| 520 | 3 | |a As artificial intelligence (AI) advances, deep learning models are shifting from convolutional architectures to transformer-based structures, which heightens the need for accurate floating-point (FP) computation. Compute-in-memory (CIM) accelerates matrix multiplication by removing the data-movement bottleneck of the von Neumann architecture. However, many floating-point CIM (FPCIM) designs struggle to maintain high precision while achieving high efficiency. This work proposes a high-precision hybrid floating-point compute-in-memory (Hy-FPCIM) architecture for the Vision Transformer (ViT) based on post-alignment across two different CIM macros: a Bit-wise Exponent Macro (BEM) and a Booth Mantissa Macro (BMM). The high-parallelism BEM efficiently implements exponent calculations in-memory with the Bit-Separated Exponent Summation Unit (BSESU) and the routing-efficient Bit-wise Max Finder (BMF). The high-precision BMM achieves nearly lossless mantissa computation in-memory with efficient Booth 4 encoding and a sense-amplifier-free Flying Mantissa Lookup Table built on 12T triple-port SRAM. The proposed Hy-FPCIM architecture achieves 23.7 TFLOPS/W energy efficiency, 0.754 TFLOPS/mm² area efficiency, and 617 Kb/mm² memory density in 28 nm technology. With its nearly lossless design, the proposed Hy-FPCIM reaches 81.04% accuracy on ImageNet recognition tasks using ViT, a decrease of only 0.03% relative to the software baseline. These results offer significant advantages in both accuracy and energy efficiency, providing a key technology for complex deep learning applications. | |
| 653 | |a Accuracy | ||
| 653 | |a Computer memory | ||
| 653 | |a Deep learning | ||
| 653 | |a Computer architecture | ||
| 653 | |a Artificial intelligence | ||
| 653 | |a Lookup tables | ||
| 653 | |a Multiplication | ||
| 653 | |a Design | ||
| 653 | |a Architecture | ||
| 653 | |a Energy efficiency | ||
| 653 | |a Arrays | ||
| 653 | |a Algorithms | ||
| 653 | |a Machine learning | ||
| 653 | |a Workloads | ||
| 653 | |a Floating point arithmetic | ||
| 700 | 1 | |a Wang, Chunshan | |
| 700 | 1 | |a Chen, Qi | |
| 700 | 1 | |a Wang, Yifan | |
| 700 | 1 | |a Xie Yufeng | |
| 773 | 0 | |t Electronics |g vol. 14, no. 22 (2025), p. 4414-4436 | |
| 786 | 0 | |d ProQuest |t Advanced Technologies & Aerospace Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3275511600/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text + Graphics |u https://www.proquest.com/docview/3275511600/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3275511600/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
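The abstract (field 520 above) describes a post-alignment floating-point scheme in which exponent arithmetic and integer mantissa multiplication are handled by separate macros, and the partial products are aligned to a shared maximum exponent only afterwards. The Python sketch below illustrates that general numeric idea in software only; the function name `hybrid_fp_dot`, the `mant_bits` fixed-point width, and the decomposition via `math.frexp` are illustrative assumptions, not the paper's actual hardware dataflow.

```python
import math

def hybrid_fp_dot(a, b, mant_bits=10):
    """Illustrative post-alignment FP dot product.

    Exponents and mantissas are processed separately (mirroring the
    exponent/mantissa macro split described in the abstract); mantissa
    products are aligned to the maximum exponent only after multiplication.
    """
    def decompose(x):
        # x = m * 2**e with 0.5 <= |m| < 1 (or m = 0 for x = 0)
        m, e = math.frexp(x)
        mant = int(round(abs(m) * (1 << mant_bits)))  # fixed-point mantissa
        sign = -1 if x < 0 else 1
        return sign, e, mant

    terms = []
    for x, y in zip(a, b):
        sx, ex, mx = decompose(x)
        sy, ey, my = decompose(y)
        # exponent path: product exponents simply add
        # mantissa path: integer product (done with Booth encoding in hardware)
        terms.append((sx * sy, ex + ey, mx * my))

    # post-alignment: shift every mantissa product down to the largest exponent
    e_max = max(e for _, e, _ in terms)
    acc = sum(s * (m >> (e_max - e)) for s, e, m in terms)

    # scale the integer accumulator back to a floating-point value
    return math.ldexp(acc, e_max - 2 * mant_bits)

# Example: compare against the exact dot product
x = [1.5, -2.25, 0.75]
w = [0.5, 4.0, -3.0]
print(hybrid_fp_dot(x, w), sum(xi * wi for xi, wi in zip(x, w)))
```

In hardware terms, the max-exponent search and the alignment shifts would presumably correspond to the Bit-wise Max Finder and the post-alignment stage, while the integer multiplications map to the Booth-encoded mantissa macro; this sketch captures only the arithmetic, not the in-memory implementation.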