Design and Implementation of a YOLOv2 Accelerator on a Zynq-7000 FPGA
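| Published in: | Sensors vol. 25, no. 20 (2025), p. 6359-6382 |
|---|---|
| Main author: | Kim, Huimin |
| Other authors: | Kim Tae-Kyoung |
| Published: | MDPI AG |
| Subjects: | Design; Architecture; Random access memory; Accuracy; Digital signal processors; Optimization techniques; Field programmable gate arrays; Efficiency |
| Links: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |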
MARC
| Tag | Ind1 | Ind2 | Content |
|---|---|---|---|
| LEADER | | | 00000nab a2200000uu 4500 |
| 001 | | | 3265945697 |
| 003 | | | UK-CbPIL |
| 022 | | | \|a 1424-8220 |
| 024 | 7 | | \|a 10.3390/s25206359 \|2 doi |
| 035 | | | \|a 3265945697 |
| 045 | 2 | | \|b d20250101 \|b d20251231 |
| 084 | | | \|a 231630 \|2 nlm |
| 100 | 1 | | \|a Kim, Huimin |
| 245 | 1 | | \|a Design and Implementation of a YOLOv2 Accelerator on a Zynq-7000 FPGA |
| 260 | | | \|b MDPI AG \|c 2025 |
| 513 | | | \|a Journal Article |
| 520 | 3 | | \|a You Only Look Once (YOLO) is a convolutional neural network-based object detection algorithm widely used in real-time vision applications. However, its high computational demand leads to significant power consumption and cost when deployed in graphics processing units. Field-programmable gate arrays offer a low-power alternative. However, their efficient implementation requires architecture-level optimization tailored to limited device resources. This study presents an optimized YOLOv2 accelerator for the Zynq-7000 system-on-chip (SoC). The design employs 16-bit integer quantization, a filter reuse structure, an input feature map reuse scheme using a line buffer, and tiling parameter optimization for the convolution and max pooling layers to maximize resource efficiency. In addition, a stall-based control mechanism is introduced to prevent structural hazards in the pipeline. The proposed accelerator was implemented on the Zynq-7000 SoC board, and a system-level evaluation confirmed a negligible accuracy drop of only 0.2% compared with the 32-bit floating-point baseline. Compared with previous YOLO accelerators on the same SoC, the design achieved up to 26% and 15% reductions in flip-flop and digital signal processor usage, respectively. This result demonstrates feasible deployment on XC7Z020 with DSP 57.27% and FF 16.55% utilization. |
| 653 | | | \|a Design |
| 653 | | | \|a Architecture |
| 653 | | | \|a Random access memory |
| 653 | | | \|a Accuracy |
| 653 | | | \|a Digital signal processors |
| 653 | | | \|a Optimization techniques |
| 653 | | | \|a Field programmable gate arrays |
| 653 | | | \|a Efficiency |
| 700 | 1 | | \|a Kim Tae-Kyoung |
| 773 | 0 | | \|t Sensors \|g vol. 25, no. 20 (2025), p. 6359-6382 |
| 786 | 0 | | \|d ProQuest \|t Health & Medical Collection |
| 856 | 4 | 1 | \|3 Citation/Abstract \|u https://www.proquest.com/docview/3265945697/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch |
| 856 | 4 | 0 | \|3 Full Text + Graphics \|u https://www.proquest.com/docview/3265945697/fulltextwithgraphics/embedded/75I98GEZK8WCJMPQ?source=fedsrch |
| 856 | 4 | 0 | \|3 Full Text - PDF \|u https://www.proquest.com/docview/3265945697/fulltextPDF/embedded/75I98GEZK8WCJMPQ?source=fedsrch |
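
The abstract mentions 16-bit integer quantization but this record does not describe the scheme itself. As a rough illustration of the general idea only, the sketch below assumes a symmetric fixed-point format with 8 fractional bits (Q8.8). The format, the constant `FRAC_BITS`, and the helper names `quantize`, `dequantize`, and `fixed_dot` are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: 16-bit fixed-point quantization of weights and
# activations. The Q8.8 format is an assumption, not the paper's scheme.

FRAC_BITS = 8                        # hypothetical number of fractional bits
SCALE = 1 << FRAC_BITS               # 256
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(value: int) -> int:
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def quantize(x: float) -> int:
    """Scale a float by 2**FRAC_BITS, round to an integer, and saturate."""
    return saturate(round(x * SCALE))


def dequantize(q: int) -> float:
    """Map a Q8.8 integer back to a float."""
    return q / SCALE


def fixed_dot(weights, activations) -> int:
    """Integer dot product of Q8.8 operands, returned in Q8.8.

    Each product carries 2 * FRAC_BITS fractional bits, so the accumulated
    sum is shifted right by FRAC_BITS (rounding toward negative infinity)
    before saturation.
    """
    acc = sum(w * a for w, a in zip(weights, activations))
    return saturate(acc >> FRAC_BITS)


weights = [quantize(v) for v in (0.5, -1.25, 2.0)]
activations = [quantize(v) for v in (1.0, 0.75, -0.5)]
result = fixed_dot(weights, activations)
print(result, dequantize(result))
```

The sample values are chosen to be exactly representable in Q8.8, so the integer result matches the floating-point dot product. In general, quantization introduces a small rounding error, which is the kind of effect the abstract's reported 0.2% accuracy drop refers to.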