Design and Implementation of a YOLOv2 Accelerator on a Zynq-7000 FPGA

Bibliographic Details
Published in: Sensors vol. 25, no. 20 (2025), p. 6359-6382
Main Author: Kim, Huimin
Other Authors: Kim Tae-Kyoung
Published:
MDPI AG
Subjects:
Links: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3265945697
003 UK-CbPIL
022 |a 1424-8220 
024 7 |a 10.3390/s25206359  |2 doi 
035 |a 3265945697 
045 2 |b d20250101  |b d20251231 
084 |a 231630  |2 nlm 
100 1 |a Kim, Huimin 
245 1 |a Design and Implementation of a YOLOv2 Accelerator on a Zynq-7000 FPGA 
260 |b MDPI AG  |c 2025 
513 |a Journal Article 
520 3 |a You Only Look Once (YOLO) is a convolutional neural network-based object detection algorithm widely used in real-time vision applications. However, its high computational demand leads to significant power consumption and cost when deployed on graphics processing units (GPUs). Field-programmable gate arrays (FPGAs) offer a low-power alternative, but implementing YOLO efficiently on them requires architecture-level optimization tailored to limited device resources. This study presents an optimized YOLOv2 accelerator for the Zynq-7000 system-on-chip (SoC). The design employs 16-bit integer quantization, a filter reuse structure, an input feature map reuse scheme using a line buffer, and tiling parameter optimization for the convolution and max pooling layers to maximize resource efficiency. In addition, a stall-based control mechanism is introduced to prevent structural hazards in the pipeline. The proposed accelerator was implemented on the Zynq-7000 SoC board, and a system-level evaluation confirmed a negligible accuracy drop of only 0.2% compared with the 32-bit floating-point baseline. Compared with previous YOLO accelerators on the same SoC, the design achieved up to 26% and 15% reductions in flip-flop (FF) and digital signal processor (DSP) usage, respectively. These results demonstrate that deployment on the XC7Z020 is feasible, with 57.27% DSP and 16.55% FF utilization. 
653 |a Design 
653 |a Architecture 
653 |a Random access memory 
653 |a Accuracy 
653 |a Digital signal processors 
653 |a Optimization techniques 
653 |a Field programmable gate arrays 
653 |a Efficiency 
700 1 |a Kim Tae-Kyoung 
773 0 |t Sensors  |g vol. 25, no. 20 (2025), p. 6359-6382 
786 0 |d ProQuest  |t Health & Medical Collection 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3265945697/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3265945697/fulltextwithgraphics/embedded/75I98GEZK8WCJMPQ?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3265945697/fulltextPDF/embedded/75I98GEZK8WCJMPQ?source=fedsrch
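
Note on the abstract: the summary above mentions 16-bit integer quantization but the record gives no details of the scheme. The following is a minimal sketch of one common approach, signed 16-bit fixed-point with saturation, and is not taken from the paper. The function names (quantize_q, dequantize_q, dot_fixed) and the choice of 8 fractional bits are assumptions made purely for illustration.

```python
# Illustrative only: generic signed fixed-point quantization.
# The number format (16 bits total, 8 fractional) is an assumption,
# not the format reported by the paper.

def quantize_q(x, frac_bits=8, total_bits=16):
    """Round a float to a signed fixed-point integer, saturating on overflow."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))        # -32768 for 16 bits
    hi = (1 << (total_bits - 1)) - 1     #  32767 for 16 bits
    q = int(round(x * scale))
    return max(lo, min(hi, q))


def dequantize_q(q, frac_bits=8):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def dot_fixed(weights, activations, frac_bits=8):
    """Integer multiply-accumulate, as a convolution inner loop would do.

    Each product carries 2 * frac_bits fractional bits, so the wide
    accumulator is shifted right once at the end to return to the
    input format.
    """
    acc = 0
    for w, a in zip(weights, activations):
        acc += quantize_q(w, frac_bits) * quantize_q(a, frac_bits)
    return acc >> frac_bits


if __name__ == "__main__":
    result = dot_fixed([0.5, 0.25], [2.0, 4.0])
    print(dequantize_q(result))
```

Keeping the accumulator wider than 16 bits and rescaling once after the loop avoids losing precision on every multiply, which is one reason integer arithmetic of this kind maps well onto FPGA DSP slices.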