Performance evaluation of the inverse real-valued fast Fourier transform on field programmable gate array platforms using open computing language

Bibliographic Details
Published in:	PeerJ Computer Science (Nov 3, 2025)
Main Author:	Liu, Li
Other Authors:	Yang, Sida; Tan, Haoyu; Zhou, Fengzhan; Yin, Jiantao; Cao, Zishen; Wang, Tianhao; Qian, Zhuo; Gan, Guoyou
Published:	PeerJ, Inc.
Subjects:	Central processing units > CPUs; Computation; Performance evaluation; Fourier transforms; Graphics processing units; Hardware; Fast Fourier transformations; Signal flow graphs; Power management; High level synthesis; Medical equipment; Design; Algorithms; Field programmable gate arrays; Complexity; High performance computing
Online Access:	Citation/Abstract; Full Text + Graphics

Description
Abstract:	The real-valued fast Fourier transform (RFFT) is well suited to high-speed, low-power FFT processors because it requires approximately half the arithmetic operations of the traditional complex-valued FFT (CFFT). Although an RFFT can be computed on CFFT hardware, a dedicated RFFT implementation offers lower hardware complexity, reduced power consumption, and higher throughput. Unlike that of the CFFT, however, the signal flow graph of the RFFT is irregular, which makes efficient pipelined architectures difficult to design. In our previous work, we proposed a high-level programming approach using Open Computing Language (OpenCL) to implement forward RFFT architectures on Field-Programmable Gate Arrays (FPGAs). In this article, we propose a high-level programming approach to implement inverse RFFT architectures on FPGAs. By identifying regular computational patterns in the inverse RFFT flow graph, our method efficiently expresses the algorithm as a for loop, which high-level synthesis tools then fully unroll to automatically generate a pipelined architecture. Experiments show that, for a 4,096-point inverse RFFT, the proposed method achieves a 2.36x speedup and 2.92x better energy efficiency over CUDA FFT (CUFFT) on Graphics Processing Units (GPUs), and a 24.91x speedup and 18.98x better energy efficiency over the Fastest Fourier Transform in the West (FFTW) on Central Processing Units (CPUs). Compared with Intel's CFFT design on the same FPGA, the proposed design uses 9% fewer logic resources while achieving a 1.39x speedup. These results highlight the effectiveness of our approach in optimizing RFFT performance on FPGA platforms.
ISSN:	2376-5992
DOI:	10.7717/peerj-cs.3313
Source:	Advanced Technologies & Aerospace Database
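
Illustrative note: the abstract observes that a real-valued transform can be computed with complex FFT machinery at roughly half the arithmetic cost. The sketch below shows one textbook way to do this for the inverse direction, by packing the N/2+1 Hermitian bins into an N/2-point complex inverse FFT. It is not the pipelined OpenCL architecture described in the article; the function names (`ifft_complex`, `irfft_via_half_size`, `naive_rdft`) are hypothetical, and the code assumes N is a power of two.

```python
import cmath


def ifft_complex(spec):
    """Unnormalised recursive radix-2 inverse FFT; len(spec) must be a power of two."""
    m = len(spec)
    if m == 1:
        return list(spec)
    even = ifft_complex(spec[0::2])  # inverse transform of even-indexed bins
    odd = ifft_complex(spec[1::2])   # inverse transform of odd-indexed bins
    out = [0j] * m
    for n in range(m // 2):
        t = cmath.exp(2j * cmath.pi * n / m) * odd[n]
        out[n] = even[n] + t
        out[n + m // 2] = even[n] - t
    return out


def irfft_via_half_size(half_spec):
    """Inverse real FFT from bins 0..N/2 using one N/2-point complex inverse FFT."""
    half = len(half_spec) - 1  # N/2
    n_total = 2 * half         # N
    packed = []
    for k in range(half):
        a = half_spec[k]
        b = half_spec[half - k].conjugate()  # equals bin k + N/2 by Hermitian symmetry
        even = 0.5 * (a + b)                 # spectrum of the even-indexed samples
        odd = 0.5 * (a - b) * cmath.exp(2j * cmath.pi * k / n_total)  # odd-indexed samples
        packed.append(even + 1j * odd)
    z = ifft_complex(packed)
    samples = []
    for v in z:
        samples.append(v.real / half)  # x[2n]
        samples.append(v.imag / half)  # x[2n + 1]
    return samples


def naive_rdft(x):
    """Reference forward DFT of a real sequence, returning bins 0..N/2 (O(N^2))."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


if __name__ == "__main__":
    signal = [1.0, 2.0, -0.5, 3.0, 0.0, -1.0, 4.0, 2.5]
    recovered = irfft_via_half_size(naive_rdft(signal))
    print(max(abs(a - b) for a, b in zip(signal, recovered)))  # round-trip error
```

The saving comes from running the complex inverse FFT on N/2 points instead of N; the extra pre-processing loop is a single pass over the input bins.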