Temporal Transformer-Based Video Super-Resolution Reconstruction with Cross-Modal Attention
| Container / Database: | Informatica vol. 49, no. 10 (Feb 2025), p. 179 |
|---|---|
| Main author: | |
| Other authors: | |
| Published by: | Slovenian Society Informatika / Slovensko drustvo Informatika |
| Subjects: | |
| Online access: | Citation/Abstract, Full Text, Full Text - PDF |
| Abstract: | With the increasing demand for high-definition video, video super-resolution has become a key means of improving picture quality. Traditional video super-resolution methods are limited by computational resources and model complexity and struggle to meet the demands of modern video processing. In recent years, deep learning has brought a major breakthrough to video super-resolution. In this paper, we propose a deep learning-based video super-resolution reconstruction method that combines a Transformer, cross-modal learning and fusion, and an attention mechanism. We design the Temporal Transformer-based Video Super-Resolution (TT-VSR) architecture, which improves the accuracy and detail richness of video reconstruction by integrating the Transformer's self-attention mechanism with the spatial feature extraction capabilities of CNNs. The introduction of cross-modal learning and fusion, together with the cross-modal attention mechanism, further enhances the model's adaptability to complex scenes and its ability to recover detail. Experimental results demonstrate that our model outperforms existing methods, achieving a PSNR of X dB and an SSIM of Y, indicating substantial improvements in image quality. These results validate the efficacy of our approach and open a new path for the development of video super-resolution technology. |
|---|---|
| ISSN: | 0350-5596, 1854-3871 |
| DOI: | 10.31449/inf.v49i10.7146 |
| Source: | Advanced Technologies & Aerospace Database |
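
The abstract reports reconstruction quality as PSNR (in dB) and SSIM. As a minimal sketch of how PSNR is conventionally computed, assuming 8-bit pixel intensities (peak value 255) supplied as flat sequences of equal length. The function name `psnr` and its parameters are illustrative and are not taken from the paper, which may use a different evaluation protocol:

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper for illustration; `max_value` assumes 8-bit images.
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixel positions.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)

    # Identical inputs have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_value ** 2 / mse)
```

For video, a common convention is to compute this value per frame and average over the sequence; whether the paper does so is not stated in the abstract.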