Text-Guided Object Detection Accuracy Enhancement Method Based on Improved YOLO-World

Bibliographic details
Published in: Electronics vol. 14, no. 1 (2025), p. 133
Main author: Ding, Qian
Other authors: Zhang, Enzheng; Liu, Zhiguo; Yao, Xinhai; Pan, Gaofeng
Published: MDPI AG
Online access: Citation/Abstract
Full Text + Graphics
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3153798643
003 UK-CbPIL
022 |a 2079-9292 
024 7 |a 10.3390/electronics14010133  |2 doi 
035 |a 3153798643 
045 2 |b d20250101  |b d20251231 
084 |a 231458  |2 nlm 
100 1 |a Ding, Qian 
245 1 |a Text-Guided Object Detection Accuracy Enhancement Method Based on Improved YOLO-World 
260 |b MDPI AG  |c 2025 
513 |a Journal Article 
520 3 |a In intelligent human–robot interaction scenarios, rapidly and accurately searching and recognizing specific targets is essential for enhancing robot operation and navigation capabilities, as well as achieving effective human–robot collaboration. This paper proposes an improved YOLO-World method with an integrated attention mechanism for text-guided object detection, aiming to boost visual detection accuracy. The method incorporates SPD-Conv modules into the YOLOV8 backbone to enhance low-resolution image processing and feature representation for small and medium-sized targets. Additionally, EMA is introduced to improve the visual feature representation guided by the text, and spatial attention focuses the model on image areas related to the text, enhancing its perception of specific target regions described in the text. The improved YOLO-World method with attention mechanism is detailed in the paper. Comparative experiments with four advanced object detection algorithms on COCO and a custom dataset show that the proposed method not only significantly improves object detection accuracy but also exhibits good generalization capabilities in varying scenes. This research offers a reference for high-precision object detection and provides technical solutions for applications requiring accurate object detection, such as human–robot interaction and artificial intelligence robots. 
653 |a Accuracy 
653 |a Datasets 
653 |a Deep learning 
653 |a Image resolution 
653 |a Neural networks 
653 |a Target detection 
653 |a Robots 
653 |a Algorithms 
653 |a Telematics 
653 |a Object recognition 
653 |a Artificial intelligence 
653 |a Localization 
653 |a Image processing 
653 |a Visual perception driven algorithms 
653 |a Representations 
653 |a Efficiency 
653 |a Natural language 
700 1 |a Zhang, Enzheng 
700 1 |a Liu, Zhiguo 
700 1 |a Yao, Xinhai 
700 1 |a Pan, Gaofeng 
773 0 |t Electronics  |g vol. 14, no. 1 (2025), p. 133 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3153798643/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text + Graphics  |u https://www.proquest.com/docview/3153798643/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3153798643/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch