Text-Guided Object Detection Accuracy Enhancement Method Based on Improved YOLO-World
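| Published in: | Electronics vol. 14, no. 1 (2025), p. 133 |
|---|---|
| Main author: | Ding, Qian |
| Other authors: | Zhang, Enzheng; Liu, Zhiguo; Yao, Xinhai; Pan, Gaofeng |
| Published: | MDPI AG |
| Subjects: | Accuracy; Datasets; Deep learning; Image resolution; Neural networks; Target detection; Robots; Algorithms; Telematics; Object recognition; Artificial intelligence; Localization; Image processing; Visual perception driven algorithms; Representations; Efficiency; Natural language |
| Online access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |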
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3153798643 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2079-9292 | ||
| 024 | 7 | |a 10.3390/electronics14010133 |2 doi | |
| 035 | |a 3153798643 | ||
| 045 | 2 | |b d20250101 |b d20251231 | |
| 084 | |a 231458 |2 nlm | ||
| 100 | 1 | |a Ding, Qian | |
| 245 | 1 | |a Text-Guided Object Detection Accuracy Enhancement Method Based on Improved YOLO-World | |
| 260 | |b MDPI AG |c 2025 | ||
| 513 | |a Journal Article | ||
| 520 | 3 | |a In intelligent human–robot interaction scenarios, rapidly and accurately searching for and recognizing specific targets is essential for enhancing robot operation and navigation capabilities, as well as achieving effective human–robot collaboration. This paper proposes an improved YOLO-World method with an integrated attention mechanism for text-guided object detection, aiming to boost visual detection accuracy. The method incorporates SPD-Conv modules into the YOLOv8 backbone to enhance low-resolution image processing and feature representation for small and medium-sized targets. Additionally, EMA is introduced to improve the visual feature representation guided by the text, and spatial attention focuses the model on image areas related to the text, enhancing its perception of specific target regions described in the text. The improved YOLO-World method with the attention mechanism is detailed in the paper. Comparative experiments with four advanced object detection algorithms on COCO and a custom dataset show that the proposed method not only significantly improves object detection accuracy but also exhibits good generalization capabilities in varying scenes. This research offers a reference for high-precision object detection and provides technical solutions for applications requiring accurate object detection, such as human–robot interaction and artificial intelligence robots. | |
| 653 | |a Accuracy | ||
| 653 | |a Datasets | ||
| 653 | |a Deep learning | ||
| 653 | |a Image resolution | ||
| 653 | |a Neural networks | ||
| 653 | |a Target detection | ||
| 653 | |a Robots | ||
| 653 | |a Algorithms | ||
| 653 | |a Telematics | ||
| 653 | |a Object recognition | ||
| 653 | |a Artificial intelligence | ||
| 653 | |a Localization | ||
| 653 | |a Image processing | ||
| 653 | |a Visual perception driven algorithms | ||
| 653 | |a Representations | ||
| 653 | |a Efficiency | ||
| 653 | |a Natural language | ||
| 700 | 1 | |a Zhang, Enzheng | |
| 700 | 1 | |a Liu, Zhiguo | |
| 700 | 1 | |a Yao, Xinhai | |
| 700 | 1 | |a Pan, Gaofeng | |
| 773 | 0 | |t Electronics |g vol. 14, no. 1 (2025), p. 133 | |
| 786 | 0 | |d ProQuest |t Advanced Technologies & Aerospace Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3153798643/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text + Graphics |u https://www.proquest.com/docview/3153798643/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3153798643/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |