A Pool Drowning Detection Model Based on Improved YOLO
| Published in: | Sensors vol. 25, no. 17 (2025), p. 5552-5571 |
|---|---|
| Main Author: | |
| Other Authors: | , |
| Published: | MDPI AG |
| Subjects: | |
| Online Access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |
| Abstract: | What are the main findings? (1) The proposed YOLO11-LiB achieves a high drowning class mean average precision (DmAP50) of 94.1% while being extremely lightweight (2.02 M parameters, 4.25 MB size). (2) Key innovations include the LGCBlock for efficient downsampling, the C2PSAiSCSA module for enhanced spatial–channel feature attention, and the BiFF-Net for improved multi-scale feature fusion. What is the implication of the main finding? (1) Addresses critical limitations in real-time drowning detection: poor edge deployment efficiency, limited robustness in complex water environments, and multi-scale object challenges. (2) Provides a high-performance, computationally efficient solution enabling practical real-time surveillance in swimming pool scenarios. Drowning is the leading cause of injury-related fatalities among adolescents. In swimming pool environments, traditional manual surveillance has limitations, and existing wearable devices suffer from poor adaptability. Vision models based on YOLO still face challenges in edge deployment efficiency, robustness in complex water conditions, and multi-scale object detection. To address these issues, we propose YOLO11-LiB, a drowning object detection model based on YOLO11n, featuring three key enhancements. First, we design the Lightweight Feature Extraction Module (LGCBlock), which integrates the Lightweight Attention Encoding Block (LAE) and combines Ghost Convolution (GhostConv) with dynamic convolution (DynamicConv). This optimizes the downsampling structure and the C3k2 module in the YOLO11n backbone network, significantly reducing model parameters and computational complexity. Second, we introduce the Cross-Channel Position-aware Spatial Attention Inverted Residual with Spatial–Channel Separate Attention module (C2PSAiSCSA) into the backbone. This module embeds the Spatial–Channel Separate Attention (SCSA) mechanism within the Inverted Residual Mobile Block (iRMB) framework, enabling more comprehensive and efficient feature extraction. Finally, we redesign the neck structure as the Bidirectional Feature Fusion Network (BiFF-Net), which integrates the Bidirectional Feature Pyramid Network (BiFPN) and Frequency-Aware Feature Fusion (FreqFusion). YOLO11-LiB was compared against mainstream algorithms, and ablation studies were conducted. Experimental results show that YOLO11-LiB achieves a drowning class mean average precision (DmAP50) of 94.1% with only 2.02 M parameters and a model size of 4.25 MB. This represents an effective balance between accuracy and efficiency, providing a high-performance solution for real-time drowning detection in swimming pool scenarios. |
|---|---|
| ISSN: | 1424-8220 |
| DOI: | 10.3390/s25175552 |
| Source: | Health & Medical Collection |
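
The abstract names BiFPN as one ingredient of BiFF-Net but does not spell out how feature maps are combined. As a rough illustration only, the sketch below shows the "fast normalized fusion" rule commonly associated with BiFPN: each input gets a learnable scalar weight, the weights are clipped to be non-negative, and the weighted sum is divided by the weight total. The function name, the flat-list representation of a feature map, and the `eps` value are assumptions made for this example and are not taken from the paper; the FreqFusion component is not modeled.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    features: list of equal-length lists of floats (stand-ins for feature maps)
    weights:  one scalar per feature vector (learnable in a real network)
    eps:      small constant that avoids division by zero (assumed value)
    """
    if len(features) != len(weights):
        raise ValueError("one weight per feature vector is required")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU keeps every weight non-negative.
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped) + eps

    # Weighted sum of the inputs, normalized by the weight total.
    return [
        sum(w * f[j] for w, f in zip(clipped, features)) / total
        for j in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0]   # hypothetical high-resolution features
    deep = [3.0, 4.0]      # hypothetical upsampled deep features
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs; a negative weight is clipped to zero, so that input drops out of the fusion entirely.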