VOGUE-Based Approach for Segmenting Movement Epenthesis in Continuous Sign Language Recognition

Bibliographic Details
Published in:	Ingenierie des Systemes d'Information vol. 30, no. 11 (Nov 2025), p. 2949-2960
Main Author:	Thillai, Sivakavi S
Other Authors:	Minu, R I
Published:	International Information and Engineering Technology Association (IIETA)
Subjects:	Problem solving; Accuracy; Dynamic programming; Markov chains; Deafness; Algorithms; Sign language; Frames (data processing); Real time; Conditional random fields; Recognition

Description
Abstract:	When developing a continuous sign language recognition (CSLR) system, a significant challenge lies in processing the vast number of video frames, which demands extensive time and computational resources during both the training and prediction phases. To address this, we propose an efficient and scalable methodology that integrates cluster-based key frame extraction with a VOGUE-based recognition model designed for continuous gestures. The key frame extraction strategy clusters visually similar frames to reduce redundancy, preserving only those with high semantic relevance. To further enhance recognition accuracy, we introduce the Key Curvature Maximum Point (KCMP) technique, which identifies pivotal motion points and captures essential hand trajectory changes inherent to sign language. These refined frames are subsequently used to train a VOGUE-based model that encodes spatial and temporal stroke dynamics, followed by probability distribution modeling for robust prediction. The proposed approach was evaluated using a custom-built Tamil Sign Language dataset. Performance was compared against several established baseline methods, including Dynamic Time Warping (DTW), Hidden Markov Models (HMM), and multiple Conditional Random Field (CRF) variants, as well as the VOM model. The system achieved a recognition accuracy of 86.78% and a sign error rate of 5.3%. A paired t-test confirmed that the improvements over baseline models were statistically significant (p < 0.05). These results demonstrate that the proposed framework provides improved efficiency and competitive accuracy, offering a promising solution for real-time CSLR applications, particularly in low-resource regional sign languages.
ISSN:	1633-1311 2116-7125 1290-2926
DOI:	10.18280/isi.301113
Source:	Engineering Database
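
Illustrative note: the abstract describes KCMP as picking pivotal motion points where the hand trajectory changes direction, but it does not give the formula. The sketch below is only one plausible reading, not the authors' method. It assumes a 2D hand trajectory given as (x, y) points per frame, uses the absolute turning angle between consecutive displacement vectors as a stand-in for curvature, and keeps local maxima above a threshold. The function names and the `min_angle` parameter are hypothetical.

```python
import math


def turning_angles(points):
    """Absolute turning angle (radians) at each interior trajectory point.

    Assumption: turning angle approximates curvature; the paper's exact
    KCMP definition is not stated in the abstract.
    """
    angles = [0.0] * len(points)
    for i in range(1, len(points) - 1):
        # Displacement into and out of point i.
        ax, ay = points[i][0] - points[i - 1][0], points[i][1] - points[i - 1][1]
        bx, by = points[i + 1][0] - points[i][0], points[i + 1][1] - points[i][1]
        if (ax == 0 and ay == 0) or (bx == 0 and by == 0):
            continue  # stationary hand: direction undefined, leave angle at 0
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angles[i] = abs(math.atan2(cross, dot))
    return angles


def curvature_peaks(points, min_angle=0.5):
    """Indices of frames whose turning angle is a local maximum >= min_angle."""
    angles = turning_angles(points)
    peaks = []
    for i in range(1, len(points) - 1):
        is_local_max = angles[i] >= angles[i - 1] and angles[i] > angles[i + 1]
        if angles[i] >= min_angle and is_local_max:
            peaks.append(i)
    return peaks


# An L-shaped path: the hand moves right, then turns sharply upward at index 2.
path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(curvature_peaks(path))  # -> [2]
```

In a pipeline like the one summarized above, frames at the returned indices would be candidates to keep alongside the cluster-based key frames; how the two selections are actually combined is not specified in the abstract.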