A non-anatomical graph structure for boundary detection in continuous sign language

Saved in:
Bibliographic Details
Published in: Scientific Reports (Nature Publisher Group) vol. 15, no. 1 (2025), p. 25683
Main Author: Rastgoo, Razieh
Other Authors: Kiani, Kourosh, Escalera, Sergio
Published:
Nature Publishing Group
Subjects:
Online Access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3230336632
003 UK-CbPIL
022 |a 2045-2322 
024 7 |a 10.1038/s41598-025-11598-3  |2 doi 
035 |a 3230336632 
045 2 |b d20250101  |b d20251231 
084 |a 274855  |2 nlm 
100 1 |a Rastgoo, Razieh  |u Semnan University, Electrical and Computer Engineering Department, Semnan, Iran (GRID:grid.412475.1) (ISNI:0000 0001 0506 807X) 
245 1 |a A non-anatomical graph structure for boundary detection in continuous sign language 
260 |b Nature Publishing Group  |c 2025 
513 |a Journal Article 
520 3 |a Recently, the challenge of detecting the boundaries of isolated signs in a continuous sign video has been studied by researchers. To enhance model performance, replace the handcrafted feature extractor, and also account for the hand structure in these models, we propose a deep learning-based approach combining a Graph Convolutional Network (GCN) and a Transformer model, along with a post-processing mechanism for final boundary detection. More specifically, the proposed approach includes two main steps: pre-training on isolated sign videos and deployment on continuous sign videos. In the first step, the enriched spatial features obtained from the GCN model are fed to the Transformer model to capture the temporal information in the video stream. This model is pre-trained only on pre-processed isolated sign videos with the same frame length. In the second step, a sliding window of pre-defined size is moved over the continuous sign video, which contains un-processed isolated sign videos with different frame lengths. More concretely, the content of each window is processed by the pre-trained model obtained in the first step, and the class probabilities from the Fully Connected (FC) layer embedded in the Transformer model are fed to the post-processing module, which aims to detect the accurate boundaries of the un-processed isolated signs. In addition, we propose a non-anatomical graph structure to better represent hand joint movements and relations during signing. Relying on the proposed non-anatomical hand graph structure as well as the self-attention mechanism in the Transformer model, the proposed model successfully tackles the challenges of boundary detection in continuous sign videos. Experimental results on two datasets show the superiority of the proposed model in isolated sign boundary detection in continuous sign sequences. 
653 |a Connectivity 
653 |a Deep learning 
653 |a Datasets 
653 |a Literature reviews 
653 |a Sign language 
653 |a Neural networks 
653 |a Environmental 
700 1 |a Kiani, Kourosh  |u Semnan University, Electrical and Computer Engineering Department, Semnan, Iran (GRID:grid.412475.1) (ISNI:0000 0001 0506 807X) 
700 1 |a Escalera, Sergio  |u University of Barcelona and Computer Vision Center, Department of Mathematics and Informatics, Barcelona, Spain (GRID:grid.5841.8) (ISNI:0000 0004 1937 0247) 
773 0 |t Scientific Reports (Nature Publisher Group)  |g vol. 15, no. 1 (2025), p. 25683 
786 0 |d ProQuest  |t Science Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3230336632/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3230336632/fulltext/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3230336632/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch