A transformer model for boundary detection in continuous sign language

Bibliographic details
Published in: Multimedia Tools and Applications vol. 83, no. 42 (Dec 2024), p. 89931
Published:
Springer Nature B.V.
Subjects: Feature extraction; Accuracy; Video data; Video post-production; Sign language; Boundaries
Links: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3149798171
003 UK-CbPIL
022 |a 1380-7501 
022 |a 1573-7721 
024 7 |a 10.1007/s11042-024-19079-x  |2 doi 
035 |a 3149798171 
045 2 |b d20241201  |b d20241231 
084 |a 108528  |2 nlm 
245 1 |a A transformer model for boundary detection in continuous sign language 
260 |b Springer Nature B.V.  |c Dec 2024 
513 |a Journal Article 
520 3 |a Sign Language Recognition (SLR) has garnered significant attention from researchers in recent years, particularly the intricate domain of Continuous Sign Language Recognition (CSLR), which is more complex than Isolated Sign Language Recognition (ISLR). One of the prominent challenges in CSLR is accurately detecting the boundaries of isolated signs within a continuous video stream. Additionally, the reliance on handcrafted features in existing models makes it difficult to achieve optimal accuracy. To overcome these challenges, we propose a novel approach based on a Transformer model. Unlike traditional models, our approach focuses on enhancing accuracy while eliminating the need for handcrafted features. The Transformer model is employed for both ISLR and CSLR. The model is trained on isolated sign videos: hand keypoint features extracted from the input video are enriched by the Transformer model, and the enriched features are then forwarded to the final classification layer. The trained model, coupled with a post-processing method, is then applied to detect isolated sign boundaries within continuous sign videos. The evaluation of our model, conducted on two distinct datasets that include both continuous signs and their corresponding isolated signs, demonstrates promising results. 
653 |a Feature extraction 
653 |a Accuracy 
653 |a Video data 
653 |a Video post-production 
653 |a Sign language 
653 |a Boundaries 
773 0 |t Multimedia Tools and Applications  |g vol. 83, no. 42 (Dec 2024), p. 89931 
786 0 |d ProQuest  |t ABI/INFORM Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3149798171/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3149798171/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch