Spoken in Jest, Detected in Earnest: A Systematic Review of Sarcasm Recognition—Multimodal Fusion, Challenges, and Future Prospects

Bibliographic Details
Published in: IEEE Transactions on Affective Computing vol. 16, no. 3 (2025), p. 2526-2544
Main Author: Gao, Xiyuan
Other Authors: Nayak, Shekhar; Coler, Matt
Published: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
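Subjects: Linguistics; Human communication; Feature extraction; Datasets; Classification; Machine learning; Systematic review; Personal communication; Speech recognition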
Online Access: Citation/Abstract

MARC

LEADER 00000nab a2200000uu 4500
001 3276023716
003 UK-CbPIL
022 |a 1949-3045 
024 7 |a 10.1109/TAFFC.2025.3612205  |2 doi 
035 |a 3276023716 
045 2 |b d20250101  |b d20251231 
084 |a 267639  |2 nlm 
100 1 |a Gao, Xiyuan  |u Campus Fryslân, University of Groningen, Leeuwarden, CE, The Netherlands 
245 1 |a Spoken in Jest, Detected in Earnest: A Systematic Review of Sarcasm Recognition—Multimodal Fusion, Challenges, and Future Prospects 
260 |b The Institute of Electrical and Electronics Engineers, Inc. (IEEE)  |c 2025 
513 |a Journal Article 
520 3 |a Sarcasm, a common feature of human communication, poses challenges in interpersonal interactions and human-machine interactions. Linguistic research has highlighted the importance of prosodic cues, such as variations in pitch, speaking rate, and intonation, in conveying sarcastic intent. Although previous work has focused on text-based sarcasm detection, the role of speech data in recognizing sarcasm has been underexplored. Recent advancements in speech technology emphasize the growing importance of leveraging speech data for automatic sarcasm recognition, which can enhance social interactions for individuals with neurodegenerative conditions and improve machine understanding of complex human language use, leading to more nuanced interactions. This systematic review is the first to focus on speech-based sarcasm recognition, charting the evolution from unimodal to multimodal approaches. It covers datasets, feature extraction, and classification methods, and aims to bridge gaps across diverse research domains. The findings include limitations in datasets for sarcasm recognition in speech, the evolution of feature extraction techniques from traditional acoustic features to deep learning-based representations, and the progression of classification methods from unimodal approaches to multimodal fusion techniques. In so doing, we identify the need for greater emphasis on cross-cultural and multilingual sarcasm recognition, as well as the importance of addressing sarcasm as a multimodal phenomenon, rather than a text-based challenge. 
653 |a Linguistics 
653 |a Human communication 
653 |a Feature extraction 
653 |a Datasets 
653 |a Classification 
653 |a Machine learning 
653 |a Systematic review 
653 |a Personal communication 
653 |a Speech recognition 
700 1 |a Nayak, Shekhar  |u Campus Fryslân, University of Groningen, Leeuwarden, CE, The Netherlands 
700 1 |a Coler, Matt  |u Campus Fryslân, University of Groningen, Leeuwarden, CE, The Netherlands 
773 0 |t IEEE Transactions on Affective Computing  |g vol. 16, no. 3 (2025), p. 2526-2544 
786 0 |d ProQuest  |t Advanced Technologies & Aerospace Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3276023716/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch