A BERT-ResNet Cross-Attention Fusion Network and Modality Utilization Assessment for Multimodal Sentiment Classification
| Published in: | ProQuest Dissertations and Theses (2025) |
|---|---|
| Published: | ProQuest Dissertations & Theses |
| Online access: | Citation/Abstract; Full Text - PDF |
| Abstract: | This study explores the growing field of Multimodal Sentiment Analysis (MSA), focusing on how advanced fusion techniques can improve sentiment prediction in social media contexts. As platforms like X and TikTok continue to expand and facilitate sharing sentiment through digital media, there is an increasing need for neural network architectures that can accurately interpret sentiment across modalities. We implement a model that uses BERT for textual features and ResNet for visual features, with a cross-attention fusion module that aligns the modalities into a joint representation. We conduct experiments on the MVSA-Single and MVSA-Multiple datasets, which contain over 5,000 and 17,000 labeled text-image pairs, respectively. Our research explores the interactions between modalities and proposes a sentiment classifier that builds upon and outperforms current baselines, while quantifying the contribution of each modality through an intramodality utilization analysis. |
| ISBN: | 9798314880944 |
| Source: | ProQuest Dissertations & Theses Global |
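
The abstract describes a cross-attention fusion of BERT text features and ResNet image features feeding a sentiment classifier. Below is a minimal PyTorch sketch of that kind of module; the class name `CrossAttentionFusion`, the dimensions (768 for BERT hidden states, 2048 for ResNet feature-map channels), the mean pooling, and the three-way sentiment head (negative/neutral/positive, matching the MVSA label set) are illustrative assumptions, not the thesis's exact architecture.

```python
# A minimal sketch of BERT-ResNet cross-attention fusion as outlined in the
# abstract. All module names, dimensions, and pooling choices are assumptions
# for illustration, not the dissertation's actual design.
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Fuse BERT token features with ResNet region features via cross-attention."""

    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=512,
                 num_heads=8, num_classes=3):
        super().__init__()
        # Project both modalities into a shared hidden space.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.image_proj = nn.Linear(image_dim, hidden_dim)
        # Each direction lets one modality's queries attend over the other.
        self.text_to_image = nn.MultiheadAttention(hidden_dim, num_heads,
                                                   batch_first=True)
        self.image_to_text = nn.MultiheadAttention(hidden_dim, num_heads,
                                                   batch_first=True)
        # The joint representation feeds a small sentiment classifier.
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_feats, image_feats):
        # text_feats:  (batch, seq_len, 768)  e.g. BERT's last hidden state
        # image_feats: (batch, regions, 2048) e.g. a flattened ResNet feature map
        t = self.text_proj(text_feats)
        v = self.image_proj(image_feats)
        # Cross-attention aligns tokens with image regions and vice versa.
        t_attended, _ = self.text_to_image(query=t, key=v, value=v)
        v_attended, _ = self.image_to_text(query=v, key=t, value=t)
        # Pool each attended sequence and concatenate into the joint representation.
        joint = torch.cat([t_attended.mean(dim=1), v_attended.mean(dim=1)], dim=-1)
        return self.classifier(joint)


# Example with random stand-ins for BERT and ResNet outputs:
text_feats = torch.randn(4, 32, 768)    # 4 posts, 32 tokens each
image_feats = torch.randn(4, 49, 2048)  # 4 images, 7x7 = 49 ResNet regions
logits = CrossAttentionFusion()(text_feats, image_feats)
print(logits.shape)  # torch.Size([4, 3])
```

The bidirectional attention is one common way to realize the alignment the abstract mentions: text queries retrieve sentiment-relevant image regions, image queries retrieve sentiment-relevant tokens, and the pooled pair forms the joint representation for classification.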