A BERT-ResNet Cross-Attention Fusion Network and Modality Utilization Assessment for Multimodal Sentiment Classification

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Gold, Ronen G.
Published: ProQuest Dissertations & Theses
Online Access: Citation/Abstract; Full Text - PDF
Description
Abstract: This study explores the growing field of Multimodal Sentiment Analysis (MSA), focusing on how advanced fusion techniques can improve sentiment prediction in social media contexts. As platforms like X and TikTok continue to expand and make it easier to share sentiment through digital media, there is an increasing need for neural network architectures that can accurately interpret sentiment across modalities. We implement a model that uses BERT for textual features and ResNet for visual features; a cross-attention fusion module aligns the two modalities into a joint representation. We conduct experiments on the MVSA-Single and MVSA-Multiple datasets, which contain over 5,000 and 17,000 labeled text-image pairs, respectively. Our research explores the interactions between modalities and proposes a sentiment classifier that builds upon and outperforms current baselines while quantifying the contribution of each modality through an intramodality utilization analysis.
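The abstract does not give implementation details, but the fusion step it describes can be sketched roughly as follows. This is a minimal illustration only: the class name, layer sizes, number of attention heads, mean-pooling, residual connection, and three-class output are assumptions for the sketch, not the dissertation's actual design. It assumes pre-extracted BERT token embeddings (768-dimensional) and a flattened ResNet spatial grid (2048-dimensional region vectors).

import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Hypothetical sketch: fuse BERT token features with ResNet region
    features via cross-attention, then classify sentiment."""

    def __init__(self, text_dim=768, image_dim=2048, num_heads=8, num_classes=3):
        super().__init__()
        # Project ResNet region features into the text embedding space.
        self.image_proj = nn.Linear(image_dim, text_dim)
        # Text tokens act as queries attending over projected image regions.
        self.cross_attn = nn.MultiheadAttention(text_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(text_dim)
        self.classifier = nn.Linear(text_dim, num_classes)

    def forward(self, text_feats, image_feats):
        # text_feats:  (batch, seq_len, 768), e.g. BERT last hidden states
        # image_feats: (batch, regions, 2048), e.g. a flattened 7x7 ResNet grid
        img = self.image_proj(image_feats)
        attended, _ = self.cross_attn(query=text_feats, key=img, value=img)
        fused = self.norm(text_feats + attended)  # residual connection (assumed)
        pooled = fused.mean(dim=1)                # mean-pool over tokens (assumed)
        return self.classifier(pooled)

# Usage with dummy pre-extracted features:
model = CrossAttentionFusion()
text = torch.randn(4, 32, 768)    # 4 samples, 32 tokens each
image = torch.randn(4, 49, 2048)  # 4 samples, 7x7 spatial grid flattened
logits = model(text, image)       # shape: (4, 3)

In a sketch like this, projecting image features into the text space lets text tokens query visual regions directly, which is one common way to realize the cross-modal alignment the abstract describes.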
ISBN: 9798314880944
Source: ProQuest Dissertations & Theses Global