Audio Transcript Segmentation Via Supervised Topic Modeling

Gardado en:
Detalles Bibliográficos
Publicado en:PQDT - Global (2025)
Autor Principal: Serra, Francisco Maria Lopes Pinto Pimentel
Publicado:
ProQuest Dissertations & Theses
Materias:
Acceso en liña:Citation/Abstract
Full Text - PDF
Etiquetas: Engadir etiqueta
Sen Etiquetas, Sexa o primeiro en etiquetar este rexistro!
Descripción
Resumo:The proliferation of video content across digital platforms demands automated methods for content segmentation, particularly in long-form broadcasts where traditional visual-based approaches inadequately capture subtle topical transitions. This thesis investigates audio transcript segmentation through supervised topic modeling, comparing clustering-based and transformer-based architectures when adapted for boundary detection tasks. This research develops a comprehensive pipeline that transforms raw broadcast transcripts into topically coherent segments, introducing a novel synthetic dataset generation methodology that addresses the scarcity of ground-truth annotations. The study implements and evaluates two distinct classification paradigms: BERTopic, which combines contextual embeddings with clustering algorithms, and fine-tuned RoBERTa, leveraging deep transformer representations. A paragraph-level sliding window approach facilitates the detection of topical boundaries. Experiments conducted on a corpus derived from broadcast news transcripts reveal counterintuitive findings regarding model transferability. While transformer-based models demonstrate superior performance in document-level topic classification, clustering-based approaches exhibit enhanced sensitivity to local discourse transitions, resulting in more accurate boundary detection. This performance inversion challenges conventional assumptions about the relationship between classification accuracy and segmentation effectiveness. The developed system successfully identifies topical shifts in broadcast content, with practical implications for news media, educational platforms, and streaming services. Integration of the segmentation pipeline into existing content management systems enables enhanced searchability, automated summarization, and improved user navigation. The findings establish empirical baselines for transcript-based segmentation and provide methodological insights for developing multimodal video analysis systems that balance global topical coherence with local transition sensitivity.
ISBN:9798265424792
Fonte:ProQuest Dissertations & Theses Global