Audio Transcript Segmentation Via Supervised Topic Modeling

I tiakina i:
Ngā taipitopito rārangi puna kōrero
I whakaputaina i:PQDT - Global (2025)
Kaituhi matua: Serra, Francisco Maria Lopes Pinto Pimentel
I whakaputaina:
ProQuest Dissertations & Theses
Ngā marau:
Urunga tuihono:Citation/Abstract
Full Text - PDF
Ngā Tūtohu: Tāpirihia he Tūtohu
Kāore He Tūtohu, Me noho koe te mea tuatahi ki te tūtohu i tēnei pūkete!
Whakaahuatanga
Whakarāpopotonga:The proliferation of video content across digital platforms demands automated methods for content segmentation, particularly in long-form broadcasts where traditional visual-based approaches inadequately capture subtle topical transitions. This thesis investigates audio transcript segmentation through supervised topic modeling, comparing clustering-based and transformer-based architectures when adapted for boundary detection tasks. This research develops a comprehensive pipeline that transforms raw broadcast transcripts into topically coherent segments, introducing a novel synthetic dataset generation methodology that addresses the scarcity of ground-truth annotations. The study implements and evaluates two distinct classification paradigms: BERTopic, which combines contextual embeddings with clustering algorithms, and fine-tuned RoBERTa, leveraging deep transformer representations. A paragraph-level sliding window approach facilitates the detection of topical boundaries. Experiments conducted on a corpus derived from broadcast news transcripts reveal counterintuitive findings regarding model transferability. While transformer-based models demonstrate superior performance in document-level topic classification, clustering-based approaches exhibit enhanced sensitivity to local discourse transitions, resulting in more accurate boundary detection. This performance inversion challenges conventional assumptions about the relationship between classification accuracy and segmentation effectiveness. The developed system successfully identifies topical shifts in broadcast content, with practical implications for news media, educational platforms, and streaming services. Integration of the segmentation pipeline into existing content management systems enables enhanced searchability, automated summarization, and improved user navigation. The findings establish empirical baselines for transcript-based segmentation and provide methodological insights for developing multimodal video analysis systems that balance global topical coherence with local transition sensitivity.
ISBN:9798265424792
Puna:ProQuest Dissertations & Theses Global