Audio Transcript Segmentation Via Supervised Topic Modeling
| Published in: | PQDT - Global (2025) |
|---|---|
| Main Author: | Serra, Francisco Maria Lopes Pinto Pimentel |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | |
| Online Access: | Citation/Abstract; Full Text - PDF |
MARC
| Field | Ind1 | Ind2 | Content |
|---|---|---|---|
| LEADER | | | 00000nab a2200000uu 4500 |
| 001 | | | 3275478367 |
| 003 | | | UK-CbPIL |
| 020 | | | $a 9798265424792 |
| 035 | | | $a 3275478367 |
| 045 | 2 | | $b d20250101 $b d20251231 |
| 084 | | | $a 189128 $2 nlm |
| 100 | 1 | | $a Serra, Francisco Maria Lopes Pinto Pimentel |
| 245 | 1 | | $a Audio Transcript Segmentation Via Supervised Topic Modeling |
| 260 | | | $b ProQuest Dissertations & Theses $c 2025 |
| 513 | | | $a Dissertation/Thesis |
| 520 | 3 | | $a The proliferation of video content across digital platforms demands automated methods for content segmentation, particularly in long-form broadcasts where traditional visual-based approaches inadequately capture subtle topical transitions. This thesis investigates audio transcript segmentation through supervised topic modeling, comparing clustering-based and transformer-based architectures when adapted for boundary detection tasks. This research develops a comprehensive pipeline that transforms raw broadcast transcripts into topically coherent segments, introducing a novel synthetic dataset generation methodology that addresses the scarcity of ground-truth annotations. The study implements and evaluates two distinct classification paradigms: BERTopic, which combines contextual embeddings with clustering algorithms, and fine-tuned RoBERTa, leveraging deep transformer representations. A paragraph-level sliding window approach facilitates the detection of topical boundaries. Experiments conducted on a corpus derived from broadcast news transcripts reveal counterintuitive findings regarding model transferability. While transformer-based models demonstrate superior performance in document-level topic classification, clustering-based approaches exhibit enhanced sensitivity to local discourse transitions, resulting in more accurate boundary detection. This performance inversion challenges conventional assumptions about the relationship between classification accuracy and segmentation effectiveness. The developed system successfully identifies topical shifts in broadcast content, with practical implications for news media, educational platforms, and streaming services. Integration of the segmentation pipeline into existing content management systems enables enhanced searchability, automated summarization, and improved user navigation. The findings establish empirical baselines for transcript-based segmentation and provide methodological insights for developing multimodal video analysis systems that balance global topical coherence with local transition sensitivity. |
| 653 | | | $a Sparsity |
| 653 | | | $a Data mining |
| 653 | | | $a Real time |
| 653 | | | $a Information overload |
| 653 | | | $a Batch processing |
| 653 | | | $a Clustering |
| 653 | | | $a Artificial intelligence |
| 653 | | | $a Voice recognition |
| 653 | | | $a Neural networks |
| 653 | | | $a Natural language processing |
| 653 | | | $a Multilingualism |
| 653 | | | $a Linguistics |
| 653 | | | $a Streaming media |
| 653 | | | $a Speech |
| 653 | | | $a Semantics |
| 653 | | | $a Bilingual education |
| 653 | | | $a Film studies |
| 653 | | | $a Engineering |
| 773 | 0 | | $t PQDT - Global $g (2025) |
| 786 | 0 | | $d ProQuest $t ProQuest Dissertations & Theses Global |
| 856 | 4 | 1 | $3 Citation/Abstract $u https://www.proquest.com/docview/3275478367/abstract/embedded/IZYTEZ3DIR4FRXA2?source=fedsrch |
| 856 | 4 | 0 | $3 Full Text - PDF $u https://www.proquest.com/docview/3275478367/fulltextPDF/embedded/IZYTEZ3DIR4FRXA2?source=fedsrch |