VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT
| Published in: | Applied Sciences vol. 14, no. 5 (2024), p. 1894 |
|---|---|
| Main Author: | Xu, Yifang |
| Other Authors: | Sun, Yunzhuo; Xie, Zien; Zhai, Benxiang; Du, Sidan |
| Publisher: | MDPI AG |
| Subjects: | Language; Design; Methods; Linguistics; Annotations; Queries; Proposals; Natural language; Bias; Prejudice |
| Online Access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |
MARC
| Tag | Ind1 | Ind2 | Content |
|---|---|---|---|
| LEADER | | | 00000nab a2200000uu 4500 |
| 001 | | | 2955469495 |
| 003 | | | UK-CbPIL |
| 022 | | | $a 2076-3417 |
| 024 | 7 | | $a 10.3390/app14051894 $2 doi |
| 035 | | | $a 2955469495 |
| 045 | 2 | | $b d20240101 $b d20241231 |
| 084 | | | $a 231338 $2 nlm |
| 100 | 1 | | $a Xu, Yifang $u School of Electronic Science and Engineering, Nanjing University, Nanjing 210093, China; xyf@smail.nju.edu.cn (Y.X.); xze@smail.nju.edu.cn (Z.X.); zbx@smail.nju.edu.cn (B.Z.) |
| 245 | 1 | | $a VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT |
| 260 | | | $b MDPI AG $c 2024 |
| 513 | | | $a Journal Article |
| 520 | 3 | | $a Video temporal grounding (VTG) aims to locate specific temporal segments from an untrimmed video based on a linguistic query. Most existing VTG models are trained on extensive annotated video-text pairs, a process that not only introduces human biases from the queries but also incurs significant computational costs. To tackle these challenges, we propose VTG-GPT, a GPT-based method for zero-shot VTG without training or fine-tuning. To reduce prejudice in the original query, we employ Baichuan2 to generate debiased queries. To lessen redundant information in videos, we apply MiniGPT-v2 to transform visual content into more precise captions. Finally, we devise the proposal generator and post-processing to produce accurate segments from debiased queries and image captions. Extensive experiments demonstrate that VTG-GPT significantly outperforms SOTA methods in zero-shot settings and surpasses unsupervised approaches. More notably, it achieves competitive performance comparable to supervised methods. The code is available on GitHub. |
| 653 | | | $a Language |
| 653 | | | $a Design |
| 653 | | | $a Methods |
| 653 | | | $a Linguistics |
| 653 | | | $a Annotations |
| 653 | | | $a Queries |
| 653 | | | $a Proposals |
| 653 | | | $a Natural language |
| 653 | | | $a Bias |
| 653 | | | $a Prejudice |
| 700 | 1 | | $a Sun, Yunzhuo $u School of Physics and Electronics, Hubei Normal University, Huangshi 435002, China; sunyunzhuo98@gmail.com |
| 700 | 1 | | $a Xie, Zien $u School of Electronic Science and Engineering, Nanjing University, Nanjing 210093, China; xyf@smail.nju.edu.cn (Y.X.); xze@smail.nju.edu.cn (Z.X.); zbx@smail.nju.edu.cn (B.Z.) |
| 700 | 1 | | $a Zhai, Benxiang $u School of Electronic Science and Engineering, Nanjing University, Nanjing 210093, China; xyf@smail.nju.edu.cn (Y.X.); xze@smail.nju.edu.cn (Z.X.); zbx@smail.nju.edu.cn (B.Z.) |
| 700 | 1 | | $a Du, Sidan $u School of Electronic Science and Engineering, Nanjing University, Nanjing 210093, China; xyf@smail.nju.edu.cn (Y.X.); xze@smail.nju.edu.cn (Z.X.); zbx@smail.nju.edu.cn (B.Z.) |
| 773 | 0 | | $t Applied Sciences $g vol. 14, no. 5 (2024), p. 1894 |
| 786 | 0 | | $d ProQuest $t Publicly Available Content Database |
| 856 | 4 | 1 | $3 Citation/Abstract $u https://www.proquest.com/docview/2955469495/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | $3 Full Text + Graphics $u https://www.proquest.com/docview/2955469495/fulltextwithgraphics/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | $3 Full Text - PDF $u https://www.proquest.com/docview/2955469495/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
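
The 520 abstract above outlines VTG-GPT's tuning-free pipeline: an LLM (Baichuan2) rewrites the query to remove bias, an MLLM (MiniGPT-v2) captions the video frames, and a proposal generator with post-processing turns query-caption matches into temporal segments. As a rough illustration of the proposal-generation step only, here is a minimal Python sketch; the `token_overlap` scorer, the `threshold` and `max_gap` parameters, and the gap-bridging rule are hypothetical stand-ins for details the record does not specify, not the paper's actual method.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Proposal:
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds
    score: float  # mean query-caption similarity over matched frames

def token_overlap(query: str, caption: str) -> float:
    # Hypothetical scorer: Jaccard overlap of lowercased token sets.
    # The paper's actual query-caption scoring is not described here.
    q, c = set(query.lower().split()), set(caption.lower().split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0

def generate_proposals(query: str, captions: List[str], fps: float = 1.0,
                       threshold: float = 0.2, max_gap: int = 1) -> List[Proposal]:
    # Score each frame caption against the (already debiased) query,
    # group consecutive above-threshold frames into a segment, and
    # bridge short dips of up to `max_gap` frames -- a simple stand-in
    # for the paper's proposal generator plus post-processing.
    scores = [token_overlap(query, cap) for cap in captions]
    proposals: List[Proposal] = []
    start, gap, run = None, 0, []
    for i, s in enumerate(scores):
        if s >= threshold:
            if start is None:
                start = i
            run.append(s)
            gap = 0
        elif start is not None:
            gap += 1
            if gap > max_gap:  # the run ended `gap` frames ago
                end = i - gap
                proposals.append(Proposal(start / fps, (end + 1) / fps,
                                          sum(run) / len(run)))
                start, gap, run = None, 0, []
    if start is not None:  # close a run that reaches the last frame
        end = len(scores) - 1 - gap
        proposals.append(Proposal(start / fps, (end + 1) / fps,
                                  sum(run) / len(run)))
    return sorted(proposals, key=lambda p: p.score, reverse=True)

captions = ["a man opens the door", "the man pours coffee",
            "the man pours coffee into a mug", "a dog sleeps on the couch"]
print(generate_proposals("man pours coffee", captions))
# -> one proposal spanning seconds 1.0-3.0 (the two matching frames)
```

In the actual system, the similarity would compare MiniGPT-v2 captions against Baichuan2-debiased queries rather than raw token overlap; this sketch shows only the control flow of turning per-frame scores into ranked temporal segments.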