Diffusion Models for Open-Vocabulary Segmentation

Bibliographic Details
Published in: arXiv.org (Sep 30, 2024), p. n/a
Main Author: Karazija, Laurynas
Other Authors: Laina, Iro; Vedaldi, Andrea; Rupprecht, Christian
Published:
Cornell University Library, arXiv.org
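Subjects: Feature extraction; Image segmentation; Ambiguity; Pascal (programming language); Benchmarks; Training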
Links: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 2826537230
003 UK-CbPIL
022 |a 2331-8422 
035 |a 2826537230 
045 0 |b d20240930 
100 1 |a Karazija, Laurynas 
245 1 |a Diffusion Models for Open-Vocabulary Segmentation 
260 |b Cornell University Library, arXiv.org  |c Sep 30, 2024 
513 |a Working Paper 
520 3 |a Open-vocabulary segmentation is the task of segmenting anything that can be named in an image. Recently, large-scale vision-language modelling has led to significant advances in open-vocabulary segmentation, but at the cost of gargantuan and increasing training and annotation efforts. Hence, we ask if it is possible to use existing foundation models to synthesise on-demand efficient segmentation algorithms for specific class sets, making them applicable in an open-vocabulary setting without the need to collect further data, annotations or perform training. To that end, we present OVDiff, a novel method that leverages generative text-to-image diffusion models for unsupervised open-vocabulary segmentation. OVDiff synthesises support image sets for arbitrary textual categories, creating for each a set of prototypes representative of both the category and its surrounding context (background). It relies solely on pre-trained components and outputs the synthesised segmenter directly, without training. Our approach shows strong performance on a range of benchmarks, obtaining a lead of more than 5% over prior work on PASCAL VOC. 
653 |a Feature extraction 
653 |a Image segmentation 
653 |a Ambiguity 
653 |a Pascal (programming language) 
653 |a Benchmarks 
653 |a Training 
700 1 |a Laina, Iro 
700 1 |a Vedaldi, Andrea 
700 1 |a Rupprecht, Christian 
773 0 |t arXiv.org  |g (Sep 30, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2826537230/abstract/embedded/Y2VX53961LHR7RE6?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2306.09316