Diffusion Models for Open-Vocabulary Segmentation
| Published in: | arXiv.org (Sep 30, 2024), p. n/a |
|---|---|
| Main Author: | Karazija, Laurynas |
| Other Authors: | Iro Laina; Vedaldi, Andrea; Rupprecht, Christian |
| Published: | Cornell University Library, arXiv.org |
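| Subjects: | Feature extraction; Image segmentation; Ambiguity; Pascal (programming language); Benchmarks; Training |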
| Links: | Citation/Abstract; Full text outside of ProQuest |
MARC
| LEADER | | | 00000nab a2200000uu 4500 |
|---|---|---|---|
| 001 | | | 2826537230 |
| 003 | | | UK-CbPIL |
| 022 | | | $a 2331-8422 |
| 035 | | | $a 2826537230 |
| 045 | 0 | | $b d20240930 |
| 100 | 1 | | $a Karazija, Laurynas |
| 245 | 1 | | $a Diffusion Models for Open-Vocabulary Segmentation |
| 260 | | | $b Cornell University Library, arXiv.org $c Sep 30, 2024 |
| 513 | | | $a Working Paper |
| 520 | 3 | | $a Open-vocabulary segmentation is the task of segmenting anything that can be named in an image. Recently, large-scale vision-language modelling has led to significant advances in open-vocabulary segmentation, but at the cost of gargantuan and increasing training and annotation efforts. Hence, we ask if it is possible to use existing foundation models to synthesise on-demand efficient segmentation algorithms for specific class sets, making them applicable in an open-vocabulary setting without the need to collect further data, annotations or perform training. To that end, we present OVDiff, a novel method that leverages generative text-to-image diffusion models for unsupervised open-vocabulary segmentation. OVDiff synthesises support image sets for arbitrary textual categories, creating for each a set of prototypes representative of both the category and its surrounding context (background). It relies solely on pre-trained components and outputs the synthesised segmenter directly, without training. Our approach shows strong performance on a range of benchmarks, obtaining a lead of more than 5% over prior work on PASCAL VOC. |
| 653 | | | $a Feature extraction |
| 653 | | | $a Image segmentation |
| 653 | | | $a Ambiguity |
| 653 | | | $a Pascal (programming language) |
| 653 | | | $a Benchmarks |
| 653 | | | $a Training |
| 700 | 1 | | $a Iro Laina |
| 700 | 1 | | $a Vedaldi, Andrea |
| 700 | 1 | | $a Rupprecht, Christian |
| 773 | 0 | | $t arXiv.org $g (Sep 30, 2024), p. n/a |
| 786 | 0 | | $d ProQuest $t Engineering Database |
| 856 | 4 | 1 | $3 Citation/Abstract $u https://www.proquest.com/docview/2826537230/abstract/embedded/Y2VX53961LHR7RE6?source=fedsrch |
| 856 | 4 | 0 | $3 Full text outside of ProQuest $u http://arxiv.org/abs/2306.09316 |