RetCompletion: High-Speed Inference Image Completion with Retentive Network
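| Published in: | arXiv.org (Dec 4, 2024), p. n/a |
|---|---|
| Main Author: | Cang, Yueyang |
| Other Authors: | Hu, Pingge; Zhang, Xiaoteng; Wang, Xingtong; Liu, Yuhang; Shi, Li |
| Published / Created: | Cornell University Library, arXiv.org |
| Subjects: | Image restoration; Data integration; Computer vision; Image resolution; Image quality; Image reconstruction; Image enhancement; Natural language processing; Inference |
| Online Access: | Citation/Abstract; Full text outside of ProQuest |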
MARC
| Tag | Ind1 | Ind2 | Content |
|---|---|---|---|
| LEADER | | | 00000nab a2200000uu 4500 |
| 001 | | | 3141232306 |
| 003 | | | UK-CbPIL |
| 022 | | | \|a 2331-8422 |
| 035 | | | \|a 3141232306 |
| 045 | 0 | | \|b d20241204 |
| 100 | 1 | | \|a Cang, Yueyang |
| 245 | 1 | | \|a RetCompletion:High-Speed Inference Image Completion with Retentive Network |
| 260 | | | \|b Cornell University Library, arXiv.org \|c Dec 4, 2024 |
| 513 | | | \|a Working Paper |
| 520 | 3 | | \|a Time cost is a major challenge in achieving high-quality pluralistic image completion. Recently, the Retentive Network (RetNet) in natural language processing offers a novel approach to this problem with its low-cost inference capabilities. Inspired by this, we apply RetNet to the pluralistic image completion task in computer vision. We present RetCompletion, a two-stage framework. In the first stage, we introduce Bi-RetNet, a bidirectional sequence information fusion model that integrates contextual information from images. During inference, we employ a unidirectional pixel-wise update strategy to restore consistent image structures, achieving both high reconstruction quality and fast inference speed. In the second stage, we use a CNN for low-resolution upsampling to enhance texture details. Experiments on ImageNet and CelebA-HQ demonstrate that our inference speed is 10\(\times\) faster than ICT and 15\(\times\) faster than RePaint. The proposed RetCompletion significantly improves inference speed and delivers strong performance. |
| 653 | | | \|a Image restoration |
| 653 | | | \|a Data integration |
| 653 | | | \|a Computer vision |
| 653 | | | \|a Image resolution |
| 653 | | | \|a Image quality |
| 653 | | | \|a Image reconstruction |
| 653 | | | \|a Image enhancement |
| 653 | | | \|a Natural language processing |
| 653 | | | \|a Inference |
| 700 | 1 | | \|a Hu, Pingge |
| 700 | 1 | | \|a Zhang, Xiaoteng |
| 700 | 1 | | \|a Wang, Xingtong |
| 700 | 1 | | \|a Liu, Yuhang |
| 700 | 1 | | \|a Shi, Li |
| 773 | 0 | | \|t arXiv.org \|g (Dec 4, 2024), p. n/a |
| 786 | 0 | | \|d ProQuest \|t Engineering Database |
| 856 | 4 | 1 | \|3 Citation/Abstract \|u https://www.proquest.com/docview/3141232306/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch |
| 856 | 4 | 0 | \|3 Full text outside of ProQuest \|u http://arxiv.org/abs/2410.04056 |