Object Detection and Panoptic Segmentation Through Likelihood Optimizations

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2024)
Main Author: Fan, Angzhi
Published: ProQuest Dissertations & Theses
Subjects: Computer science; Computer engineering
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3063217630
003 UK-CbPIL
020 |a 9798382775173 
035 |a 3063217630 
045 2 |b d20240101  |b d20241231 
084 |a 66569  |2 nlm 
100 1 |a Fan, Angzhi 
245 1 |a Object Detection and Panoptic Segmentation Through Likelihood Optimizations 
260 |b ProQuest Dissertations & Theses  |c 2024 
513 |a Dissertation/Thesis 
520 3 |a This thesis focuses on two pivotal subjects within the domain of Computer Vision: object detection and panoptic segmentation. Fueled by deep neural networks, substantial advancements have been witnessed in these fields in recent years. Many efforts in object detection and panoptic segmentation rely on feed-forward approaches, lacking a probabilistic interpretation. In response to this, the present thesis puts forth three innovative algorithms: the Detection Selection Algorithm, the Detection Selection Algorithm with Mask, and the Maximizing the Posterior for Panoptic Segmentation Algorithm. The initial algorithm is tailored for object detection, while the latter two are specifically devised for panoptic segmentation. These three algorithms are rooted in three distinct probabilistic frameworks. Notwithstanding, they still depend on feed-forward models like Faster R-CNN and Mask R-CNN to generate raw object detections and instance segmentations. Given an image and a hypothesis regarding object configuration and latent codes, the probabilistic frameworks define their respective likelihoods. The primary objective of these algorithms is to identify a configuration hypothesis that maximizes these likelihoods. They employ greedy search procedures to mitigate computational complexity. These three algorithms differ in their approaches to maximizing likelihoods, with some maximizing a log joint probability and another maximizing a posterior probability. The computation of likelihoods necessitates auxiliary tools, including Deep Generative Models that capture the distribution of object appearances. In the case of these three algorithms, we employ the Variational Autoencoder, VAE with flow prior, and Generative Latent Flow, respectively. To conduct inference on the distribution of latent codes, Single Reconstruction Algorithms are designed.
Additionally, Whole Reconstruction Algorithms are introduced to amalgamate the probability model of individual objects into a comprehensive probability model for the entire image. They necessitate occlusion relationship reasoning methods to identify the visible components of objects. Experimental results demonstrate that our algorithms yield improvements in tasks such as object counting and enhancement of Panoptic Quality scores. This thesis aims to showcase the potency of probabilistic modeling in the world of contemporary machine learning. 
653 |a Computer science 
653 |a Computer engineering 
773 0 |t ProQuest Dissertations and Theses  |g (2024) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3063217630/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3063217630/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch