The Many Hats of Pixels: Supporting Human Interaction and Hierarchical Understanding in Segmentation

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Myers-Dean, Josh
Published: ProQuest Dissertations & Theses
Subjects: Artificial intelligence; Computer engineering; Computer science; Information technology
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3244235408
003 UK-CbPIL
020 |a 9798291574058 
035 |a 3244235408 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Myers-Dean, Josh 
245 1 |a The Many Hats of Pixels: Supporting Human Interaction and Hierarchical Understanding in Segmentation 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Image segmentation, the task of delineating meaningful regions in visual data, has persisted as a central problem in computer vision. While recent advances using deep learning and transformer-based architectures have improved segmentation accuracy, current systems remain limited in their ability to adapt to diverse user interactions and represent the hierarchical, context-dependent nature of scenes. In practice, a single pixel may belong to an object, part, or subpart depending on task or user intent; yet most segmentation models operate at a fixed level of abstraction and rely on rigid input modalities. This dissertation introduces segmentation methods that are hierarchical, interaction-aware, and user-centric. Motivated by practical research experiences in data annotation, human-computer interaction (HCI), and creative tools, the work addresses two key threads: (1) enabling flexible, multimodal user interaction in hybrid human-machine partnerships, and (2) modeling hierarchical relationships in natural images. The first thread (i.e., supporting human-machine partnerships) of this dissertation presents a new dataset and model supporting varied input types (e.g., clicks, scribbles, shapes), enabling more intuitive interactions without requiring explicit user annotations. Additionally, a weakly-supervised fine-tuning framework for interactive segmentation is presented in this dissertation and improves segmentation consistency across user inputs, reducing cognitive load in creative workflows. The second thread (i.e., modeling hierarchical relationships) introduces the first hierarchical semantic segmentation dataset with annotations at object, part, and subpart levels. Building on this, this dissertation proposes the first model that leverages specialized tokens within a large language model to capture “is-part-of” relationships in a single inference pass. 
Together, these contributions aim to reframe segmentation as a collaborative, context-aware process that better aligns with human perception and real-world needs. 
653 |a Artificial intelligence 
653 |a Computer engineering 
653 |a Computer science 
653 |a Information technology 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3244235408/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3244235408/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch