The Many Hats of Pixels: Supporting Human Interaction and Hierarchical Understanding in Segmentation
| Published in: | ProQuest Dissertations and Theses (2025) |
|---|---|
| Main author: | Myers-Dean, Josh |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | Artificial intelligence; Computer engineering; Computer science; Information technology |
| Online access: | Citation/Abstract; Full Text - PDF |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3244235408 | ||
| 003 | UK-CbPIL | ||
| 020 | |a 9798291574058 | ||
| 035 | |a 3244235408 | ||
| 045 | 2 | |b d20250101 |b d20251231 | |
| 084 | |a 66569 |2 nlm | ||
| 100 | 1 | |a Myers-Dean, Josh | |
| 245 | 1 | |a The Many Hats of Pixels: Supporting Human Interaction and Hierarchical Understanding in Segmentation | |
| 260 | |b ProQuest Dissertations & Theses |c 2025 | ||
| 513 | |a Dissertation/Thesis | ||
| 520 | 3 | |a Image segmentation, the task of delineating meaningful regions in visual data, has persisted as a central problem in computer vision. While recent advances using deep learning and transformer-based architectures have improved segmentation accuracy, current systems remain limited in their ability to adapt to diverse user interactions and to represent the hierarchical, context-dependent nature of scenes. In practice, a single pixel may belong to an object, part, or subpart depending on the task or user intent, yet most segmentation models operate at a fixed level of abstraction and rely on rigid input modalities. This dissertation introduces segmentation methods that are hierarchical, interaction-aware, and user-centric. Motivated by practical research experiences in data annotation, human-computer interaction (HCI), and creative tools, the work addresses two key threads: (1) enabling flexible, multimodal user interaction in hybrid human-machine partnerships, and (2) modeling hierarchical relationships in natural images. The first thread (supporting human-machine partnerships) presents a new dataset and model supporting varied input types (e.g., clicks, scribbles, shapes), enabling more intuitive interactions without requiring explicit user annotations. It also presents a weakly-supervised fine-tuning framework for interactive segmentation that improves segmentation consistency across user inputs, reducing cognitive load in creative workflows. The second thread (modeling hierarchical relationships) introduces the first hierarchical semantic segmentation dataset with annotations at object, part, and subpart levels. Building on this dataset, the dissertation proposes the first model that leverages specialized tokens within a large language model to capture “is-part-of” relationships in a single inference pass. Together, these contributions aim to reframe segmentation as a collaborative, context-aware process that better aligns with human perception and real-world needs. | |
| 653 | |a Artificial intelligence | ||
| 653 | |a Computer engineering | ||
| 653 | |a Computer science | ||
| 653 | |a Information technology | ||
| 773 | 0 | |t ProQuest Dissertations and Theses |g (2025) | |
| 786 | 0 | |d ProQuest |t ProQuest Dissertations & Theses Global | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3244235408/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3244235408/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch |