Shape-Constrained Food Image Generation and Mechanistic Insights Into GAN-Based Image-to-Image Translation

Bibliographic Details
Published in:	ProQuest Dissertations and Theses (2025)
Main Author:	Chen, Guangzong
Published:	ProQuest Dissertations & Theses
Subjects:	Computer science; Computer engineering; Artificial intelligence; Nutrition
Online Access:	Citation/Abstract; Full Text - PDF

Description
Abstract:	This dissertation presents research that advances computer vision and artificial intelligence for automatic dietary assessment. The work introduces methodologies in three areas: robust image classification, a generative model for image-to-image translation, and a foundational analysis of generative adversarial networks (GANs). First, the dissertation proposes a novel GAN-based architecture designed for shape-preserving image-to-image translation of food images. This architecture, inspired by recent advancements, ensures that generated food images not only appear visually realistic but also maintain the essential shape and structure of the original food items. By integrating a specialized shape preservation module, it enables the synthesis of diverse food images while retaining the original forms. It enriches training data sets and benefits downstream tasks such as food recognition and volume estimation in automated dietary assessment systems. Second, the dissertation provides a detailed analysis of the fundamental mechanisms by which a vanilla GAN can effectively perform image-to-image translation tasks when given appropriate loss functions. This investigation aligns with insights into the inherent relationship between GANs and autoencoders. It demonstrates how the adversarial training process compels the generator to learn mappings that preserve common structural and content features between the input and target domains. When properly configured with a sufficiently capable discriminator, the training process can work without complex additional penalty terms. This analysis highlights the powerful, often understated, role of core GAN components in facilitating image-to-image translation. Finally, a novel image classification algorithm is developed for real-world dietary assessment, addressing the critical challenge of accurately identifying food items in complex images. This work targets images captured in low- and middle-income countries (LMICs). Building on existing work, the algorithm uses a composite machine learning approach that combines deep neural networks (DNNs) with shallow learning networks (SLNs) via a probabilistic interface. This hybrid architecture handles variations in illumination, resolution, and food presentation, and it significantly reduces the manual burden of data review by filtering out non-food content. Collectively, this dissertation contributes to enhancing the accuracy, efficiency, and fundamental understanding of AI-driven solutions for automatic dietary assessment.
ISBN:	9798293808908
Source:	ProQuest Dissertations & Theses Global