Advancing Software Development and Evolution Through Multimodal Learning
| Published in: | ProQuest Dissertations and Theses (2025) |
|---|---|
| Main Author: | |
| Published: | ProQuest Dissertations & Theses |
| Subjects: | |
| Online Access: | Citation/Abstract; Full Text - PDF |
Abstract:

Software systems are integral to modern society, powering everything from mobile apps to enterprise platforms and critical infrastructure. As these systems grow in scale and complexity, ensuring their quality, reliability, and maintainability poses significant challenges. Key tasks, such as bug triaging, code search, and change impact analysis, require a deep understanding of diverse software artifacts, including source code, documentation, graphical user interfaces (GUIs), and structural dependencies. Traditional learning-based methods, often limited to single modalities, fall short in capturing the rich, interconnected context needed for effective analysis.

This dissertation explores multimodal learning as a principled solution to these challenges. By combining diverse modalities, such as code, natural language, visual data, and program structure, this research advances the automation, accuracy, and robustness of software maintenance tasks. It leverages state-of-the-art deep learning models, including vision transformers, graph neural networks, and code language models, to construct rich representations of complex software systems.

One line of work focuses on detecting duplicate video-based bug reports in GUI-centric mobile applications. It combines vision transformer-based scene understanding with sequential frame alignment to capture fine-grained visual, textual, and sequential patterns. Evaluated on an extended real-world benchmark, it improves detection accuracy by 9% over prior work, with added interpretability through hierarchical GUI representations.

Another effort addresses change impact analysis without relying on historical or dynamic data. It fuses conceptual coupling, extracted via deep code embeddings, with structural dependencies from program dependence graphs, enabling more accurate and fine-grained predictions. A new benchmark built on untangled fine-grained commits demonstrates that the proposed approach outperforms state-of-the-art baselines by over 10%.

Further, this dissertation investigates the reliability of pre-trained code models in the presence of out-of-distribution (OOD) inputs. As these models are deployed in open-world environments, unexpected inputs can lead to performance degradation. To mitigate this, the proposed multimodal OOD detection frameworks, COOD and COOD+, incorporate contrastive learning and rejection mechanisms across both code and comment modalities. These models effectively identify OOD inputs and recover downstream task performance, such as in code search, under OOD conditions.

Together, these contributions show how multimodal learning can overcome the limitations of traditional software engineering tools by capturing the full spectrum of software artifacts: code, text, visuals, and structural relations. The proposed models, systems, and benchmarks provide a foundation for more scalable, trustworthy, and context-aware intelligent software development tools.
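The first contribution pairs vision-transformer scene understanding with sequential frame alignment to compare video bug reports. Below is a minimal sketch of that idea, not the dissertation's implementation: `embed_frames` is a random-projection stand-in for the actual ViT encoder, and the gap penalty and duplicate threshold are illustrative values, not reported parameters.

```python
# Sketch: duplicate video-based bug report detection via frame-embedding
# alignment. The ViT encoder is stubbed out; alignment is a plain
# Needleman-Wunsch-style dynamic program over cosine similarities.
import numpy as np

def embed_frames(frames: np.ndarray) -> np.ndarray:
    """Placeholder for a vision-transformer encoder: one embedding per
    video frame. `frames` is (T, H, W, C); returns (T, D) unit vectors."""
    flat = frames.reshape(len(frames), -1)
    rng = np.random.default_rng(0)                      # stand-in "weights"
    proj = rng.standard_normal((flat.shape[1], 64))
    emb = flat @ proj
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)

def alignment_score(a: np.ndarray, b: np.ndarray, gap: float = -0.2) -> float:
    """Global alignment over the (Ta, Tb) cosine-similarity matrix,
    normalized by the shorter sequence's length."""
    sim = a @ b.T
    Ta, Tb = sim.shape
    dp = np.zeros((Ta + 1, Tb + 1))
    dp[1:, 0] = np.arange(1, Ta + 1) * gap              # leading gaps
    dp[0, 1:] = np.arange(1, Tb + 1) * gap
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            dp[i, j] = max(dp[i - 1, j - 1] + sim[i - 1, j - 1],  # match
                           dp[i - 1, j] + gap,                    # skip in a
                           dp[i, j - 1] + gap)                    # skip in b
    return dp[Ta, Tb] / min(Ta, Tb)

def is_duplicate(video_a: np.ndarray, video_b: np.ndarray,
                 threshold: float = 0.6) -> bool:
    """Flag two recordings as duplicates if their aligned frame
    sequences agree strongly enough (threshold is hypothetical)."""
    return alignment_score(embed_frames(video_a), embed_frames(video_b)) >= threshold
```

With a real encoder in place of the stub, the same alignment scorer tolerates recordings that differ in length or pacing, since gaps absorb extra or missing frames.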
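The second contribution fuses conceptual coupling (similarity of deep code embeddings) with structural coupling (proximity in a program dependence graph). The sketch below shows one way such a fusion could rank an impact set; the mixing weight `alpha`, the BFS-based proximity, and all function names are assumptions for illustration, not the dissertation's exact formulation.

```python
# Sketch: impact-set ranking that mixes embedding similarity with
# dependence-graph proximity. The PDG is a plain adjacency mapping.
from collections import deque
import numpy as np

def structural_proximity(pdg: dict[str, set[str]], src: str, dst: str) -> float:
    """1 / (1 + shortest-path hops) over the dependence graph; 0 if unreachable."""
    if src == dst:
        return 1.0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, hops = frontier.popleft()
        for nxt in pdg.get(node, ()):
            if nxt in seen:
                continue
            if nxt == dst:
                return 1.0 / (2.0 + hops)               # hops+1 edges away
            seen.add(nxt)
            frontier.append((nxt, hops + 1))
    return 0.0

def rank_impact(changed: str, candidates: list[str],
                emb: dict[str, np.ndarray], pdg: dict[str, set[str]],
                alpha: float = 0.5) -> list[tuple[str, float]]:
    """Score each candidate entity by a weighted blend of conceptual
    and structural coupling to the changed entity, highest first."""
    c = emb[changed] / np.linalg.norm(emb[changed])
    scored = []
    for m in candidates:
        v = emb[m] / np.linalg.norm(emb[m])
        conceptual = float(c @ v)                       # deep-embedding coupling
        structural = structural_proximity(pdg, changed, m)
        scored.append((m, alpha * conceptual + (1 - alpha) * structural))
    return sorted(scored, key=lambda kv: kv[1], reverse=True)
```

Because neither signal needs execution traces or commit history, a blend like this matches the abstract's claim of working without historical or dynamic data.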
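The third contribution, COOD/COOD+, rejects out-of-distribution inputs using contrastively trained code and comment encoders. The following is a minimal sketch of such a rejection rule, assuming the embeddings come from encoders already trained contrastively; the percentile calibration and function names are illustrative choices, not the frameworks' published design.

```python
# Sketch: COOD-style rejection. Low agreement between a sample's code
# embedding and its comment embedding is treated as an OOD signal.
import numpy as np

def cross_modal_scores(code_emb: np.ndarray, comment_emb: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between paired (N, D) code and
    comment embeddings; high scores mean the modalities agree."""
    c = code_emb / np.linalg.norm(code_emb, axis=1, keepdims=True)
    t = comment_emb / np.linalg.norm(comment_emb, axis=1, keepdims=True)
    return np.sum(c * t, axis=1)

def calibrate_threshold(id_code: np.ndarray, id_comment: np.ndarray,
                        fpr: float = 0.05) -> float:
    """Pick the score below which ~fpr of held-out in-distribution
    pairs would be rejected (an assumed calibration strategy)."""
    return float(np.quantile(cross_modal_scores(id_code, id_comment), fpr))

def reject_ood(code_emb: np.ndarray, comment_emb: np.ndarray,
               threshold: float) -> np.ndarray:
    """Boolean mask: True where the pair should be rejected before a
    downstream task such as code search consumes it."""
    return cross_modal_scores(code_emb, comment_emb) < threshold
```

Filtering queries this way is one plausible reading of how rejection "recovers" downstream code-search performance: the task only ever sees inputs the model was trained to handle.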
| ISBN: | 9798291579282 |
|---|---|
| Source: | ProQuest Dissertations & Theses Global |