Advancing Software Development and Evolution Through Multimodal Learning

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2025)
Main Author: Yan, Yanfu
Published: ProQuest Dissertations & Theses
Subjects: Computer science; Computer engineering; Artificial intelligence
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3244791941
003 UK-CbPIL
020 |a 9798291579282 
035 |a 3244791941 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Yan, Yanfu 
245 1 |a Advancing Software Development and Evolution Through Multimodal Learning 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Software systems are integral to modern society, powering everything from mobile apps to enterprise platforms and critical infrastructure. As these systems grow in scale and complexity, ensuring their quality, reliability, and maintainability poses significant challenges. Key tasks, such as bug triaging, code search, and change impact analysis, require a deep understanding of diverse software artifacts, including source code, documentation, graphical user interfaces (GUIs), and structural dependencies. Traditional learning-based methods, often limited to single modalities, fall short of capturing the rich, interconnected context needed for effective analysis.

This dissertation explores multimodal learning as a principled solution to these challenges. By combining diverse modalities, such as code, natural language, visual data, and program structure, this research advances the automation, accuracy, and robustness of software maintenance tasks. It leverages state-of-the-art deep learning models, including vision transformers, graph neural networks, and code language models, to construct rich representations of complex software systems.

One line of work focuses on detecting duplicate video-based bug reports in GUI-centric mobile applications. It combines vision transformer-based scene understanding with sequential frame alignment to capture fine-grained visual, textual, and sequential patterns. Evaluated on an extended real-world benchmark, it improves detection accuracy by 9% over prior work, with added interpretability through hierarchical GUI representations. Another effort addresses change impact analysis without relying on historical or dynamic data. It fuses conceptual coupling, extracted via deep code embeddings, with structural dependencies from program dependence graphs, enabling more accurate and fine-grained predictions. A new benchmark built on untangled fine-grained commits demonstrates that the proposed approach outperforms state-of-the-art baselines by over 10%. Further, this dissertation investigates the reliability of pre-trained code models in the presence of out-of-distribution (OOD) inputs. As these models are deployed in open-world environments, unexpected inputs can lead to performance degradation. To mitigate this, the proposed multimodal OOD detection frameworks, COOD and COOD+, incorporate contrastive learning and rejection mechanisms across both code and comment modalities. These models effectively identify OOD inputs and recover downstream task performance, such as code search, under OOD conditions.

Together, these contributions show how multimodal learning can overcome the limitations of traditional software engineering (SE) tools by capturing the full spectrum of software artifacts: code, text, visuals, and structural relations. The proposed models, systems, and benchmarks provide a foundation for more scalable, trustworthy, and context-aware intelligent software development tools. 
653 |a Computer science 
653 |a Computer engineering 
653 |a Artificial intelligence 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3244791941/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3244791941/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
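
The abstract above outlines three technical threads. The short Python sketches below illustrate the general shape of each idea as described in the abstract only; they are not the dissertation's actual models, and every function name, threshold, and data shape in them is hypothetical.

The first thread detects duplicate video-based bug reports by combining vision-transformer scene understanding with sequential frame alignment. A minimal sketch of that idea, assuming per-frame embeddings (e.g., from a vision transformer over sampled GUI screenshots) are already available, and using plain dynamic time warping as a stand-in for the alignment step:

```python
# Illustrative sketch: compare two video-based bug reports by aligning
# sequences of visual frame embeddings. All names and thresholds are
# hypothetical; the dissertation's actual pipeline differs.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def dtw_cost(seq_a: np.ndarray, seq_b: np.ndarray) -> float:
    """Dynamic time warping over per-frame dissimilarities (1 - cosine)."""
    n, m = len(seq_a), len(seq_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 1.0 - cosine(seq_a[i - 1], seq_b[j - 1])
            acc[i, j] = step + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m] / (n + m)  # length-normalized alignment cost

def likely_duplicate(video_a: np.ndarray, video_b: np.ndarray,
                     threshold: float = 0.25) -> bool:
    # video_* are (num_frames, dim) arrays of frame embeddings.
    return dtw_cost(video_a, video_b) < threshold

# Toy usage with random stand-in "embeddings":
rng = np.random.default_rng(0)
a = rng.normal(size=(12, 64))
b = a + 0.05 * rng.normal(size=(12, 64))  # near-duplicate recording
print(likely_duplicate(a, b), likely_duplicate(a, rng.normal(size=(10, 64))))
```

Normalizing the alignment cost by sequence length keeps the duplicate threshold comparable across recordings of different durations.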
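The second thread fuses conceptual coupling (similarity of deep code embeddings) with structural dependencies from program dependence graphs for change impact analysis. A minimal sketch, assuming precomputed entity embeddings and a dependence adjacency matrix; the fixed linear weight `alpha` is an illustrative fusion, not the dissertation's method:

```python
# Illustrative fusion of conceptual coupling (embedding similarity)
# with structural dependence edges; weighting scheme is hypothetical.
import numpy as np

def impact_ranking(changed: int, embeddings: np.ndarray,
                   adjacency: np.ndarray, alpha: float = 0.5) -> list[int]:
    """Rank candidate entities by estimated impact of a change to `changed`.

    embeddings: (n, d) code embeddings (conceptual coupling signal)
    adjacency:  (n, n) 0/1 program-dependence edges (structural signal)
    """
    e = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-8)
    conceptual = e @ e[changed]                    # cosine similarity to changed entity
    structural = adjacency[changed].astype(float)  # direct dependence edges
    score = alpha * conceptual + (1 - alpha) * structural
    return [int(i) for i in np.argsort(-score) if i != changed]

# Toy usage: 4 entities, entity 2 structurally depends on entity 0.
rng = np.random.default_rng(1)
emb = rng.normal(size=(4, 32))
adj = np.zeros((4, 4))
adj[0, 2] = 1
print(impact_ranking(changed=0, embeddings=emb, adjacency=adj))
```

In practice a learned fusion, for instance a graph neural network over the dependence graph, would replace the fixed linear combination.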
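The third thread, the COOD/COOD+ frameworks, pairs contrastive learning with a rejection mechanism across code and comment modalities. A minimal sketch of the rejection idea, assuming encoders contrastively trained so that in-distribution code/comment pairs align in a joint space: weak alignment flags a likely OOD input, which a downstream task such as code search can then skip. The scoring rule and `reject_at` threshold are placeholders:

```python
# Illustrative rejection mechanism for a multimodal OOD detector:
# poorly aligned code/comment pairs are treated as out-of-distribution.
# Encoders, scoring rule, and threshold are all placeholders.
import numpy as np

def ood_score(code_emb: np.ndarray, comment_emb: np.ndarray) -> float:
    c = code_emb / (np.linalg.norm(code_emb) + 1e-8)
    t = comment_emb / (np.linalg.norm(comment_emb) + 1e-8)
    return 1.0 - float(c @ t)  # high score = weak alignment = likely OOD

def search_with_rejection(query_emb, code_embs, comment_embs, reject_at=0.6):
    """Rank code candidates for a query, skipping rejected OOD entries."""
    keep = [i for i in range(len(code_embs))
            if ood_score(code_embs[i], comment_embs[i]) < reject_at]
    sims = [float(query_emb @ code_embs[i]) /
            (np.linalg.norm(query_emb) * np.linalg.norm(code_embs[i]) + 1e-8)
            for i in keep]
    return [keep[j] for j in np.argsort(sims)[::-1]]

# Toy usage: pair 4 is misaligned, simulating an OOD input to be rejected.
rng = np.random.default_rng(2)
codes = rng.normal(size=(5, 16))
comments = codes + 0.1 * rng.normal(size=(5, 16))  # aligned, in-distribution
comments[4] = rng.normal(size=16)                  # misaligned, simulated OOD
print(search_with_rejection(codes[1], codes, comments))
```

Filtering candidates before ranking is one plausible way such a detector could recover code-search performance under OOD conditions, as the abstract describes.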