Neural Modeling of Reasoning About Program Behaviors

Bibliographic details
Published in: ProQuest Dissertations and Theses (2025)
Main author: Yadavally, Aashish
Published: ProQuest Dissertations & Theses
Online access: Citation/Abstract
Full Text - PDF
Description
Abstract: Programming languages, much like natural languages, exhibit a high degree of repetitiveness and regularity, often referred to as the naturalness of software. This characteristic, combined with the improved capabilities of neural language models (NLMs) to statistically learn from such patterns, has led to their widespread adoption in software engineering (SE) tasks ranging from code generation to automated bug detection and program repair. While these applications of automated software engineering offer a useful proxy for assessing the downstream performance of NLMs, their ability to reason about intrinsic program properties, such as structure, semantics, and execution behaviors, remains underexplored. This dissertation addresses this gap through the lens of program analysis, using the latter’s formalisms to probe the reasoning capabilities of NLMs over intrinsic program behaviors. In general, analyzing programs entails either examining all possible behaviors based on program semantics (i.e., static) or establishing precise execution behaviors by running the entire test suite (i.e., dynamic), each with trade-offs in generalizability and scalability. As an alternative, we introduce a new paradigm of predictive program analysis, which aims to learn to analyze program behaviors from similar analyses of open-source software repositories. This approximation helps extend such analyses to partial programs, enables a static estimation of runtime behaviors, and facilitates multilingual program analysis, all at scale.
Using dependence analysis as a representative setting, this dissertation investigates how NLMs can model program structure, semantics, and execution behaviors across three key dimensions: (i) the granularity of dependencies, ranging from inter-statement and variable-statement to inter-constraint dependencies; (ii) the nature of reasoning, spanning both static and dynamic program behaviors; and (iii) the reasoning modality, which involves reasoning in the latent space or through verbalized natural language explanations. Overall, these contributions show that predictive analysis can generalize, bridging the gap between static and dynamic analysis, while offering insights into how language models internalize reasoning about program behaviors.
ISBN: 9798265455963
Source: ProQuest Dissertations & Theses Global