PROGRMR: Grammar-Based Input Generation with Programmable Annotations

Bibliographic details
Published in: ProQuest Dissertations and Theses (2025)
Main author: Hu, Andrew Lee
Published: ProQuest Dissertations & Theses
Description
Abstract: Fuzzing is an important technique for generating inputs, and grammar-based fuzzing is used for constraining input generation for specialized domains that use context-free grammars. However, additional semantic constraints that are context-sensitive cannot be easily handled by grammar-based fuzzers. For example, in the C grammar, there is no way to specify that all variables must be defined before they are used.

Inspired by attribute grammars, we propose a lightweight DSL called PROGRMR that can extend the expressiveness of a grammar using programmable annotations. These annotations introduce concepts from imperative programming languages, such as program state, preconditions, and postconditions, that allow users to constrain generation based on context. This enables PROGRMR to constrain future expansions of the derivation tree using context from what has been expanded so far and provides developers with a familiar interface for writing semantic constraints. It can then be compiled into a custom input generator capable of producing well-formed and diverse inputs efficiently.

We evaluated PROGRMR against the grammar-only generator Grammarinator and the SMT-based constrained input generator ISLa, and showed that PROGRMR is able to compactly express semantic constraints and achieves high throughput and diversity. Across five input domains of Scriptsize-C, CSV, MLIR, reStructuredText, and XML that contain semantic constraints beyond the expressibility of context-free grammars, PROGRMR requires an average of 22.2 annotations to encode all semantic constraints.

It generates fully well-formed inputs for all domains, with high throughput and input diversity. Compared to ISLa, PROGRMR achieves significant improvements, with up to 4016.60× higher throughput on the CSV domain and 29.24× more diversity on the Scriptsize-C domain.
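
The idea of constraining expansions with program state, preconditions, and postconditions can be illustrated with a small Python sketch. This is not PROGRMR's annotation syntax or its generated code, neither of which is shown in this record; the toy two-rule language and the names `generate_program` and `is_well_formed` are assumptions made purely for illustration. It encodes the abstract's define-before-use example: a declaration updates the state as its postcondition, and an assignment may only be expanded when its precondition (at least one declared variable) holds.

```python
import random


def generate_program(num_statements, seed=0):
    """Generate a toy program in which every variable is declared before use.

    Hypothetical illustration only: each grammar rule is paired with a
    precondition over shared state, and expansions may update that state.
    """
    rng = random.Random(seed)
    declared = []  # program state shared across all expansions

    def expand_decl():
        # Postcondition: the fresh variable name is recorded in the state.
        name = f"v{len(declared)}"
        declared.append(name)
        return f"int {name};"

    def expand_assign():
        # Only chosen when the precondition below holds, so both picks are safe.
        target = rng.choice(declared)
        source = rng.choice(declared)
        return f"{target} = {source};"

    # Each rule is a (precondition, expansion) pair.
    rules = [
        (lambda: True, expand_decl),
        (lambda: len(declared) > 0, expand_assign),
    ]

    lines = []
    for _ in range(num_statements):
        # Context from earlier expansions decides which rules are enabled now.
        enabled = [expand for pre, expand in rules if pre()]
        lines.append(rng.choice(enabled)())
    return lines


def is_well_formed(lines):
    """Check the define-before-use constraint on a generated toy program."""
    seen = set()
    for line in lines:
        if line.startswith("int "):
            seen.add(line[len("int "):-1])
        else:
            target, source = line[:-1].split(" = ")
            if target not in seen or source not in seen:
                return False
    return True


if __name__ == "__main__":
    print("\n".join(generate_program(6, seed=1)))
```

Because the precondition is checked before a rule is selected, every output satisfies the constraint by construction and no generated input has to be discarded afterwards. A plain context-free generator over the same two rules could emit an assignment first, which `is_well_formed` would reject.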
ISBN:9798265451545
Source: ProQuest Dissertations & Theses Global