PROGRMR: Grammar-Based Input Generation with Programmable Annotations

Bibliographic details
Published in: ProQuest Dissertations and Theses (2025)
Main author: Hu, Andrew Lee
Published: ProQuest Dissertations & Theses
Subjects:
Online access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3278434895
003 UK-CbPIL
020 |a 9798265451545 
035 |a 3278434895 
045 2 |b d20250101  |b d20251231 
084 |a 66569  |2 nlm 
100 1 |a Hu, Andrew Lee 
245 1 |a PROGRMR: Grammar-Based Input Generation with Programmable Annotations 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Fuzzing is an important technique for generating inputs, and grammar-based fuzzing is used for constraining input generation for specialized domains that use context-free grammars. However, additional semantic constraints that are context-sensitive cannot be easily handled by grammar-based fuzzers. For example, in the C grammar, there is no way to specify that all variables must be defined before they are used. Inspired by attribute grammars, we propose a lightweight DSL called PROGRMR that can extend the expressiveness of a grammar using programmable annotations. These annotations introduce concepts from imperative programming languages, such as program state, preconditions, and postconditions, that allow users to constrain generation based on context. This enables PROGRMR to constrain future expansions of the derivation tree using context from what has been expanded so far and provides developers with a familiar interface for writing semantic constraints. It can then be compiled into a custom input generator capable of producing well-formed and diverse inputs efficiently. We evaluated PROGRMR against the grammar-only generator Grammarinator and the SMT-based constrained input generator ISLa, and showed that PROGRMR is able to compactly express semantic constraints and achieves high throughput and diversity. Across five input domains of Scriptsize-C, CSV, MLIR, Restructured Text, and XML that contain semantic constraints beyond the expressibility of context-free grammars, PROGRMR requires an average of 22.2 annotations to encode all semantic constraints. It generates fully well-formed inputs for all domains, with high throughput and input diversity. Compared to ISLa, PROGRMR achieves significant improvements, with up to 4016.60× higher throughput on the CSV domain and 29.24× more diversity on the Scriptsize-C domain. 
653 |a Computer science 
653 |a Engineering 
653 |a Computer engineering 
773 0 |t ProQuest Dissertations and Theses  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3278434895/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3278434895/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch
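
Illustrative note on the abstract: the idea of constraining grammar expansion with program state, preconditions, and postconditions can be sketched in a few lines of Python. This is not PROGRMR's DSL or its generated code; the function names (`generate_program`, `declared_before_use`), the two-alternative toy grammar, and the C-like statement shapes are all assumptions made for illustration. The sketch only shows how a precondition can disable an alternative until the state permits it, and how a postcondition updates the state for later expansions.

```python
import random


def generate_program(num_statements, seed=0):
    """Generate C-like statements in which every variable is declared before use.

    Hypothetical toy generator: two grammar alternatives ("decl" and "use")
    share a piece of program state, the list of declared variable names.
    """
    rng = random.Random(seed)
    declared = []  # program state carried across expansions
    lines = []

    def can_use():
        # Precondition of the "use" alternative: something must be in scope.
        return len(declared) > 0

    def expand_decl():
        name = f"v{len(declared)}"
        lines.append(f"int {name} = {rng.randint(0, 9)};")
        # Postcondition of "decl": the new name is now in scope.
        declared.append(name)

    def expand_use():
        target = rng.choice(declared)
        source = rng.choice(declared)
        lines.append(f"{target} = {source} + 1;")

    for _ in range(num_statements):
        # Only alternatives whose precondition holds are eligible.
        alternatives = [expand_decl]
        if can_use():
            alternatives.append(expand_use)
        rng.choice(alternatives)()
    return lines


def declared_before_use(lines):
    """Check the semantic constraint on the generated statements."""
    seen = set()
    for line in lines:
        tokens = line.rstrip(";").split()
        if tokens[0] == "int":
            seen.add(tokens[1])
        elif tokens[0] not in seen or tokens[2] not in seen:
            return False
    return True


if __name__ == "__main__":
    program = generate_program(6, seed=1)
    print("\n".join(program))
    print("well-formed:", declared_before_use(program))
```

A plain context-free generator would pick between the two alternatives freely and could emit a use before any declaration; here the precondition removes that choice until the state allows it, so no generated output needs to be rejected afterwards.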