AI-Assisted Formal Specification and Verification in Dafny: A Multi-Provider VS Code Extension

Bibliographic Details
Published in: PQDT - Global (2025)
Main author: Honorato, Vinicius Correia
Published: ProQuest Dissertations & Theses
Subjects:
Online access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3275477232
003 UK-CbPIL
020 |a 9798265422071 
035 |a 3275477232 
045 2 |b d20250101  |b d20251231 
084 |a 189128  |2 nlm 
100 1 |a Honorato, Vinicius Correia 
245 1 |a AI-Assisted Formal Specification and Verification in Dafny: A Multi-Provider VS Code Extension 
260 |b ProQuest Dissertations & Theses  |c 2025 
513 |a Dissertation/Thesis 
520 3 |a Formal software verification represents an essential approach for ensuring correctness and reliability of critical systems, yet faces significant barriers due to the complexity of manually specifying method contracts, loop invariants, and other verification constructs. This dissertation addresses this limitation through the development of an advanced Visual Studio Code extension that automates the generation of formal specification and verification constructs using multiple artificial intelligence providers. The work extends the original implementation of a VS Code extension for Dafny, transforming an initial prototype into a robust and scalable platform for formal verification assistance. The solution implements a multi-provider architecture that integrates artificial intelligence services including OpenAI, Claude, and DeepSeek through a unified interface, with intelligent fallback mechanisms ensuring continuous availability even during individual service disruptions. Firebase integration enables dynamic configuration management and secure credential storage, making it suitable for both academic and industrial environments. In an experiment with a dataset of 100 Dafny programs of varying complexity, a multi-provider ensemble combining Claude Sonnet 4, DeepSeek-V3 and GPT-4.1 achieved the best performance, generating correct specification and verification constructs (pre/postconditions, loop invariants, auxiliary predicates and functions) for 80% of the programs in 1 attempt and 85% of the programs in 3 attempts. Generated pre/postconditions are checked by the Dafny verifier against a set of test assertions. Generated loop invariants are checked against the methods' pre/postconditions. Experimental validation through a controlled crossover design study involving 7 users and a total of 42 formal specification and verification tasks demonstrates the effectiveness of the proposed approach.
Results reveal a System Usability Scale score of 74.6/100, indicating good user acceptance. Comparative analysis shows significantly superior success rates for AI-assisted approaches (85.7%) compared to manual methods (42.1%). Effort evaluation demonstrates 43.8% productivity gains, with average completion times of 6.25 minutes for tool-assisted approaches versus 11.12 minutes for manual methods. This work further demonstrates the viability of multi-provider architectures for AI-assisted development tools, and establishes replicable experimental methodologies for evaluating formal verification tools. The results suggest that intelligent integration of multiple AI providers can significantly improve the accessibility and effectiveness of formal software verification.
653 |a Software development 
653 |a Application programming interface 
653 |a Writing 
653 |a Artificial intelligence 
653 |a Quality control 
653 |a Large language models 
653 |a Software engineering 
653 |a Prime numbers 
653 |a Mathematics 
653 |a Engineering 
773 0 |t PQDT - Global  |g (2025) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3275477232/abstract/embedded/75I98GEZK8WCJMPQ?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3275477232/fulltextPDF/embedded/75I98GEZK8WCJMPQ?source=fedsrch