An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications

Bibliographic Details
Published in: arXiv.org (Apr 17, 2024), p. n/a
Main Author: Alhanahnah, Mohannad
Other Authors: Hasan, Md Rashedul, Bagheri, Hamid
Published:
Cornell University Library, arXiv.org
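Subjects:
Large language models; Imperative programming; Repair; Software; Programming languages; Specification and description languages; Formal specifications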
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3040953706
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3040953706 
045 0 |b d20240417 
100 1 |a Alhanahnah, Mohannad 
245 1 |a An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications 
260 |b Cornell University Library, arXiv.org  |c Apr 17, 2024 
513 |a Working Paper 
520 3 |a Automatic Program Repair (APR) has garnered significant attention as a practical research domain focused on automatically fixing bugs in programs. While existing APR techniques primarily target imperative programming languages like C and Java, there is a growing need for effective solutions applicable to declarative software specification languages. This paper presents a systematic investigation into the capacity of Large Language Models (LLMs) for repairing declarative specifications in Alloy, a declarative formal language used for software specification. We propose a novel repair pipeline that integrates a dual-agent LLM framework, comprising a Repair Agent and a Prompt Agent. Through extensive empirical evaluation, we compare the effectiveness of LLM-based repair with state-of-the-art Alloy APR techniques on a comprehensive set of benchmarks. Our study reveals that LLMs, particularly GPT-4 variants, outperform existing techniques in terms of repair efficacy, albeit with a marginal increase in runtime and token usage. This research contributes to advancing the field of automatic repair for declarative specifications and highlights the promising potential of LLMs in this domain. 
653 |a Large language models 
653 |a Imperative programming 
653 |a Repair 
653 |a Software 
653 |a Programming languages 
653 |a Specification and description languages 
653 |a Formal specifications 
700 1 |a Hasan, Md Rashedul 
700 1 |a Bagheri, Hamid 
773 0 |t arXiv.org  |g (Apr 17, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3040953706/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2404.11050