An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications
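| Published in: | arXiv.org (Apr 17, 2024), p. n/a |
|---|---|
| Main author: | Alhanahnah, Mohannad |
| Other authors: | Hasan, Md Rashedul; Bagheri, Hamid |
| Published: | Cornell University Library, arXiv.org |
| Subjects: | Large language models; Imperative programming; Repair; Software; Programming languages; Specification and description languages; Formal specifications |
| Online access: | Citation/Abstract; Full text outside of ProQuest |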
MARC
| LEADER | | | 00000nab a2200000uu 4500 |
|---|---|---|---|
| 001 | | | 3040953706 |
| 003 | | | UK-CbPIL |
| 022 | | | \|a 2331-8422 |
| 035 | | | \|a 3040953706 |
| 045 | 0 | | \|b d20240417 |
| 100 | 1 | | \|a Alhanahnah, Mohannad |
| 245 | 1 | | \|a An Empirical Evaluation of Pre-trained Large Language Models for Repairing Declarative Formal Specifications |
| 260 | | | \|b Cornell University Library, arXiv.org \|c Apr 17, 2024 |
| 513 | | | \|a Working Paper |
| 520 | 3 | | \|a Automatic Program Repair (APR) has garnered significant attention as a practical research domain focused on automatically fixing bugs in programs. While existing APR techniques primarily target imperative programming languages like C and Java, there is a growing need for effective solutions applicable to declarative software specification languages. This paper presents a systematic investigation into the capacity of Large Language Models (LLMs) for repairing declarative specifications in Alloy, a declarative formal language used for software specification. We propose a novel repair pipeline that integrates a dual-agent LLM framework, comprising a Repair Agent and a Prompt Agent. Through extensive empirical evaluation, we compare the effectiveness of LLM-based repair with state-of-the-art Alloy APR techniques on a comprehensive set of benchmarks. Our study reveals that LLMs, particularly GPT-4 variants, outperform existing techniques in terms of repair efficacy, albeit with a marginal increase in runtime and token usage. This research contributes to advancing the field of automatic repair for declarative specifications and highlights the promising potential of LLMs in this domain. |
| 653 | | | \|a Large language models |
| 653 | | | \|a Imperative programming |
| 653 | | | \|a Repair |
| 653 | | | \|a Software |
| 653 | | | \|a Programming languages |
| 653 | | | \|a Specification and description languages |
| 653 | | | \|a Formal specifications |
| 700 | 1 | | \|a Hasan, Md Rashedul |
| 700 | 1 | | \|a Bagheri, Hamid |
| 773 | 0 | | \|t arXiv.org \|g (Apr 17, 2024), p. n/a |
| 786 | 0 | | \|d ProQuest \|t Engineering Database |
| 856 | 4 | 1 | \|3 Citation/Abstract \|u https://www.proquest.com/docview/3040953706/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch |
| 856 | 4 | 0 | \|3 Full text outside of ProQuest \|u http://arxiv.org/abs/2404.11050 |