Exploring the Use of LLMs for SQL Equivalence Checking

Saved in:
Bibliographic details
Published in: arXiv.org (Dec 7, 2024), p. n/a
Author: Singh, Rajat
Other authors: Bedathur, Srikanta
Published:
Cornell University Library, arXiv.org
Subjects:
Online access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3142731798
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3142731798 
045 0 |b d20241207 
100 1 |a Singh, Rajat 
245 1 |a Exploring the Use of LLMs for SQL Equivalence Checking 
260 |b Cornell University Library, arXiv.org  |c Dec 7, 2024 
513 |a Working Paper 
520 3 |a Equivalence checking of two SQL queries is an intractable problem encountered in diverse contexts, ranging from grading student submissions in a DBMS course to debugging query rewriting rules in an optimizer, and many more. While a lot of progress has been made in recent years in developing practical solutions for this problem, the existing methods can handle only a small subset of SQL, even for bounded equivalence checking. They cannot support sophisticated SQL expressions one encounters in practice. At the same time, large language models (LLMs) -- such as GPT-4 -- have emerged as powerful generators of SQL from natural language specifications. This paper explores whether LLMs can also demonstrate the ability to reason with SQL queries and help advance SQL equivalence checking. Towards this, we conducted a detailed evaluation of several LLMs over collections with SQL pairs of varying levels of complexity. We explored the efficacy of different prompting techniques, the utility of synthetic examples and explanations, as well as logical plans generated by query parsers. Our main finding is that with well-designed prompting using an unoptimized SQL logical plan, LLMs can perform equivalence checking beyond the capabilities of current techniques, achieving nearly 100% accuracy for equivalent pairs and up to 70% for non-equivalent pairs of SQL queries. While LLMs lack the ability to generate formal proofs, their synthetic examples and human-readable explanations offer valuable insights to students (and instructors) in a classroom setting and to database administrators (DBAs) managing large database installations. Additionally, we show that with careful fine-tuning, we can close the performance gap between smaller (and more efficient) models and larger models such as GPT, thus paving the way for potential LLM integration in standalone data processing systems. 
653 |a Data processing 
653 |a Prompt engineering 
653 |a Large language models 
653 |a Queries 
653 |a Equivalence 
653 |a Query languages 
653 |a Human performance 
700 1 |a Bedathur, Srikanta 
773 0 |t arXiv.org  |g (Dec 7, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3142731798/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2412.05561