CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement
| Published in: | arXiv.org (Dec 19, 2024), p. n/a |
|---|---|
| Main Author: | Leitian Tao |
| Other Authors: | Chen, Xiang; Yu, Tong; Tung Mai; Rossi, Ryan; Li, Yixuan; Mitra, Saayan |
| Published: | Cornell University Library, arXiv.org |
| Subjects: | Data analysis; Source code; Codes; Bridge failure; Large language models |
| Online Access: | Citation/Abstract; Full text outside of ProQuest |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3126804423 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2331-8422 | ||
| 035 | |a 3126804423 | ||
| 045 | 0 | |b d20241219 | |
| 100 | 1 | |a Leitian Tao | |
| 245 | 1 | |a CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement | |
| 260 | |b Cornell University Library, arXiv.org |c Dec 19, 2024 | ||
| 513 | |a Working Paper | ||
| 520 | 3 | |a Large Language Models (LLMs) have revolutionized code generation but require significant resources and often over-generalize, limiting their task-specific efficiency. Fine-tuning smaller, open-source LLMs provides a cost-effective alternative. However, standard supervised approaches rely only on correct examples, missing valuable insights from failures. We introduce CodeLutra, a framework that leverages both correct and incorrect code attempts. Instead of using only correct solutions, CodeLutra applies iterative preference-based refinement, comparing successful and failed outputs to better approximate desired results. This approach narrows the performance gap with state-of-the-art larger models without requiring massive datasets or auxiliary models. For instance, on a challenging data science coding task, using only 500 samples improved Llama-3-8B's accuracy from 28.2% to 48.6%, approaching GPT-4's level. By learning from both successes and mistakes, CodeLutra provides a scalable and efficient path to high-quality code generation, making smaller open-source models more competitive with leading closed-source alternatives. | |
| 653 | |a Data analysis | ||
| 653 | |a Source code | ||
| 653 | |a Codes | ||
| 653 | |a Bridge failure | ||
| 653 | |a Large language models | ||
| 700 | 1 | |a Chen, Xiang | |
| 700 | 1 | |a Yu, Tong | |
| 700 | 1 | |a Tung Mai | |
| 700 | 1 | |a Rossi, Ryan | |
| 700 | 1 | |a Li, Yixuan | |
| 700 | 1 | |a Mitra, Saayan | |
| 773 | 0 | |t arXiv.org |g (Dec 19, 2024), p. n/a | |
| 786 | 0 | |d ProQuest |t Engineering Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3126804423/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch |
| 856 | 4 | 0 | |3 Full text outside of ProQuest |u http://arxiv.org/abs/2411.05199 |
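
The abstract (field 520) describes learning from both correct and incorrect code attempts via iterative preference-based refinement. A minimal sketch of that core idea, under the assumption that correctness is judged by executing candidates against unit tests and that passing/failing attempts are paired in a DPO-style `prompt`/`chosen`/`rejected` format; all function names and the dataset layout here are illustrative, not the paper's actual API:

```python
# Hypothetical sketch: build preference pairs from passed vs. failed
# code attempts, as in preference-guided refinement. Names are assumptions.

def passes_tests(code: str, test: str) -> bool:
    """Run a candidate solution against a unit test; True if it executes cleanly."""
    env: dict = {}
    try:
        exec(code, env)   # define the candidate function
        exec(test, env)   # assertions raise AssertionError on failure
        return True
    except Exception:
        return False

def build_preference_pairs(prompt: str, candidates: list[str], test: str) -> list[dict]:
    """Pair every correct attempt (chosen) with every incorrect one (rejected)."""
    correct = [c for c in candidates if passes_tests(c, test)]
    incorrect = [c for c in candidates if not passes_tests(c, test)]
    return [{"prompt": prompt, "chosen": w, "rejected": l}
            for w in correct for l in incorrect]

# Example: two sampled attempts at an "absolute value" task.
pairs = build_preference_pairs(
    prompt="Write abs_val(x) returning |x|.",
    candidates=[
        "def abs_val(x):\n    return x if x >= 0 else -x",  # correct
        "def abs_val(x):\n    return x",                    # fails for negatives
    ],
    test="assert abs_val(-3) == 3 and abs_val(2) == 2",
)
# One pair results: the correct attempt is "chosen", the failing one "rejected".
```

Such pairs could then feed a standard preference-optimization step; the iteration loop (resampling from the refined model) is omitted here.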