CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement

Saved in:
Bibliographic Details
Published in: arXiv.org (Dec 19, 2024), p. n/a
Main Author: Leitian Tao
Other Authors: Chen, Xiang; Yu, Tong; Tung Mai; Rossi, Ryan; Li, Yixuan; Mitra, Saayan
Published:
Cornell University Library, arXiv.org
Subjects:
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3126804423
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3126804423 
045 0 |b d20241219 
100 1 |a Leitian Tao 
245 1 |a CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement 
260 |b Cornell University Library, arXiv.org  |c Dec 19, 2024 
513 |a Working Paper 
520 3 |a Large Language Models (LLMs) have revolutionized code generation but require significant resources and often over-generalize, limiting their task-specific efficiency. Fine-tuning smaller, open-source LLMs provides a cost-effective alternative; however, standard supervised approaches rely only on correct examples, missing valuable insights from failures. We introduce CodeLutra, a framework that leverages both correct and incorrect code attempts. Instead of training solely on correct solutions, CodeLutra applies iterative preference-based refinement, comparing successful and failed outputs to better approximate the desired results. This approach narrows the performance gap with state-of-the-art larger models without requiring massive datasets or auxiliary models. For instance, on a challenging data science coding task, using only 500 samples improved Llama-3-8B's accuracy from 28.2% to 48.6%, approaching GPT-4's level. By learning from both successes and mistakes, CodeLutra provides a scalable and efficient path to high-quality code generation, making smaller open-source models more competitive with leading closed-source alternatives. 
653 |a Data analysis 
653 |a Source code 
653 |a Codes 
653 |a Bridge failure 
653 |a Large language models 
700 1 |a Chen, Xiang 
700 1 |a Yu, Tong 
700 1 |a Tung Mai 
700 1 |a Rossi, Ryan 
700 1 |a Li, Yixuan 
700 1 |a Mitra, Saayan 
773 0 |t arXiv.org  |g (Dec 19, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3126804423/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2411.05199