CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement
| Published in: | arXiv.org (Dec 19, 2024), p. n/a |
|---|---|
| Main Author: | Leitian Tao |
| Other Authors: | Chen, Xiang; Yu, Tong; Tung Mai; Rossi, Ryan; Li, Yixuan; Mitra, Saayan |
| Published: | Cornell University Library, arXiv.org |
| Subjects: | Data analysis; Source code; Codes; Bridge failure; Large language models |
| Online Access: | Citation/Abstract; Full text outside of ProQuest |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3126804423 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2331-8422 | ||
| 035 | |a 3126804423 | ||
| 045 | 0 | |b d20241219 | |
| 100 | 1 | |a Leitian Tao | |
| 245 | 1 | |a CodeLutra: Boosting LLM Code Generation via Preference-Guided Refinement | |
| 260 | |b Cornell University Library, arXiv.org |c Dec 19, 2024 | ||
| 513 | |a Working Paper | ||
| 520 | 3 | |a Large Language Models (LLMs) have revolutionized code generation but require significant resources and often over-generalize, limiting their task-specific efficiency. Fine-tuning smaller, open-source LLMs provides a cost-effective alternative. However, standard supervised approaches rely only on correct examples, missing valuable insights from failures. We introduce CodeLutra, a framework that leverages both correct and incorrect code attempts. Instead of using only correct solutions, CodeLutra applies iterative preference-based refinement, comparing successful and failed outputs to better approximate desired results. This approach narrows the performance gap with state-of-the-art larger models without requiring massive datasets or auxiliary models. For instance, on a challenging data science coding task, using only 500 samples improved Llama-3-8B's accuracy from 28.2% to 48.6%, approaching GPT-4's level. By learning from both successes and mistakes, CodeLutra provides a scalable and efficient path to high-quality code generation, making smaller open-source models more competitive with leading closed-source alternatives. | |
| 653 | |a Data analysis | ||
| 653 | |a Source code | ||
| 653 | |a Codes | ||
| 653 | |a Bridge failure | ||
| 653 | |a Large language models | ||
| 700 | 1 | |a Chen, Xiang | |
| 700 | 1 | |a Yu, Tong | |
| 700 | 1 | |a Tung Mai | |
| 700 | 1 | |a Rossi, Ryan | |
| 700 | 1 | |a Li, Yixuan | |
| 700 | 1 | |a Mitra, Saayan | |
| 773 | 0 | |t arXiv.org |g (Dec 19, 2024), p. n/a | |
| 786 | 0 | |d ProQuest |t Engineering Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3126804423/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch |
| 856 | 4 | 0 | |3 Full text outside of ProQuest |u http://arxiv.org/abs/2411.05199 |
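
The abstract (field 520) describes learning from both correct and incorrect code attempts via iterative preference-based refinement. A minimal sketch of that core idea, under the assumption that correctness is judged by executing candidates against unit tests and that passing/failing attempts are paired in a DPO-style `prompt`/`chosen`/`rejected` format; all function names and the dataset layout here are illustrative, not the paper's actual API:

```python
# Hypothetical sketch: build preference pairs from passed vs. failed
# code attempts, as in preference-guided refinement. Names are assumptions.

def passes_tests(code: str, test: str) -> bool:
    """Run a candidate solution against a unit test; True if it executes cleanly."""
    env: dict = {}
    try:
        exec(code, env)   # define the candidate function
        exec(test, env)   # assertions raise AssertionError on failure
        return True
    except Exception:
        return False

def build_preference_pairs(prompt: str, candidates: list[str], test: str) -> list[dict]:
    """Pair every correct attempt (chosen) with every incorrect one (rejected)."""
    correct = [c for c in candidates if passes_tests(c, test)]
    incorrect = [c for c in candidates if not passes_tests(c, test)]
    return [{"prompt": prompt, "chosen": w, "rejected": l}
            for w in correct for l in incorrect]

# Example: two sampled attempts at an "absolute value" task.
pairs = build_preference_pairs(
    prompt="Write abs_val(x) returning |x|.",
    candidates=[
        "def abs_val(x):\n    return x if x >= 0 else -x",  # correct
        "def abs_val(x):\n    return x",                    # fails for negatives
    ],
    test="assert abs_val(-3) == 3 and abs_val(2) == 2",
)
# One pair results: the correct attempt is "chosen", the failing one "rejected".
```

Such pairs could then feed a standard preference-optimization step; the iteration loop (resampling from the refined model) is omitted here.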