We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs

Saved in:
Bibliographic Details
Published in: arXiv.org (Sep 24, 2024), p. n/a
Main Author: Spracklen, Joseph
Other Authors: Wijewickrama, Raveen; A H M Nazmus Sakib; Maiti, Anindya; Viswanath, Bimal; Murtuza Jadliwala
Publication: Cornell University Library, arXiv.org
Subjects: Computer program integrity; Supply chains; Programming languages; Python; Large language models; Open source software; Software
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3069344054
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3069344054 
045 0 |b d20240924 
100 1 |a Spracklen, Joseph 
245 1 |a We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs 
260 |b Cornell University Library, arXiv.org  |c Sep 24, 2024 
513 |a Working Paper 
520 3 |a The reliance of popular programming languages such as Python and JavaScript on centralized package repositories and open-source software, combined with the emergence of code-generating Large Language Models (LLMs), has created a new type of threat to the software supply chain: package hallucinations. These hallucinations, which arise from fact-conflicting errors when generating code using LLMs, represent a novel form of package confusion attack that poses a critical threat to the integrity of the software supply chain. This paper conducts a rigorous and comprehensive evaluation of package hallucinations across different programming languages, settings, and parameters, exploring how a diverse set of models and configurations affect the likelihood of generating erroneous package recommendations and identifying the root causes of this phenomenon. Using 16 popular LLMs for code generation and two unique prompt datasets, we generate 576,000 code samples in two programming languages that we analyze for package hallucinations. Our findings reveal that the average percentage of hallucinated packages is at least 5.2% for commercial models and 21.7% for open-source models, including a staggering 205,474 unique examples of hallucinated package names, further underscoring the severity and pervasiveness of this threat. To overcome this problem, we implement several hallucination mitigation strategies and show that they are able to significantly reduce the number of package hallucinations while maintaining code quality. Our experiments and findings highlight package hallucinations as a persistent and systemic phenomenon while using state-of-the-art LLMs for code generation, and a significant challenge that deserves the research community's urgent attention. 
653 |a Computer program integrity 
653 |a Supply chains 
653 |a Programming languages 
653 |a Python 
653 |a Large language models 
653 |a Open source software 
653 |a Software 
700 1 |a Wijewickrama, Raveen 
700 1 |a A H M Nazmus Sakib 
700 1 |a Maiti, Anindya 
700 1 |a Viswanath, Bimal 
700 1 |a Murtuza Jadliwala 
773 0 |t arXiv.org  |g (Sep 24, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3069344054/abstract/embedded/H09TXR3UUZB2ISDL?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2406.10279
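
The threat described in the abstract can be made concrete with a small defensive check: before installing a package name that appears in LLM-generated code, verify that the name actually exists in the target registry. The sketch below is a minimal illustration, not the mitigation strategy evaluated in the paper; it queries PyPI's public JSON API (https://pypi.org/pypi/<name>/json), and the helper name package_exists_on_pypi and the second demo package name are hypothetical examples.

import urllib.error
import urllib.request


def package_exists_on_pypi(name: str) -> bool:
    """Return True if `name` is a registered project on PyPI."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200  # 200 -> the project exists
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # no such project: a likely hallucinated name
        raise  # other HTTP errors are inconclusive; surface them


if __name__ == "__main__":
    # "requests" is a real package; the second name is a made-up example.
    for pkg in ["requests", "totally-nonexistent-pkg-314159"]:
        verdict = "exists" if package_exists_on_pypi(pkg) else "NOT on PyPI"
        print(f"{pkg}: {verdict}")

Note that existence alone is a weak signal: an attacker can pre-register a commonly hallucinated name, which is exactly the package confusion attack the abstract describes, so a registry check complements rather than replaces careful review of unfamiliar dependencies.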