LLM-Assisted Static Analysis for Detecting Security Vulnerabilities

Bibliographic Details
Published in:	arXiv.org (Nov 11, 2024), p. n/a
Main Author:	Li, Ziyang
Other Authors:	Dutta, Saikat, Naik, Mayur
Published:	Cornell University Library, arXiv.org
Subjects:	Repositories; Program verification (computers); Large language models; Security; Specifications; Software; Reasoning; Human performance
Online Access:	Citation/Abstract; Full text outside of ProQuest

MARC


LEADER	00000nab a2200000uu 4500
001	3127999998
003	UK-CbPIL
022			|a 2331-8422
035			|a 3127999998
045	0		|b d20241111
100	1		|a Li, Ziyang
245	1		|a LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
260			|b Cornell University Library, arXiv.org |c Nov 11, 2024
513			|a Working Paper
520	3		|a Software is prone to security vulnerabilities. Program analysis tools to detect them have limited effectiveness in practice due to their reliance on human labeled specifications. Large language models (or LLMs) have shown impressive code generation capabilities but they cannot do complex reasoning over code to detect such vulnerabilities especially since this task requires whole-repository analysis. We propose IRIS, a neuro-symbolic approach that systematically combines LLMs with static analysis to perform whole-repository reasoning for security vulnerability detection. Specifically, IRIS leverages LLMs to infer taint specifications and perform contextual analysis, alleviating needs for human specifications and inspection. For evaluation, we curate a new dataset, CWE-Bench-Java, comprising 120 manually validated security vulnerabilities in real-world Java projects. A state-of-the-art static analysis tool CodeQL detects only 27 of these vulnerabilities whereas IRIS with GPT-4 detects 55 (+28) and improves upon CodeQL's average false discovery rate by 5% points. Furthermore, IRIS identifies 6 previously unknown vulnerabilities which cannot be found by existing tools.
653			|a Repositories
653			|a Program verification (computers)
653			|a Large language models
653			|a Security
653			|a Specifications
653			|a Software
653			|a Reasoning
653			|a Human performance
700	1		|a Dutta, Saikat
700	1		|a Naik, Mayur
773	0		|t arXiv.org |g (Nov 11, 2024), p. n/a
786	0		|d ProQuest |t Engineering Database
856	4	1	|3 Citation/Abstract |u https://www.proquest.com/docview/3127999998/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch
856	4	0	|3 Full text outside of ProQuest |u http://arxiv.org/abs/2405.17238
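
Illustrative note: the abstract in field 520 describes pairing LLM-inferred taint specifications with static analysis. The toy sketch below is not the IRIS implementation and does not use CodeQL. It only illustrates the general idea, assuming a hand-written data-flow graph and a hypothetical `infer_taint_specs` function whose fixed lookup table stands in for the LLM. All API names, labels, and edges are invented for the example.

```python
from collections import deque


def infer_taint_specs(api_names):
    """Hypothetical stand-in for the LLM step: label each API name.

    A fixed lookup table plays the role of the model here; the labels
    are assumptions made up for this example.
    """
    guessed = {
        "HttpRequest.getParameter": "source",
        "Runtime.exec": "sink",
    }
    return {name: guessed.get(name, "none") for name in api_names}


def find_tainted_paths(edges, specs):
    """Return every cycle-free data-flow path from a source to a sink."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    sources = [n for n, label in specs.items() if label == "source"]
    sinks = {n for n, label in specs.items() if label == "sink"}

    paths = []
    for start in sources:
        queue = deque([[start]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
                continue
            for nxt in graph.get(node, []):
                if nxt not in path:  # skip nodes already on this path
                    queue.append(path + [nxt])
    return paths


# Invented data-flow edges: (value flows from, value flows to).
edges = [
    ("HttpRequest.getParameter", "buildCommand"),
    ("buildCommand", "Runtime.exec"),
    ("Config.load", "Logger.info"),
]
nodes = sorted({n for edge in edges for n in edge})
specs = infer_taint_specs(nodes)
paths = find_tainted_paths(edges, specs)
```

Here `paths` holds the single flow from the request parameter to the command execution call, while the `Config.load` to `Logger.info` flow is ignored because neither endpoint is labeled as a source or a sink.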