Static Code Analysis: On Detection of Security Vulnerabilities and Classification of Warning Messages

Bibliographic Details
Published in: ProQuest Dissertations and Theses (2015)
Main Author: Perhinschi, Andrei M.
Published: ProQuest Dissertations & Theses
Subjects: Computer science
Online Access: Citation/Abstract
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 1682044860
003 UK-CbPIL
020 |a 978-1-321-73041-8 
035 |a 1682044860 
045 0 |b d20150101 
084 |a 66569  |2 nlm 
100 1 |a Perhinschi, Andrei M. 
245 1 |a Static Code Analysis: On Detection of Security Vulnerabilities and Classification of Warning Messages 
260 |b ProQuest Dissertations & Theses  |c 2015 
513 |a Dissertation/Thesis 
520 3 |a This thesis addresses several aspects of using static code analysis tools for detection of security vulnerabilities and faults within source code. First, the performance of three widely used static code analysis tools with respect to detection of security vulnerabilities is evaluated. This is done with the help of a large benchmarking suite designed to test static code analysis tools' performance regarding security vulnerabilities. The performance of the three tools is also evaluated using three open source software projects with known security vulnerabilities. The main results of the first part of this thesis showed that the three evaluated tools do not have significantly different performance in detecting security vulnerabilities. 27% of C/C++ vulnerabilities along with 11% of Java vulnerabilities were not detected by any of the three tools. Furthermore, overall recall values for all three tools were close to or below 50%, indicating performance comparable to or worse than random guessing. These results were corroborated by the tools' performance on the three real software projects. The second part of this thesis is focused on machine-learning-based classification of messages extracted from static code analysis reports. This work is based on data from five real NASA software projects. A classifier is trained on increasing percentages of labeled data in order to emulate an ongoing analysis effort for each of the five datasets. Results showed that classification performance is highly dependent on the distribution of true and false positives among source code files. One of the five datasets yielded good predictive classification regarding true positives. One more dataset led to acceptable performance, while the remaining three datasets failed to yield good results.
Investigating the distribution of true and false positives revealed that messages were classified successfully when files or checkers contained either only real faults or only false alarms. High percentages of false positive singletons (files or checkers that produced 0 true positives and 1 false positive) were found to negatively affect the classifier's performance.
653 |a Computer science 
773 0 |t ProQuest Dissertations and Theses  |g (2015) 
786 0 |d ProQuest  |t ProQuest Dissertations & Theses Global 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/1682044860/abstract/embedded/L8HZQI7Z43R0LA5T?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/1682044860/fulltextPDF/embedded/L8HZQI7Z43R0LA5T?source=fedsrch