Preserving Privacy in Software Composition Analysis: A Study of Technical Solutions and Enhancements

Bibliographic Details
Published in: arXiv.org (Dec 1, 2024), p. n/a
Main Author: Wang, Huaijin
Other Authors: Liu, Zhibo; Dai, Yanbo; Wang, Shuai; Tang, Qiyi; Nie, Sen; Wu, Shi
Published:
Cornell University Library, arXiv.org
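Subjects: Demand analysis; Trade secrets; Similarity; Algorithms; Source code; Industrial development; Privacy; Open source software; Composition; User requirements; Landscape preservation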
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 3138990945
003 UK-CbPIL
022 |a 2331-8422 
035 |a 3138990945 
045 0 |b d20241201 
100 1 |a Wang, Huaijin 
245 1 |a Preserving Privacy in Software Composition Analysis: A Study of Technical Solutions and Enhancements 
260 |b Cornell University Library, arXiv.org  |c Dec 1, 2024 
513 |a Working Paper 
520 3 |a Software composition analysis (SCA) denotes the process of identifying open-source software components in an input software application. SCA has been extensively developed and adopted by academia and industry. However, we notice that the modern SCA techniques in industry scenarios still need to be improved due to privacy concerns. Overall, SCA requires the users to upload their applications' source code to a remote SCA server, which then inspects the applications and reports the component usage to users. This process is privacy-sensitive since the applications may contain sensitive information, such as proprietary source code, algorithms, trade secrets, and user data. Privacy concerns have prevented the SCA technology from being used in real-world scenarios. Therefore, academia and the industry demand privacy-preserving SCA solutions. For the first time, we analyze the privacy requirements of SCA and provide a landscape depicting possible technical solutions with varying privacy gains and overheads. In particular, given that de facto SCA frameworks are primarily driven by code similarity-based techniques, we explore combining several privacy-preserving protocols to encapsulate the similarity-based SCA framework. Among all viable solutions, we find that multi-party computation (MPC) offers the strongest privacy guarantee and plausible accuracy; it, however, incurs high overhead (184 times). We optimize the MPC-based SCA framework by reducing the amount of crypto protocol transactions using program analysis techniques. The evaluation results show that our proposed optimizations can reduce the MPC-based SCA overhead to only 8.5% without sacrificing SCA's privacy guarantee or accuracy. 
653 |a Demand analysis 
653 |a Trade secrets 
653 |a Similarity 
653 |a Algorithms 
653 |a Source code 
653 |a Industrial development 
653 |a Privacy 
653 |a Open source software 
653 |a Composition 
653 |a User requirements 
653 |a Landscape preservation 
700 1 |a Liu, Zhibo 
700 1 |a Dai, Yanbo 
700 1 |a Wang, Shuai 
700 1 |a Tang, Qiyi 
700 1 |a Nie, Sen 
700 1 |a Wu, Shi 
773 0 |t arXiv.org  |g (Dec 1, 2024), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3138990945/abstract/embedded/ZKJTFFSVAI7CB62C?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2412.00898