PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones

Saved in:
Bibliographic Details
Published in: arXiv.org (Dec 5, 2019), p. n/a
Main Author: Chen, Gang
Other Authors: He, Shengyu, Meng, Haitao, Huang, Kai
Published:
Cornell University Library, arXiv.org
Subjects:
Online Access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 2323281108
003 UK-CbPIL
022 |a 2331-8422 
035 |a 2323281108 
045 0 |b d20191205 
100 1 |a Chen, Gang 
245 1 |a PhoneBit: Efficient GPU-Accelerated Binary Neural Network Inference Engine for Mobile Phones 
260 |b Cornell University Library, arXiv.org  |c Dec 5, 2019 
513 |a Working Paper 
520 3 |a In recent years, deep neural networks (DNNs) have achieved great success in computer vision and other fields. However, their high computational complexity, combined with the performance and power constraints of mobile devices, still makes DNNs challenging to deploy on such devices. Binary neural networks (BNNs) have been demonstrated as a promising solution, replacing most arithmetic operations with bit-wise operations. Currently, existing GPU-accelerated implementations of BNNs are tailored only to desktop platforms. Due to architectural differences, merely porting such implementations to mobile devices yields suboptimal performance or is impossible in some cases. In this paper, we propose PhoneBit, a GPU-accelerated BNN inference engine for Android-based mobile devices that fully exploits the computing power of BNNs on mobile GPUs. PhoneBit provides a set of operator-level optimizations, including a locality-friendly data layout, bit packing with vectorization, and layer integration for efficient binary convolution. We also provide a detailed implementation and parallelization optimization for PhoneBit to optimally utilize the memory bandwidth and computing power of mobile GPUs. We evaluate PhoneBit with binary versions of AlexNet, YOLOv2 Tiny and VGG16. Our experimental results show that PhoneBit achieves significant speedup and energy efficiency compared with state-of-the-art frameworks for mobile devices. 
653 |a Energy management 
653 |a Parallel processing 
653 |a Wireless networks 
653 |a Neural networks 
653 |a Smartphones 
653 |a Electronic devices 
653 |a Convolution 
653 |a Optimization 
653 |a Mobile computing 
653 |a Inference 
653 |a Power management 
653 |a Vector processing (computers) 
653 |a Computer vision 
700 1 |a He, Shengyu 
700 1 |a Meng, Haitao 
700 1 |a Huang, Kai 
773 0 |t arXiv.org  |g (Dec 5, 2019), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2323281108/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/1912.04050
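The abstract notes that BNNs replace most arithmetic operations with bit-wise operations. As a minimal sketch of that idea (the standard XOR-popcount trick, not PhoneBit's actual GPU kernel), a dot product of two {-1, +1} vectors packed into bit masks reduces to N - 2 * popcount(a XOR b):

```python
# Sketch: binary dot product via bit packing and popcount.
# Values in {-1, +1} are packed one per bit (+1 -> 1, -1 -> 0).

def pack_bits(values):
    """Pack a list of {-1, +1} values into an integer bit mask."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Matching bits contribute +1, differing bits -1, so the result
    is n - 2 * (number of differing bits).
    """
    return n - 2 * bin(a_bits ^ b_bits).count("1")

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
# Reference arithmetic: 1*1 + (-1)*1 + 1*(-1) + 1*1 = 0
print(binary_dot(pack_bits(a), pack_bits(b), len(a)))  # prints 0
```

On real hardware the packed words are 32- or 64-bit registers and popcount is a single instruction, which is the source of the speedups the abstract reports.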