Dynamic DNN Decomposition for Lossless Synergistic Inference

Published in: arXiv.org (Jan 15, 2021), p. n/a
Main author: Zhang, Beibei
Other authors: Tian, Xiang; Zhang, Hongxuan; Li, Te; Zhu, Shiqiang; Gu, Jianjun
Published: Cornell University Library, arXiv.org
Online access: Citation/Abstract
Full text outside of ProQuest

MARC

LEADER 00000nab a2200000uu 4500
001 2478673144
003 UK-CbPIL
022 |a 2331-8422 
035 |a 2478673144 
045 0 |b d20210115 
100 1 |a Zhang, Beibei 
245 1 |a Dynamic DNN Decomposition for Lossless Synergistic Inference 
260 |b Cornell University Library, arXiv.org  |c Jan 15, 2021 
513 |a Working Paper 
520 3 |a Deep neural networks (DNNs) sustain high performance in today's data processing applications, but DNN inference is resource-intensive and therefore difficult to run on a mobile device. An alternative is to offload inference to a cloud server; however, that approach requires heavy raw-data transmission between the mobile device and the cloud, which is unsuitable for mission-critical and privacy-sensitive applications such as autopilot. To solve this problem, recent work delivers DNN services using the edge computing paradigm. Existing approaches split a DNN into two parts and deploy the two partitions to computation nodes at two edge computing tiers. These methods, however, overlook the collaborative computation resources of the device, edge, and cloud. Moreover, previous algorithms require re-partitioning the whole DNN to adapt to changes in computation resources and network dynamics, and for resource-demanding convolutional layers, prior works give no parallel processing strategy at the edge that avoids loss of accuracy. To tackle these issues, we propose D3, a dynamic DNN decomposition system for synergistic inference without precision loss. The system introduces a heuristic horizontal partition algorithm that splits a DNN into three parts and can partially adjust the partitions at run time according to processing time and network conditions. At the edge, a vertical separation module splits feature maps into tiles that can be processed independently, in parallel, on different edge nodes. Extensive quantitative evaluation on five popular DNNs shows that D3 outperforms state-of-the-art counterparts by up to 3.4 times in end-to-end DNN inference time and reduces backbone-network communication overhead by up to 3.68 times. 
653 |a Parallel processing 
653 |a Data processing 
653 |a Servers 
653 |a Electronic devices 
653 |a Artificial neural networks 
653 |a Partitions 
653 |a Cloud computing 
653 |a Edge computing 
653 |a Nodes 
653 |a Inference 
653 |a Feature maps 
653 |a Algorithms 
653 |a Data transmission 
653 |a Vertical separation 
653 |a Automatic pilots 
653 |a Heuristic methods 
653 |a Computer networks 
653 |a Run time (computers) 
653 |a Decomposition 
700 1 |a Tian, Xiang 
700 1 |a Zhang, Hongxuan 
700 1 |a Li, Te 
700 1 |a Zhu, Shiqiang 
700 1 |a Gu, Jianjun 
773 0 |t arXiv.org  |g (Jan 15, 2021), p. n/a 
786 0 |d ProQuest  |t Engineering Database 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/2478673144/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch 
856 4 0 |3 Full text outside of ProQuest  |u http://arxiv.org/abs/2101.05952
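
Note: the 520 abstract above describes a vertical separation module that tiles feature maps so multiple edge nodes can process convolutional layers in parallel with no loss of accuracy. The following minimal sketch illustrates that lossless-tiling idea; it is not the authors' implementation, and the names conv2d_valid and tiled_conv2d are hypothetical. The key point is that each tile carries a halo of kh - 1 extra input rows, so stitching the per-tile outputs reproduces the whole-map convolution exactly.

# Hypothetical sketch of lossless feature-map tiling (not the D3 code).
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' 2D convolution (cross-correlation) of one channel."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def tiled_conv2d(x, k, n_tiles):
    """Split x into horizontal bands with a (kh - 1)-row halo, convolve
    each band independently (as separate edge nodes would), and stitch."""
    kh = k.shape[0]
    oh = x.shape[0] - kh + 1                       # total output rows
    bounds = np.linspace(0, oh, n_tiles + 1).astype(int)
    parts = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Each tile needs kh - 1 extra input rows beyond its output rows.
        band = x[lo:hi + kh - 1]
        parts.append(conv2d_valid(band, k))
    return np.vstack(parts)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((32, 32))
    kernel = rng.standard_normal((3, 3))
    whole = conv2d_valid(fmap, kernel)
    tiled = tiled_conv2d(fmap, kernel, n_tiles=4)
    assert np.allclose(whole, tiled)               # lossless: outputs match
    print("tiled output matches whole-map convolution:", tiled.shape)

Because every tile's halo supplies exactly the receptive-field rows its output needs, the assertion holds for any tile count up to the number of output rows; this is what makes the parallel edge-side strategy accuracy-preserving rather than approximate.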