NUDIF: A Non-Uniform Deployment Framework for Distributed Inference in Heterogeneous Edge Clusters
| Published in: | Future Internet vol. 17, no. 4 (2025), p. 168 |
|---|---|
| Main author: | Li, Peng |
| Other authors: | Chen, Qing; Liu, Hao |
| Published: | MDPI AG |
| Subjects: | |
| Online access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3194606736 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 1999-5903 | ||
| 024 | 7 | |a 10.3390/fi17040168 |2 doi | |
| 035 | |a 3194606736 | ||
| 045 | 2 | |b d20250401 |b d20250430 | |
| 084 | |a 231464 |2 nlm | ||
| 100 | 1 | |a Li, Peng |u National Key Laboratory of Complex Aviation System Simulation, Chengdu 610036, China; qingchen_1@cetc.com.cn | |
| 245 | 1 | |a NUDIF: A Non-Uniform Deployment Framework for Distributed Inference in Heterogeneous Edge Clusters | |
| 260 | |b MDPI AG |c 2025 | ||
| 513 | |a Journal Article | ||
| 520 | 3 | |a Distributed inference in resource-constrained heterogeneous edge clusters is fundamentally limited by disparities in device capabilities and load imbalance issues. Existing methods predominantly focus on optimizing single-pipeline allocation schemes for partitioned sub-models. However, such approaches often lead to load imbalance and suboptimal resource utilization under concurrent batch processing scenarios. To address these challenges, we propose a non-uniform deployment inference framework (NUDIF), which achieves high-throughput distributed inference service by adapting to heterogeneous resources and balancing inter-stage processing capabilities. Formulated as a mixed-integer nonlinear programming (MINLP) problem, NUDIF is responsible for planning the number of instances for each sub-model and determining the specific devices for deploying these instances, while considering computational capacity, memory constraints, and communication latency. This optimization minimizes inter-stage processing discrepancies and maximizes resource utilization. Experimental evaluations demonstrate that NUDIF enhances system throughput by an average of 9.95% compared to traditional single-pipeline optimization methods under various scales of cluster device configurations. | |
| 653 | |a Collaboration | ||
| 653 | |a Dynamic programming | ||
| 653 | |a Edge computing | ||
| 653 | |a Communication | ||
| 653 | |a Bandwidths | ||
| 653 | |a Optimization | ||
| 653 | |a Neural networks | ||
| 653 | |a Inference | ||
| 653 | |a Adaptation | ||
| 653 | |a Unmanned aerial vehicles | ||
| 653 | |a Linear programming | ||
| 653 | |a Batch processing | ||
| 653 | |a Algorithms | ||
| 653 | |a Mixed integer | ||
| 653 | |a Clusters | ||
| 653 | |a Resource utilization | ||
| 653 | |a Energy consumption | ||
| 653 | |a Large language models | ||
| 653 | |a Nonlinear programming | ||
| 653 | |a Load balancing | ||
| 700 | 1 | |a Chen, Qing |u National Key Laboratory of Complex Aviation System Simulation, Chengdu 610036, China; qingchen_1@cetc.com.cn | |
| 700 | 1 | |a Liu, Hao |u School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications (BUPT), Beijing 100876, China; liuhao@bupt.edu.cn | |
| 773 | 0 | |t Future Internet |g vol. 17, no. 4 (2025), p. 168 | |
| 786 | 0 | |d ProQuest |t ABI/INFORM Global | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3194606736/abstract/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text + Graphics |u https://www.proquest.com/docview/3194606736/fulltextwithgraphics/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3194606736/fulltextPDF/embedded/7BTGNMKEMPT1V9Z2?source=fedsrch |