NUDIF: A Non-Uniform Deployment Framework for Distributed Inference in Heterogeneous Edge Clusters

Guardado en:
Detalles Bibliográficos
Publicado en:Future Internet vol. 17, no. 4 (2025), p. 168
Autor principal: Li, Peng
Otros Autores: Chen, Qing, Liu, Hao
Publicado:
MDPI AG
Materias:
Acceso en línea:Citation/Abstract
Full Text + Graphics
Full Text - PDF
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Resumen:Distributed inference in resource-constrained heterogeneous edge clusters is fundamentally limited by disparities in device capabilities and load imbalance issues. Existing methods predominantly focus on optimizing single-pipeline allocation schemes for partitioned sub-models. However, such approaches often lead to load imbalance and suboptimal resource utilization under concurrent batch processing scenarios. To address these challenges, we propose a non-uniform deployment inference framework (NUDIF), which achieves high-throughput distributed inference service by adapting to heterogeneous resources and balancing inter-stage processing capabilities. Formulated as a mixed-integer nonlinear programming (MINLP) problem, NUDIF is responsible for planning the number of instances for each sub-model and determining the specific devices for deploying these instances, while considering computational capacity, memory constraints, and communication latency. This optimization minimizes inter-stage processing discrepancies and maximizes resource utilization. Experimental evaluations demonstrate that NUDIF enhances system throughput by an average of 9.95% compared to traditional single-pipeline optimization methods under various scales of cluster device configurations.
ISSN:1999-5903
DOI:10.3390/fi17040168
Fuente:ABI/INFORM Global