Towards Real-Time Inference Offloading With Distributed Edge Computing: The Framework and Algorithms


Abstract:

By combining edge computing and parallel computing, distributed edge computing has emerged as a new paradigm for exploiting the booming IoT devices at the network edge. To accelerate computation at the edge, i.e., the inference tasks of DNN-driven applications, the parallelism of both computation and communication needs to be considered in distributed edge computing; accordingly, the problem of Minimum Latency joint Communication and Computation Scheduling (MLCCS) is proposed. However, existing works rely on the rigid assumptions that the communication time of each device is fixed and that the workload can be split into arbitrarily small pieces. To make the work more practical and general, this paper studies the MLCCS problem without these assumptions. First, the MLCCS problem is formulated under a general model and proved to be NP-hard. Second, a pyramid-based computing model is proposed to jointly exploit the parallelism of communication and computation; it achieves an approximation ratio of $1+\delta$, where $\delta$ depends on the devices' communication rates. An interesting property of this computing model is identified and proved: when all devices share the same communication rate, the optimal latency can be obtained under an arbitrary scheduling order. When the workload cannot be split arbitrarily, an approximation algorithm with a ratio of at most $2\cdot(1+\delta)$ is proposed. Additionally, several algorithms are proposed to handle dynamically changing network scenarios. Finally, theoretical analysis and simulation results verify that the proposed algorithms achieve low latency. Two testbed experiments are also conducted, showing that the proposed method outperforms existing methods and reduces the latency of inference tasks at the edge by up to 29.2%.
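To make the joint communication and computation scheduling concrete, the following is a minimal sketch of the kind of latency model the abstract alludes to. It assumes, as an illustration only, that the source transmits each device's share sequentially over a shared link while each device starts computing as soon as its share arrives, and it brute-forces the scheduling order for a handful of devices; all names, rates, and the brute-force search are assumptions for this sketch, not the paper's pyramid-based model or its approximation algorithms.

```python
# Illustrative latency model: sequential communication over a shared link,
# overlapped with per-device computation. Not the paper's exact formulation.
from itertools import permutations

def makespan(shares, comm_rate, comp_speed, order):
    """Completion time of one schedule.

    shares[i]     -- workload assigned to device i (e.g., bits or frames)
    comm_rate[i]  -- transmission rate to device i (workload units / s)
    comp_speed[i] -- processing speed of device i (workload units / s)
    order         -- sequence in which devices receive their data
    """
    t_sent = 0.0   # time at which the shared link becomes free
    finish = 0.0   # latest completion time seen so far
    for i in order:
        t_sent += shares[i] / comm_rate[i]                        # sequential communication
        finish = max(finish, t_sent + shares[i] / comp_speed[i])  # overlapped computation
    return finish

def best_order(shares, comm_rate, comp_speed):
    """Brute-force the scheduling order (only feasible for a handful of devices)."""
    devices = range(len(shares))
    return min(permutations(devices),
               key=lambda order: makespan(shares, comm_rate, comp_speed, order))

if __name__ == "__main__":
    shares = [30.0, 20.0, 50.0]      # hypothetical workload split
    comm_rate = [10.0, 5.0, 20.0]    # heterogeneous link rates
    comp_speed = [8.0, 12.0, 10.0]   # heterogeneous computing speeds
    order = best_order(shares, comm_rate, comp_speed)
    print("best order:", order, "latency:", makespan(shares, comm_rate, comp_speed, order))
```

Under this toy model, the makespan clearly depends on both how the workload is split and the order in which devices are served, which is what makes the joint scheduling problem nontrivial when communication rates are heterogeneous.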
Published in: IEEE Transactions on Mobile Computing ( Volume: 23, Issue: 7, July 2024)
Page(s): 7552 - 7571
Date of Publication: 28 November 2023


I. Introduction

Recently, by combining edge computing and parallel computing, Distributed Edge Computing (DEC) has emerged as a promising paradigm to exploit the computation resources of the booming IoT devices connected at the network edge [1], [2], [3], [4]. This novel computing paradigm originates from two facts. On one hand, as emerging applications require low-latency and heavyweight communication services, more and more tasks need to be computed at the network edge, close to the users [5], [6], [7]. On the other hand, it is enabled by the proliferation of smart IoT devices with computation resources at the network edge, which are connected wirelessly in a distributed manner and are not fully utilized. Meanwhile, many computation tasks and workloads can be divided and distributed to multiple edge devices for cooperative parallel computing. For instance, in applications such as graphics rendering [8] and video inference [9], [10], the graphics and video can be partitioned into several segments and then distributed to nearby idle devices to accelerate computation. Consider the example shown in Fig. 1, where a user wants to find out whether a person appears in a long video captured by a remote camera monitoring system. To execute such a computation-intensive task, it is quite beneficial to split the video into multiple segments and distribute them to nearby idle edge devices to compute in parallel, thereby reducing the latency.
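As a rough illustration of this workflow, the sketch below partitions a video's frame range into contiguous segments and dispatches them to nearby devices in parallel. The device names, the run_inference callable, and the thread-pool dispatch are hypothetical stand-ins for an actual offloading mechanism, not the framework proposed in this paper.

```python
# Illustrative parallel offloading of video segments to nearby edge devices.
from concurrent.futures import ThreadPoolExecutor

def split_video(num_frames, num_segments):
    """Partition frame indices [0, num_frames) into contiguous segments."""
    base, extra = divmod(num_frames, num_segments)
    segments, start = [], 0
    for i in range(num_segments):
        length = base + (1 if i < extra else 0)
        segments.append(range(start, start + length))
        start += length
    return segments

def offload(devices, num_frames, run_inference):
    """Send one segment to each idle device and collect results in parallel.

    run_inference(device, segment) is a hypothetical stand-in for the actual
    offloading call, e.g., an RPC that returns the frames containing the person.
    """
    segments = split_video(num_frames, len(devices))
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        futures = [pool.submit(run_inference, dev, seg)
                   for dev, seg in zip(devices, segments)]
        return [f.result() for f in futures]

if __name__ == "__main__":
    # Fake detector used only to make the sketch self-contained and runnable.
    fake_detect = lambda device, segment: [f for f in segment if f % 500 == 0]
    print(offload(["dev-a", "dev-b", "dev-c"], num_frames=3000,
                  run_inference=fake_detect))
```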

