Abstract
Partitioning Deep Neural Networks (DNNs) and deploying them across edge nodes can help applications meet their performance objectives. However, the failure of a single node may cause cascading failures that adversely affect service delivery and prevent those objectives from being met. The impact of such failures needs to be minimised at runtime. This paper explores three techniques: repartitioning, early-exit and skip-connection. When an edge node fails, the repartitioning technique repartitions and redeploys the DNN so that the failed nodes are avoided. The early-exit technique allows a request to exit early, before it reaches the failed node. The skip-connection technique dynamically routes the request around the failed nodes. The paper leverages trade-offs in accuracy, end-to-end latency and downtime to select the best technique when an edge node fails, given user-defined objectives (accuracy, latency and downtime thresholds). To this end, the CONTINUER framework is developed. Its two key activities are estimating the accuracy and latency of each technique for distributed DNNs and selecting the best technique. Experiments on a lab-based testbed demonstrate that CONTINUER estimates accuracy and latency with average errors of no more than 0.28% and 13.06%, respectively, and selects a suitable technique with an overhead of no more than 16.82 milliseconds and an accuracy of up to 99.86%.
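The selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how CONTINUER chooses among the techniques, so everything here is an assumption made for illustration: the `Estimate` and `Objectives` types, the `select_technique` function, the tie-breaking rule (highest estimated accuracy, then lowest latency) and all numeric values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Estimate:
    """Hypothetical per-technique estimate (not the paper's data model)."""
    accuracy: float      # estimated accuracy, in percent
    latency_ms: float    # estimated end-to-end latency, in milliseconds
    downtime_ms: float   # estimated downtime, in milliseconds


@dataclass(frozen=True)
class Objectives:
    """User-defined thresholds named in the abstract."""
    min_accuracy: float
    max_latency_ms: float
    max_downtime_ms: float


def select_technique(estimates: Dict[str, Estimate],
                     objectives: Objectives) -> Optional[str]:
    """Return a technique that meets every threshold, or None if none does.

    Assumed tie-break: prefer higher accuracy, then lower latency.
    """
    feasible = {
        name: est
        for name, est in estimates.items()
        if est.accuracy >= objectives.min_accuracy
        and est.latency_ms <= objectives.max_latency_ms
        and est.downtime_ms <= objectives.max_downtime_ms
    }
    if not feasible:
        return None
    return max(feasible,
               key=lambda n: (feasible[n].accuracy, -feasible[n].latency_ms))


# Made-up estimates, for illustration only.
EXAMPLE_ESTIMATES = {
    "repartitioning": Estimate(accuracy=92.0, latency_ms=40.0, downtime_ms=900.0),
    "early-exit": Estimate(accuracy=85.0, latency_ms=20.0, downtime_ms=5.0),
    "skip-connection": Estimate(accuracy=89.0, latency_ms=30.0, downtime_ms=5.0),
}

if __name__ == "__main__":
    strict = Objectives(min_accuracy=88.0, max_latency_ms=50.0, max_downtime_ms=100.0)
    print(select_technique(EXAMPLE_ESTIMATES, strict))
```

With these invented numbers, a tight downtime threshold rules out repartitioning and a tight accuracy threshold rules out early-exit, which shows how the three objectives can pull the choice in different directions.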
Original language | English |
---|---|
Title of host publication | Proceedings of the 2022 IEEE International Conference on Edge Computing and Communications (IEEE EDGE) |
Editors | Claudio Agostino Ardagna, Hongyi Bian, Carl K. Chang, Rong N. Chang, Ernesto Damiani, Gabriele Elia, Qiang He, Vicenç Puig, Robert Ward, Fatos Xhafa, Jia Zhang |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 143-152 |
Number of pages | 10 |
ISBN (Electronic) | 9781665481403 |
ISBN (Print) | 9781665481410 |
Publication status | Published - 24 Aug 2022 |
Event | 2022 IEEE International Conference On Edge Computing & Communications - Barcelona, Spain. Duration: 11 Jul 2022 → 15 Jul 2022. https://conferences.computer.org/services/2022/ |
Publication series
Name | IEEE International Conference on Edge Computing (EDGE) |
---|---|
ISSN (Print) | 2767-990X |
ISSN (Electronic) | 2767-9918 |
Conference
Conference | 2022 IEEE International Conference On Edge Computing & Communications |
---|---|
Abbreviated title | IEEE EDGE 2022 |
Country/Territory | Spain |
City | Barcelona |
Period | 11/07/2022 → 15/07/2022 |
Fingerprint
Dive into the research topics of 'CONTINUER: maintaining distributed DNN services during edge failures'. Together they form a unique fingerprint.
Student theses
- Strategies for maintaining efficiency of edge services
  Abdul Majeed, A. (Author), Spence, I. (Supervisor) & Varghese, B. (Supervisor), Jul 2023
  Student thesis: Doctoral Thesis › Doctor of Philosophy