Distributed deep learning inference in fog networks
School of Science (Perustieteiden korkeakoulu)
Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Date
2020-08-18
Department
Major/Subject
Security and Cloud Computing
Mcode
SCI3084
Degree programme
Master’s Programme in Security and Cloud Computing (SECCLO)
Language
en
Pages
82
Series
Abstract
Today's smart devices are equipped with powerful integrated chips and heterogeneous built-in sensors, which allow them to execute heavy computation and produce large amounts of sensor data. For instance, modern smart cameras integrate artificial intelligence to detect objects in the scene and to adjust parameters such as contrast and color based on environmental conditions. The accuracy of object recognition and classification in intelligent applications has improved thanks to recent advances in artificial intelligence (AI) and machine learning (ML), particularly deep neural networks (DNNs). Although smart devices can carry out some AI/ML computation, they have limited battery power and computing resources. Therefore, DNN computation is generally offloaded to powerful computing nodes such as cloud servers. However, it is challenging to satisfy latency, reliability, and bandwidth constraints in cloud-based AI. In recent years, AI services and tasks have therefore been pushed closer to end users by taking advantage of the fog computing paradigm. Typically, trained DNN models are offloaded to fog devices for inference, which is accomplished by partitioning the DNN and distributing the computation across the fog network. This thesis addresses offloading DNN inference by dividing and distributing a pre-trained network onto heterogeneous embedded devices. Specifically, it implements the adaptive partitioning and offloading algorithm based on matching theory proposed in the article "Distributed inference acceleration with adaptive DNN partitioning and offloading". The implementation was evaluated in a fog testbed that included NVIDIA Jetson Nano devices. The results show that the adaptive solution outperforms the other schemes considered (Random and Greedy) with respect to computation time and communication latency.
Description
Supervisor
Di Francesco, Mario
Thesis advisor
Mohammed, Thaha
Keywords
DNN inference, task partitioning, task offloading, distributed algorithm, DNN framework and architectures