Deep neural networks achieve remarkable results on vision tasks. However, their complex structures and numerous parameters make them intensive in both computation and memory. Executing these models on resource-constrained, camera-equipped devices such as smartphones (edge devices) is challenging due to memory constraints and the prolonged runtime caused by limited computational resources. On the other hand, communicating data to cloud servers to execute the models raises privacy concerns, increases transmission costs, and extends runtime. This thesis provides hierarchical, edge-cloud solutions that address the privacy, runtime, and communication-cost issues with minimal pressure on edge devices. This goal is pursued through the introduction of novel edge processes, termed smart edge processing.

First, considering the growing demand for AI inference as a service, we address privacy concerns related to model inference on cloud servers. We propose a method that removes sensitive content from visual signals while preserving task-relevant information on the user's edge device that acquires the data. This is achieved by an adversarially trained optical convolution kernel that filters out the sensitive content before it reaches the sensor, making it irretrievable by privacy attacks. Moreover, this optical convolution imposes no additional computational or memory burden on the edge device. Our experiments show that this method reduces sensitive content by approximately 65% with negligible impact on the user's task performance.

Second, we consider that many users rely on cloud services to train their vision models. Training on cloud servers distant from the edge devices that acquire the data increases communication costs, training runtime, and privacy risks. To tackle this issue, we leverage the idea of early exiting to propose a novel hierarchical training method that divides the backward pass between an edge device and a cloud server.
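The division of the backward pass can be sketched with toy linear layers. This is a minimal illustration of the principle, not the thesis's architecture: the layer shapes, the split point, and the assumption that labels (but not inputs) are available at the cloud are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear layers; a real model would be a deep CNN. All names are illustrative.
W_edge = rng.normal(size=(4, 4)) * 0.1   # edge backbone
W_exit = rng.normal(size=(4, 2)) * 0.1   # early-exit head, kept on the edge
W_cloud = rng.normal(size=(4, 2)) * 0.1  # cloud tail

x = rng.normal(size=(8, 4))              # private inputs stay on the edge
y = np.eye(2)[rng.integers(0, 2, 8)]     # one-hot labels

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# --- Edge: forward to the split point, then its OWN backward pass ---
h = x @ W_edge                       # feature map transmitted to the cloud
p_exit = softmax(h @ W_exit)
g_exit = (p_exit - y) / len(x)       # cross-entropy gradient at the early exit
dW_exit = h.T @ g_exit
dW_edge = x.T @ (g_exit @ W_exit.T)  # edge backprop uses only the exit loss

# --- Cloud: treats the received feature map as a constant input ---
p_cloud = softmax(h @ W_cloud)       # h arrives detached: no gradient flows back
g_cloud = (p_cloud - y) / len(x)
dW_cloud = h.T @ g_cloud             # cloud updates only its own layers

lr = 0.1
W_edge -= lr * dW_edge
W_exit -= lr * dW_exit
W_cloud -= lr * dW_cloud
```

Because the edge backward pass depends only on the early-exit loss and the cloud backward pass depends only on the detached feature map, the two can run concurrently with no gradient communication between them.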
In contrast to existing methods, our approach leverages edge and cloud workers concurrently, avoids transmitting private input images to the cloud, and eliminates the need for communication during the backward pass. Extensive experiments show that this method reduces training runtime by 60%–80% with minimal accuracy loss.

Third, although the proposed hierarchical training method reduces privacy concerns by avoiding the transmission of raw image data to the cloud, it may still be vulnerable to attacks that infer sensitive content or reconstruct the original input from the communicated feature maps. We address this challenge by combining adversarial training for privacy with early exits: we implement adversarial early exits that remove sensitive content at the edge and transmit only task-relevant information to the cloud for training. Moreover, we incorporate noise addition during training to achieve a differential privacy guarantee. Our results show that this method effectively removes sensitive content in face datasets while causing less than 3% degradation in task accuracy.

In summary, the proposed smart processes at the edge make the hierarchical edge-cloud system a promising solution for executing deep neural networks, addressing the challenges of both the cloud and edge computing paradigms. The long-term prospect of this work is artificial intelligence models that can be easily deployed on users' devices while safely benefiting from cloud services.
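The noise-addition step can be illustrated with the standard Gaussian mechanism applied to clipped gradients. This is a generic sketch, not the thesis's implementation; the function name and the `clip_norm` and `noise_multiplier` hyperparameters are assumptions.

```python
import numpy as np

def dp_noisy_gradient(grad, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a gradient to a fixed L2 norm and add Gaussian noise.

    Hypothetical illustration of the Gaussian mechanism used to obtain a
    differential privacy guarantee; names and defaults are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(grad)
    # Scale down (never up) so the gradient's sensitivity is bounded by clip_norm.
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise scale is proportional to the sensitivity bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise

g = np.array([3.0, 4.0])   # L2 norm 5, so it gets clipped to norm 1
noisy = dp_noisy_gradient(g, rng=np.random.default_rng(0))
```

Bounding each gradient's norm before adding noise is what makes the privacy accounting possible: the noise scale can then be calibrated to the known worst-case contribution of any single example.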