Cloud versus Edge Deployment Strategies of Real-Time Face Recognition Inference
Ref: CISTER-TR-210203
Publication Date: 8 Feb 2021
In this paper, we present a real-world case study on deploying a face recognition application using the MTCNN detector and the FaceNet recognizer. We report the challenges faced in deciding on the best deployment strategy. We propose three inference architectures for the deployment: cloud-based, edge-based, and hybrid. Furthermore, we evaluate the performance of face recognition inference on different cloud-based and edge-based GPU platforms. We consider different types of Jetson boards for the edge and various GPUs for the cloud. We also investigate the effect of deep learning model optimization using TensorRT and TFLite compared to a standard TensorFlow GPU model, as well as the effect of input resolution. We provide a benchmarking study for all these devices in terms of frames per second, execution time, energy consumption, and memory usage. After conducting a total of 294 experiments, the results demonstrate that the TensorRT optimization provides the fastest execution on all cloud and edge devices, at the expense of significantly higher energy consumption (up to +40% and +35% for edge and cloud devices respectively, compared to TensorFlow). TFLite, by contrast, is the most efficient framework in terms of memory and power consumption, but provides significantly less (-4% to -62%) processing acceleration than TensorRT.
Published in IEEE Transactions on Network Science and Engineering, IEEE.
Notes: Early Access. Open-source results and interactive dashboards are available at https://www.riotu-lab.org/face/