Modern smartphones use heterogeneous multi-core SoCs that include a CPU, a GPU, a DSP, and various application-specific accelerators. This heterogeneity makes it possible to run compute-intensive applications on a battery-powered, resource-limited mobile device by assigning each sub-task to the most suitable computing core. To meet the performance requirement with minimal energy consumption, the algorithm must also be characterized to determine how well it adapts to the trade-off between performance and energy/power. In this paper, we use face recognition as the application driver and Nvidia's Tegra SoC as the target platform to explore application-to-platform mapping strategies for energy minimization and performance optimization. We demonstrate that tuning the algorithms for the platform can significantly reduce computational complexity, meeting the real-time performance requirement with very little loss in recognition accuracy. We further demonstrate that using the mobile GPU inside the Tegra SoC for feature extraction, the most compute-intensive task in this application, achieves a 51% reduction in runtime and a 50% reduction in total energy consumption compared with a CPU-only implementation.
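As a rough reading of the reported figures, the short sketch below works out what the two reductions would imply about average power draw. It assumes that energy equals average power multiplied by runtime and that both percentages refer to the same measured interval; neither assumption is stated in the abstract, and the function name `implied_power_ratio` is purely illustrative.

```python
def implied_power_ratio(runtime_reduction: float, energy_reduction: float) -> float:
    """Return the average-power ratio (new / baseline) implied by E = P_avg * t.

    Assumption (not from the paper): both reductions are measured over the
    same interval, so energy_ratio = power_ratio * time_ratio.
    """
    if not 0.0 <= runtime_reduction < 1.0:
        raise ValueError("runtime_reduction must be in [0, 1)")
    if not 0.0 <= energy_reduction <= 1.0:
        raise ValueError("energy_reduction must be in [0, 1]")
    time_ratio = 1.0 - runtime_reduction    # fraction of baseline runtime remaining
    energy_ratio = 1.0 - energy_reduction   # fraction of baseline energy remaining
    return energy_ratio / time_ratio


# Figures quoted in the abstract: 51% less runtime, 50% less total energy.
ratio = implied_power_ratio(0.51, 0.50)
print(f"implied average power ratio (GPU-assisted / CPU-only): {ratio:.3f}")
```

Under these assumptions the ratio comes out slightly above 1, which would suggest that the energy saving stems almost entirely from the shorter runtime rather than from a lower average power draw.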