Abstract:
Computationally intensive workloads such as machine learning (ML) offer wide scope for compiler optimization, yet diligent utilization of hardware resources is often neglected because such optimization is complex to implement. The main reasons for this complexity are the wide range of target architectures and the differences between development and deployment environments. The result is poor utilization of hardware resources such as memory, and increased execution time. These problems can be tackled using Apache-TVM, a compiler specifically designed to tune and optimize machine-learning models for specific hardware. We have implemented matrix multiplication on two types of hardware, x86 and the Hexagon Digital Signal Processor (DSP), and have optimized it for each. Apache-TVM also supports tuning of whole ML models by applying various graph-level and operator-level optimizations. TVM can also automate the optimization of low-level programs for specific hardware characteristics using AutoTVM, which uses a cost-based model to explore the search space of code optimizations. We have obtained a reduction in execution time of up to 32.32% for the Emotion FerPlus model, and a speedup of more than 150 times for matrix multiplication on the Hexagon DSP, without reducing accuracy or performance.
Published in: 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT)
Date of Conference: 06-08 July 2023
Date Added to IEEE Xplore: 23 November 2023