Performance Optimization of Machine Learning Inference under Latency and Server Power Constraints

IEEE Conference Publication | IEEE Xplore