CNN Acceleration with Joint Optimization of Practical PIM and GPU on Embedded Devices | IEEE Conference Publication | IEEE Xplore