Re-architecting the on-chip memory sub-system of machine-learning accelerator for embedded devices | IEEE Conference Publication | IEEE Xplore