VLSI Implementation of Pipelined PE Systolic Array-Based 3x3 Matrix Multiplication for Deep Neural Network Accelerator | IEEE Conference Publication | IEEE Xplore