In this paper, we present a novel systolic architecture for high-throughput computation of the three-dimensional (3-D) discrete wavelet transform (DWT). The entire 3-D DWT computation is decomposed into three distinct stages, which are implemented concurrently in a linear array of fully pipelined processing elements (PEs). Compared with the existing architecture, the proposed structure provides higher throughput, involves about half or fewer of the multipliers and adders, and requires less on-chip memory when normalized for unit throughput rate. Most importantly, unlike the existing architecture, the proposed structure does not require any frame buffer to perform the inter-frame DWT computation. It has a small latency and can perform 3-D DWT computation with 100% hardware utilization efficiency.
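As a point of reference for the three-stage decomposition, the following is a minimal software sketch of a one-level separable 3-D DWT. It rests on assumptions that the text above does not state: that the three stages correspond to 1-D transforms along the three dimensions, that they run in the order shown, and that the Haar filter is used. The function names `haar_1d` and `dwt3d_haar` are hypothetical, and the code models only the arithmetic, not the systolic array or its pipelining.

```python
from math import sqrt


def haar_1d(x):
    """One-level orthonormal Haar DWT of an even-length sequence.

    Returns the low-pass coefficients followed by the high-pass coefficients.
    """
    s = 1.0 / sqrt(2.0)
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) * s for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) * s for i in range(half)]
    return low + high


def dwt3d_haar(vol):
    """One-level separable 3-D Haar DWT of vol[frame][row][col].

    All three dimensions are assumed to be even. The input is not modified.
    """
    F, R, C = len(vol), len(vol[0]), len(vol[0][0])

    # Stage 1 (assumed): transform each row, i.e. along the column index.
    out = [[haar_1d(vol[f][r]) for r in range(R)] for f in range(F)]

    # Stage 2 (assumed): transform each column, i.e. along the row index.
    for f in range(F):
        for c in range(C):
            col = haar_1d([out[f][r][c] for r in range(R)])
            for r in range(R):
                out[f][r][c] = col[r]

    # Stage 3 (assumed): inter-frame transform, i.e. along the frame index.
    for r in range(R):
        for c in range(C):
            tube = haar_1d([out[f][r][c] for f in range(F)])
            for f in range(F):
                out[f][r][c] = tube[f]

    return out
```

In this sequential sketch the third stage reads every frame before it can finish, so the whole intermediate volume is held in memory. That kind of inter-frame storage is what the frame-buffer-free claim above is about.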