The discrete wavelet transform (DWT) is increasingly used in image and video compression standards, as indicated by its adoption in JPEG2000. The lifting scheme is an alternative DWT implementation with lower computational complexity. In this paper, a new high-performance lifting-based architecture is presented for the 9/7 DWT engine. The proposed architecture has a balanced pipeline and reduces both the computational error and the hardware complexity for any given working frequency. In the proposed architecture, the constant coefficients are modified by introducing new variables into the conventional lifting structure, so as to minimize the hardware cost and the computational error caused by coefficient quantization. Simulation results indicate a quality improvement of up to 15 dB compared to an architecture that uses the standard coefficients at the same hardware cost and working frequency. Similarly, the hardware cost is reduced by about 20% when both architectures deliver the same PSNR at the same working frequency.
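For orientation, the conventional 9/7 lifting structure and the effect of coefficient quantization can be sketched in software as follows. This is a minimal illustration, not the proposed architecture: it uses the commonly cited standard lifting constants rather than the modified coefficients introduced in this paper, and it assumes periodic boundary extension, an even-length 1-D input, and one particular scaling convention (low-pass multiplied by K, high-pass divided by K). The helper names `quantize`, `forward_97`, `inverse_97`, and `subband_error` are hypothetical.

```python
# Commonly cited lifting constants for the 9/7 filter (assumed here;
# these are NOT the modified coefficients proposed in the paper).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling convention is an assumption; others exist


def quantize(c, frac_bits):
    """Round a constant to fixed point with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    return round(c * scale) / scale


def forward_97(x, coeffs):
    """One level of the 1-D 9/7 lifting DWT with periodic extension."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    a, b, g, d, k = coeffs
    s, h = list(x[0::2]), list(x[1::2])  # even / odd samples
    n = len(s)
    h = [h[i] + a * (s[i] + s[(i + 1) % n]) for i in range(n)]  # predict 1
    s = [s[i] + b * (h[i - 1] + h[i]) for i in range(n)]        # update 1
    h = [h[i] + g * (s[i] + s[(i + 1) % n]) for i in range(n)]  # predict 2
    s = [s[i] + d * (h[i - 1] + h[i]) for i in range(n)]        # update 2
    return [v * k for v in s], [v / k for v in h]               # scaling


def inverse_97(low, high, coeffs):
    """Undo forward_97 by running the lifting steps in reverse order."""
    a, b, g, d, k = coeffs
    n = len(low)
    s = [v / k for v in low]
    h = [v * k for v in high]
    s = [s[i] - d * (h[i - 1] + h[i]) for i in range(n)]
    h = [h[i] - g * (s[i] + s[(i + 1) % n]) for i in range(n)]
    s = [s[i] - b * (h[i - 1] + h[i]) for i in range(n)]
    h = [h[i] - a * (s[i] + s[(i + 1) % n]) for i in range(n)]
    x = [0.0] * (2 * n)
    x[0::2], x[1::2] = s, h
    return x


def subband_error(x, frac_bits):
    """Max abs deviation of the subbands when constants are quantized."""
    ref = (ALPHA, BETA, GAMMA, DELTA, K)
    qnt = tuple(quantize(c, frac_bits) for c in ref)
    lo_r, hi_r = forward_97(x, ref)
    lo_q, hi_q = forward_97(x, qnt)
    return max(abs(p - q) for p, q in zip(lo_r + hi_r, lo_q + hi_q))


if __name__ == "__main__":
    signal = [float((i * i) % 17) for i in range(16)]
    for bits in (4, 8, 12, 16):
        print(bits, subband_error(signal, bits))
```

Running the script prints the subband deviation for several coefficient word lengths. Shorter word lengths reduce multiplier cost in hardware but increase this deviation, which is the trade-off between hardware cost and computational error that the proposed coefficient modification targets.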