An efficient line-based 2-D pipeline architecture for the 5/3 and 9/7 DWT is proposed in this paper. The input sequence of the inter-row and intra-row samples is reordered, and the folding technique is employed to reduce the hardware cost of the 1-D architecture, which achieves a critical path of one multiplier delay and can operate at over 180 MHz in SMIC 0.18 µm technology. For an N × N image, the 2-D DWT architecture requires an internal buffer of only 3.5N for the 5/3 DWT and 5.5N for the 9/7 DWT, with the delay registers of the 1-D DWT replaced by the temporal buffer. Compared with existing designs, the proposed architectures achieve the same delay constraint with fewer arithmetic resources and a smaller internal buffer. The design is regular and well suited for VLSI implementation.
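
As a point of reference for the arithmetic that such a 1-D stage computes, the following is a minimal software sketch of one level of the 5/3 DWT. It assumes the standard reversible (integer) lifting form of the 5/3 filter with symmetric boundary extension, which the abstract does not state explicitly. It models only the numerical result, not the proposed pipelined or folded hardware. The function names `dwt53_forward`, `dwt53_inverse`, and `internal_buffer_size` are hypothetical helpers introduced for illustration; the last one simply evaluates the 3.5N and 5.5N figures quoted above.

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting DWT on an even-length list of ints.

    Assumption: standard integer lifting steps with symmetric extension.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even-length signal with at least 2 samples")
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: high[i] = odd[i] - floor((even[i] + even[i+1]) / 2).
    # At the right edge, symmetric extension reuses the last even sample.
    high = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
            for i in range(half)]
    # Update step: low[i] = even[i] + floor((high[i-1] + high[i] + 2) / 4).
    # At the left edge, symmetric extension reuses the first high-pass sample.
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
           for i in range(half)]
    return low, high


def dwt53_inverse(low, high):
    """Undo dwt53_forward by running the lifting steps in reverse order."""
    half = len(low)
    even = [low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
            for i in range(half)]
    odd = [high[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x


def internal_buffer_size(n, wavelet):
    """Internal buffer size for an N x N image, using the figures quoted above."""
    factor = {"5/3": 3.5, "9/7": 5.5}[wavelet]
    return factor * n


if __name__ == "__main__":
    row = [12, 15, 9, 7, 20, 22, 18, 3]
    low, high = dwt53_forward(row)
    assert dwt53_inverse(low, high) == row  # integer lifting is reversible
    print(low, high)
    print(internal_buffer_size(512, "5/3"), internal_buffer_size(512, "9/7"))
```

In a line-based 2-D design, the same two lifting steps would be applied along rows and then along columns. The column pass is what gives rise to the line buffer whose size the abstract reports.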