In contrast to macroblock-based in-loop deblocking filters, the filters of VC-1 process all horizontal edges (for in-loop filtering) or vertical edges (for overlap smoothing) first, and then all vertical edges (for in-loop filtering) or horizontal edges (for overlap smoothing), within a frame, field, or slice. These two VC-1 filters therefore operate on many edges among reconstructed blocks, and they do so in different processing orders. The entire procedure is very time-consuming and places a heavy memory-access load on the whole system. This paper analyzes the behavior of the VC-1 filters and presents several efficient methods and an integrated architecture design. The design is built around an overlapped 12 × 12 block that combines overlap smoothing with in-loop filtering, improving performance and reducing cost by sharing circuits and system resources. To use system resources still more efficiently, this paper also presents two further methods, multiple processing order and modified chrominance processing order, which greatly reduce the external memory cycles and the on-chip memory size needed for filtered and temporal reconstructed pixels. The proposed architecture, implemented with the TSMC 90-nm multithreshold voltage technique, can process HDTV 1080p video at 30 fps and HDTV 2048 × 1536 video at 24 fps at 200 MHz. The same concept is applicable to other video processing algorithms, especially deblocking filters for video post-processing that operate in a frame-based order.
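To put the stated throughput targets in perspective, the short sketch below estimates the average number of clock cycles available per macroblock at 200 MHz for the two video formats mentioned. It is a back-of-the-envelope calculation, not a figure taken from the paper: it assumes 16 × 16 macroblocks, rounds frame dimensions up to whole macroblocks, and assumes one filter engine handling every macroblock of every frame. The function names are hypothetical.

```python
import math

MB_SIZE = 16  # assumed macroblock width and height in luma pixels


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks covering a frame (dimensions rounded up)."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


def cycles_per_macroblock(width, height, fps, clock_hz):
    """Average clock cycles available per macroblock at the given frame rate."""
    macroblock_rate = macroblocks_per_frame(width, height) * fps
    return clock_hz / macroblock_rate


CLOCK_HZ = 200_000_000  # 200 MHz, as stated in the abstract

# Formats and frame rates taken from the abstract.
formats = {
    "1920x1080 @ 30 fps": (1920, 1080, 30),
    "2048x1536 @ 24 fps": (2048, 1536, 24),
}

for name, (width, height, fps) in formats.items():
    budget = cycles_per_macroblock(width, height, fps, CLOCK_HZ)
    print(f"{name}: {macroblocks_per_frame(width, height)} macroblocks/frame, "
          f"about {budget:.0f} cycles per macroblock")
```

Under these assumptions the budget is roughly 817 cycles per macroblock for 1080p at 30 fps and roughly 678 cycles for 2048 × 1536 at 24 fps. Both filters, together with their memory traffic, have to fit within that budget, which is consistent with the emphasis on reducing external memory cycles.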