
IEEE Transactions on Circuits and Systems for Video Technology

Issue 4 • April 2009

  • Table of contents

    Page(s): C1
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Page(s): C2
  • Efficient and Low-Complexity Surveillance Video Compression Using Backward-Channel Aware Wyner-Ziv Video Coding

    Page(s): 453 - 465

    Video surveillance has been widely used in recent years to enhance public safety and privacy protection. A video surveillance system that deals with content analysis and activity monitoring needs efficient transmission and storage of the surveillance video data. Video compression techniques can be used to achieve this goal by reducing the size of the video with little or no quality loss. State-of-the-art video compression methods such as H.264/AVC often lead to high computational complexity at the encoder, which is generally implemented in a video camera in a surveillance system. This can significantly increase the cost of a surveillance system, especially when a mass deployment of end cameras is needed. In this paper, we discuss the specific considerations for surveillance video compression. We present a surveillance video compression system with a low-complexity encoder based on Wyner-Ziv coding principles to address the tradeoff between computational complexity and coding efficiency. In addition, we propose a backward-channel aware Wyner-Ziv (BCAWZ) video coding approach to improve the coding efficiency while maintaining low complexity at the encoder. The experimental results show that for surveillance video content, BCAWZ achieves significantly higher coding efficiency than H.264/AVC intra coding as well as existing Wyner-Ziv video coding methods, comes close to H.264/AVC inter coding, and maintains coding complexity similar to that of intra coding. This shows that the low-motion characteristics of much surveillance video content and the low-complexity encoding requirement make our scheme a particularly suitable candidate for surveillance video compression. We further propose an error resilience scheme for BCAWZ to address the concern of reliable transmission over the backward channel, which is essential to the quality of the video data for real-time, reliable object detection and event analysis.

  • A Configurable Motion Estimation Architecture for Block-Matching Algorithms

    Page(s): 466 - 477

    This paper introduces a configurable motion estimation architecture for a wide range of fast block-matching algorithms (BMAs). Contemporary motion estimation architectures are either too rigid to support multiple BMAs or achieve their flexibility at the cost of reduced performance. The proposed architecture overcomes both of these limitations. Its configurability is based on a new BMA framework that can be adjusted to support the desired set of BMAs. The chosen framework configuration is implemented by an intelligent control logic that is integrated with an efficient parallel memory system and a distortion computation unit. The flexibility of the framework is demonstrated by mapping five different BMAs (BBGDS, DS, CDS, HEXBS, and TSS) to the architecture. The total execution time of the mapped BMAs is shown to be almost directly proportional to the number of tested checking points in the search area, so the architecture is very tolerant of different BMA-specific search strategies and search patterns. In addition, run-time switching between supported BMAs can be done without performance compromises. With a 0.13-µm CMOS technology, the proposed architecture configured for HEXBS, BBGDS, and TSS requires only 14.2 kgates and 2.5 KB of memory at a 200 MHz operating frequency. A performance comparison with the reference programmable architectures reveals that only the proposed implementation is able to process real-time (30 fps) fixed block-size motion estimation (1 reference frame) at full HDTV resolution (1920×1080).

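The relationship between execution time and checking points can be seen in a minimal diamond-search sketch (a generic BMA illustration with one SAD evaluation per checking point, not the paper's architecture):

```python
# Generic block-matching sketch: the cost is one SAD per checking point,
# so total work is proportional to how many points the BMA visits.

def sad(cur, ref, bx, by, dx, dy, n=4):
    """Sum of absolute differences of an n x n block at (bx, by) against
    the reference block displaced by (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))

def diamond_search(cur, ref, bx, by):
    """One large-diamond step followed by a small-diamond refinement."""
    large = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
             (1, 1), (1, -1), (-1, 1), (-1, -1)]
    best = min(large, key=lambda d: sad(cur, ref, bx, by, d[0], d[1]))
    small = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    return min(((best[0] + d[0], best[1] + d[1]) for d in small),
               key=lambda d: sad(cur, ref, bx, by, d[0], d[1]))
```

This visits 14 checking points versus 225 for an exhaustive ±7 search; the architecture's claim is that its runtime tracks exactly this count, whatever pattern the chosen BMA uses.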
  • Complexity-Constrained H.264 Video Encoding

    Page(s): 477 - 490

    In this paper, a joint complexity-distortion optimization approach is proposed for real-time H.264 video encoding in a power-constrained environment. The power consumption is first translated into encoding computation costs, measured by the number of scaled computation units consumed by basic operations. The problem is then formulated as the allocation and utilization of these computational resources. A computation allocation model (CAM) with virtual computation buffers is proposed to optimally allocate the computational resources to each video frame. In particular, the proposed CAM operates in the same temporal phase as the traditional hypothetical reference decoder model. Further, to fully utilize the allocated computational resources, complexity-configurable motion estimation (CAME) and complexity-configurable mode decision (CAMD) algorithms are proposed for H.264 video encoding. In particular, CAME selects the path of the motion search at the frame level, and CAMD selects the order of the mode search at the macroblock level. Based on this hierarchical adjusting approach, adaptive allocation of computational resources and fine scalability of complexity control can be achieved.

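A virtual computation buffer can be sketched like a leaky-bucket rate controller applied to computation instead of bits (the refill rule and parameter names are illustrative assumptions, not the paper's CAM):

```python
# Computation analogue of a leaky-bucket buffer: a fixed computation
# budget drips in per frame, and each frame spends at most what the
# buffer currently holds. Illustrative sketch only.

def allocate_computation(frame_demands, per_frame_budget, buffer_cap):
    """Return per-frame computation grants under a sustained budget."""
    buffer_level = 0.0
    grants = []
    for demand in frame_demands:
        buffer_level = min(buffer_level + per_frame_budget, buffer_cap)
        grant = min(demand, buffer_level)   # never overdraw the buffer
        grants.append(grant)
        buffer_level -= grant
    return grants
```

Frames with bursty demand borrow against slack left by cheap frames, while the long-run spend never exceeds the sustainable per-frame budget.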
  • Fast Mode Decision for H.264/AVC Based on Macroblock Motion Activity

    Page(s): 491 - 499

    The intra-mode and inter-mode predictions available in H.264/AVC effectively improve coding efficiency. However, exhaustively checking all prediction modes to identify the best one (commonly referred to as exhaustive mode decision) greatly increases computational complexity. In this paper, a fast mode decision algorithm, called motion activity-based mode decision (MAMD), is proposed to speed up the encoding process by reducing the number of modes to be checked in a hierarchical manner, as follows. For each macroblock, the proposed MAMD algorithm always starts by checking the rate-distortion (RD) cost computed for the SKIP mode, terminating early if the RD cost is below a predetermined "low" threshold. On the other hand, if the RD cost exceeds another "high" threshold, only the intra modes are worth checking. If the computed RD cost falls between these two thresholds, the remaining seven modes, which are classified into three motion activity classes in our work, are examined, and only one of the three classes is chosen for further mode checking. The motion activity is quantitatively measured as the maximum city-block length of the motion vectors taken from a set of adjacent macroblocks (i.e., the region of support, ROS). This measurement is then used to determine the most probable motion-activity class for the current macroblock. Experimental results show that, on average, the proposed MAMD algorithm reduces computational complexity by 62.96%, while incurring only a 0.059 dB loss in PSNR (peak signal-to-noise ratio) and a 0.19% increase in total bit rate compared with exhaustive mode decision, the default approach in the JM reference software.

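The two-threshold flow described above can be sketched as follows (the threshold values, mode names, and activity-class boundaries are illustrative assumptions, not the paper's tuned parameters):

```python
# Hierarchical mode pruning in the MAMD spirit: SKIP early-out, intra-only
# fallback, or one motion-activity class of inter modes. All constants
# below are assumed for illustration.

LOW_T, HIGH_T = 100, 1000  # assumed RD-cost thresholds

def motion_activity(ros_mvs):
    """Max city-block (L1) length of the motion vectors in the region
    of support (adjacent macroblocks)."""
    return max(abs(mx) + abs(my) for mx, my in ros_mvs)

def candidate_modes(skip_rd_cost, ros_mvs):
    """Choose which prediction modes to check for the current macroblock."""
    if skip_rd_cost < LOW_T:
        return ["SKIP"]                      # early termination
    if skip_rd_cost > HIGH_T:
        return ["I4x4", "I16x16"]            # only intra modes worth checking
    activity = motion_activity(ros_mvs)
    if activity == 0:
        return ["16x16"]                     # low-activity class
    if activity <= 2:
        return ["16x8", "8x16"]              # medium-activity class
    return ["8x8", "8x4", "4x8", "4x4"]      # high-activity class
```

The complexity saving comes from checking one small list instead of all modes; the reported 62.96% average reduction is the measured effect of exactly this kind of pruning.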
  • H.264/Advanced Video Coding (AVC) Backward-Compatible Bit-Depth Scalable Coding

    Page(s): 500 - 510

    This paper presents a bit-depth scalable coding solution that is compatible with the scalable extension of H.264/advanced video coding (AVC), also referred to as scalable video coding (SVC). The proposed solution provides an 8-bit AVC main-profile or high-profile base-layer bitstream multiplexed with a higher-bit-depth enhancement-layer bitstream generated through macroblock-level inter-layer bit-depth prediction. New decoding processes for inter-layer prediction are introduced to enable bit-depth scalability. Compatibility with the other types of scalability in the SVC standard (temporal, spatial, and SNR scalability) is ensured. The solution also supports the single-loop decoding required by the SVC specification, as well as adaptive inter-layer prediction to determine whether the inter-layer bit-depth prediction shall be invoked. It is implemented on the basis of the SVC reference software, Joint Scalable Video Model version 8.12. Experimental results are presented for 8-bit to 10-bit bit-depth scalability and for combined bit-depth and spatial scalability.

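Inter-layer bit-depth prediction reduces to predicting the 10-bit signal from the reconstructed 8-bit base layer and coding only the residual; the left-shift predictor below is a deliberately simple stand-in for the scheme's actual inter-layer prediction:

```python
# Bit-depth scalability sketch: the enhancement layer carries only the
# residual between the true 10-bit signal and a prediction derived from
# the 8-bit base layer. The shift-based predictor is an assumption.

def predict_10bit(base_8bit):
    return [v << 2 for v in base_8bit]          # map 8-bit to 10-bit range

def encode_enhancement(orig_10bit, base_8bit):
    pred = predict_10bit(base_8bit)
    return [o - p for o, p in zip(orig_10bit, pred)]

def decode_enhancement(residual, base_8bit):
    pred = predict_10bit(base_8bit)
    return [p + r for p, r in zip(pred, residual)]
```

An 8-bit-only decoder simply ignores the enhancement layer, which is what makes the stream backward compatible.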
  • Multiple Description Video Coding Based on Hierarchical B Pictures

    Page(s): 511 - 521

    A general multiple description video coding (MDVC) framework based on hierarchical B pictures is proposed in this paper. Two or more descriptions are generated by employing the hierarchical B pictures of the H.264/AVC scalable extension, where temporal-level-based key pictures are selected in a staggered way among the different descriptions. Based on this hierarchical and staggered structure, inter-description redundancy control is studied to achieve a good central/side-distortion-rate tradeoff. Moreover, to better exploit multiple complementary descriptions, a linear combination of the received descriptions is employed to optimize decoding results. The proposed MDVC framework is H.264/AVC-compliant for each temporal scalable description. Some existing temporal-splitting MDVC techniques can be considered a degraded case of the proposed structure. Experimental results validate the effectiveness of the proposed design for MDVC.

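A decoder-side linear combination of received descriptions can be sketched as inverse-variance weighting of the two side reconstructions (the weighting rule and the known-variance assumption are illustrative; the paper derives its own combination):

```python
# Combine two side reconstructions of the same frame linearly, weighting
# each inversely to its (assumed known) error variance. Illustrative
# sketch, not the paper's optimized combination.

def combine_descriptions(rec_a, rec_b, var_a, var_b):
    """Inverse-variance linear combination of two reconstructions."""
    w_a = var_b / (var_a + var_b)        # lower variance -> higher weight
    return [w_a * a + (1.0 - w_a) * b for a, b in zip(rec_a, rec_b)]
```

When both descriptions arrive, this central reconstruction beats either side reconstruction alone, which is the point of keeping the descriptions complementary.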
  • Region-Level Motion-Based Foreground Segmentation Under a Bayesian Network

    Page(s): 522 - 532

    This paper presents a probabilistic approach for automatically segmenting foreground objects from a video sequence. In order to save computation time and be robust to noise, a region detection algorithm incorporating edge information is first proposed to identify the regions of interest, within which the spatial relationships are represented by a region adjacency graph. Next, we consider the motion of the foreground objects and, hence, exploit the temporal coherence of the detected regions. The foreground segmentation problem is thus formulated as follows: given two consecutive image frames and the previously obtained segmentation result, we simultaneously estimate the motion vector field and the foreground segmentation mask in a mutually supporting manner by maximizing the conditional joint probability density function of these two elements. To represent this conditional joint probability density function in a compact form, a Bayesian network is adopted to model the interdependency of the two elements. Experimental results for several video sequences demonstrate the effectiveness of the proposed approach.

  • Virtual View Specification and Synthesis for Free Viewpoint Television

    Page(s): 533 - 546

    Free viewpoint television (FTV) is a new concept that aims to give viewers the flexibility to select a novel viewpoint, using multiple video streams as the input. Currently proposed solutions for FTV include those based on ray-space resampling, which demand at least dozens of cameras along with large storage and transmission resources for the video streams. Image-based rendering (IBR) methods that rely on dense depth map estimation also face practical difficulties, since accurate depth map estimation remains a challenging problem. This paper proposes a framework for FTV based on IBR that relieves the need for an accurate depth map by introducing a hybrid virtual view synthesis method. The framework also includes an intuitive method for virtual view specification in uncalibrated views. We present both simulation and real-data experiments to validate the proposed framework and its component algorithms.

  • Trajectory Tree as an Object-Oriented Hierarchical Representation for Video

    Page(s): 547 - 560

    This paper presents the trajectory tree as a hierarchical region-based representation for video sequences. Motion, as well as spatial features from multiple frames, is used to generate a set of temporal regions structured within a hierarchy of scale and motion coherency. The resulting representations offer a global description of the entire video sequence and enhance its potential for semantic analysis. A multiscale segmentation strategy is proposed whereby region-merging criteria of progressively greater complexity are used to define partition layers of increasing aptitude for object detection. A novel data structure, called the trajectory adjacency graph, is defined for the long-term analysis of partition sequences. Furthermore, mechanisms for assessing connectivity, verifying temporal continuity, and proposing merging operations based on color, affine, and translational motion homogeneity over the entire sequence are also introduced. Finally, as demonstrated through experimental results, the trajectory tree offers a concise yet versatile support for video object segmentation, description, and retrieval tasks.

  • Error Propagation Algorithm for Reduction of Errors Due to Total Load and Line Load in a Plasma Panel Display

    Page(s): 561 - 573

    The displayed gray level of a plasma display panel (PDP) is often different from the intended target gray level because the displayed level is affected by the load of the panel. The effect of the load on the displayed gray level is decomposed into three factors: total load, line load magnitude, and line load distribution. Previous attempts to reduce this difference (i.e., error) have achieved only limited success because they have not taken all three factors into consideration. This paper proposes a new method that compensates for the error caused by all three factors. The proposed method first compensates for the error caused by line load magnitude and distribution. To this end, the error generated by one sub-field of a PDP is propagated to, and compensated for by, the remaining sub-fields. The amount of error is reduced with every incidence of error propagation and, consequently, the final amount of error is reduced to a negligible level. To reduce the computational complexity of evaluating the error caused by the line load distribution, an iterative method is proposed that derives the effect of the line load distribution on each cell from its adjacent cell. When sub-field coding is completed for all cells on the panel, the total load of each sub-field is obtained, and the error caused by the total load is compensated for by controlling the number of sustain pulses for each sub-field. A significant error reduction is achieved, as shown by simulations and experiments with a 42-in. PDP set, including a 91.2% reduction of the mean absolute error in simulations and a 69.7% reduction of the luminance variation range caused by load variation in experiments.

  • Image Enhancement for Backlight-Scaled TFT-LCD Displays

    Page(s): 574 - 583

    One common way to extend the battery life of a portable device is to reduce the LCD backlight intensity. In contrast to previous approaches that minimize power consumption by adjusting the backlight intensity frame by frame to reach a specified image quality, the proposed method optimizes the image quality for a given backlight intensity. The image is enhanced by performing brightness compensation and local contrast enhancement. For brightness compensation, global image statistics and the backlight level are considered to maintain the overall brightness of the image. For contrast enhancement, the local contrast property of the human visual system (HVS) is exploited to enhance local image details. In addition, a brightness prediction scheme is proposed to speed up the algorithm for the display of video sequences. Experimental results are presented to show the performance of the algorithm.

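Brightness compensation for a dimmed backlight can be sketched from a power-law display model, luminance ≈ backlight × (pixel/255)^γ (the gamma value and the pure power-law model are assumptions; the paper additionally uses global statistics and local contrast enhancement):

```python
# Boost pixel codes to offset a reduced backlight under an assumed
# power-law display model; bright pixels clip at 255, which is the
# headroom problem that local contrast enhancement then addresses.

GAMMA = 2.2  # assumed display gamma

def compensate(pixel, backlight_ratio):
    """Scale an 8-bit pixel so perceived luminance is preserved
    wherever headroom allows."""
    scaled = pixel * backlight_ratio ** (-1.0 / GAMMA)
    return min(255, round(scaled))
```

At 50% backlight a mid-gray code such as 128 is boosted, while a code of 255 simply clips, so highlights lose detail without the contrast-enhancement step.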
  • Efficient Fast 1-D 8×8 Inverse Integer Transform for VC-1 Application

    Page(s): 584 - 590

    In this paper, a one-dimensional (1-D) fast 8×8 inverse integer transform algorithm for Windows Media Video 9 (WMV-9/VC-1) is proposed. Based on the symmetric property of the integer transform matrix and on matrix operations, namely row/column permutations and matrix decompositions, an efficient fast 1-D 8×8 inverse integer transform is developed. The computational complexity of the proposed fast inverse transform is therefore smaller than that of the direct method and of the previous fast method. With its low complexity, the proposed fast algorithm is well suited to accelerating video coding computations.

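The symmetry-based decomposition works because the 8-point basis has even-symmetric and odd-antisymmetric rows, which halves the multiplication count. The integer matrix below merely has that symmetry and is not claimed to be the actual VC-1 matrix:

```python
# Even/odd decomposition of an 8-point inverse transform x = T^T c:
# symmetric/antisymmetric rows turn one 8x8 product (64 multiplies)
# into two 4x4 products plus an add/subtract butterfly (32 multiplies).

T = [  # illustrative integer basis with DCT-like row symmetry
    [12,  12,  12,  12,  12,  12,  12,  12],
    [16,  15,   9,   4,  -4,  -9, -15, -16],
    [16,   6,  -6, -16, -16,  -6,   6,  16],
    [15,  -4, -16,  -9,   9,  16,   4, -15],
    [12, -12, -12,  12,  12, -12, -12,  12],
    [ 9, -16,   4,  15, -15,  -4,  16,  -9],
    [ 6, -16,  16,  -6,  -6,  16, -16,   6],
    [ 4,  -9,  15, -16,  16, -15,   9,  -4],
]

def inverse_direct(c):
    """Direct 8-point inverse: 64 multiplications."""
    return [sum(T[k][i] * c[k] for k in range(8)) for i in range(8)]

def inverse_fast(c):
    """Two 4x4 products plus a butterfly: 32 multiplications."""
    even = [sum(T[k][i] * c[k] for k in (0, 2, 4, 6)) for i in range(4)]
    odd = [sum(T[k][i] * c[k] for k in (1, 3, 5, 7)) for i in range(4)]
    left = [e + o for e, o in zip(even, odd)]
    # x[7-i] = even[i] - odd[i] since odd-indexed rows are antisymmetric.
    right = [e - o for e, o in zip(reversed(even), reversed(odd))]
    return left + right
```

The two routines are exactly equal for any coefficient vector, which is the sense in which the fast algorithm is lossless rather than approximate.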
  • Efficient Implementation Techniques for Maximum Likelihood-Based Error Correction for JPEG2000

    Page(s): 591 - 596

    Features of the JPEG2000 compression standard include coding efficiency at low bit rates. However, its compressed bit stream is sensitive to transmission errors. Ternary MQ arithmetic coders introduce a controlled degree of redundancy during the encoding process, which can be exploited at the decoder side to detect and correct errors. This paper presents three techniques to reduce both the computational complexity and the memory requirement of ternary MQ arithmetic decoding. The proposed techniques result in a substantial saving of decoding time and memory usage, with little or no degradation in the PSNR metric.

  • Fast and Robust Face Detection on a Parallel Optimized Architecture Implemented on FPGA

    Page(s): 597 - 602

    In this paper, we present a parallel architecture for fast and robust face detection implemented on FPGA hardware. We propose the first implementation that meets both real-time requirements in an embedded context and face detection robustness within complex backgrounds. The chosen face detection method is the Convolutional Face Finder (CFF) algorithm, which consists of a pipeline of convolution and subsampling operations followed by a multilayer perceptron. We present the design methodology of our face detection processor element (PE), which was followed in order to optimize our implementation in terms of memory usage and parallelization efficiency. We then built a parallel architecture composed of a PE ring and a FIFO memory, resulting in a scalable system capable of processing images of different sizes. A ring of 25 PEs running at 80 MHz is able to process 127 QVGA images per second and to perform real-time face detection on VGA images (35 images per second).

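The convolution-plus-subsampling stages that dominate the CFF pipeline look roughly like this in software (a plain sketch of one stage, not the paper's PE datapath; in practice the kernel values come from the trained network):

```python
# One CFF-style pipeline stage: valid 2-D convolution followed by 2x2
# average subsampling. Software sketch of the operations the hardware
# PEs parallelize.

def convolve_valid(img, ker):
    """Valid (no padding) 2-D convolution of img with kernel ker."""
    kh, kw = len(ker), len(ker[0])
    return [[sum(ker[u][v] * img[y + u][x + v]
                 for u in range(kh) for v in range(kw))
             for x in range(len(img[0]) - kw + 1)]
            for y in range(len(img) - kh + 1)]

def subsample_2x2(img):
    """Average-pool each non-overlapping 2x2 block."""
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(len(img[0]) // 2)]
            for y in range(len(img) // 2)]
```

Because each output pixel depends only on a small neighborhood, the stage maps naturally onto a ring of PEs streaming image stripes through local memory.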
  • Temporal Feature Modulation for Video Watermarking

    Page(s): 603 - 608

    We propose two temporal feature modulation algorithms that extract a feature from each video frame and modulate the features for a series of frames to embed a watermark codeword. In the first algorithm, the existence of a frame is used as the frame feature, and a watermark codeword is embedded into the original video by skipping selected frames. In the second algorithm, the centers of gravity of blocks in a frame are used as the frame feature. By modifying the centers of gravity, we embed 1-bit information into the frame. Simulation results demonstrate that the proposed algorithms are robust against compression and temporal attacks.

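The first algorithm's frame-existence feature can be sketched as follows (the bit-to-frame mapping and the assumption that the detector knows the original frame numbering are illustrative simplifications, not the paper's exact scheme):

```python
# Embed a codeword by skipping frames: bit 1 at position k means frame k
# is dropped, bit 0 means it is kept. Toy sketch of the feature only;
# a real detector must cope with compression and temporal attacks.

def embed(frame_ids, bits):
    """Keep frame k when bits[k] is 0; skip it when bits[k] is 1."""
    return [f for k, f in enumerate(frame_ids)
            if k >= len(bits) or bits[k] == 0]

def extract(watermarked_ids, n_bits):
    """Recover the codeword from which original frames are missing."""
    present = set(watermarked_ids)
    return [0 if k in present else 1 for k in range(n_bits)]
```

Because the watermark lives in which frames exist rather than in pixel values, re-encoding the video does not erase it, which is the robustness property the abstract claims.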
  • IEEE Circuits and Systems Society Information

    Page(s): C3
  • IEEE Transactions on Circuits and Systems for Video Technology Information for authors

    Page(s): C4

Aims & Scope

The emphasis is focused on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High-Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it