
IEEE Transactions on Circuits and Systems for Video Technology

Issue 6 • June 2011

  • Table of contents

    Page(s): C1
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Page(s): C2
  • Rectification-Based View Interpolation and Extrapolation for Multiview Video Coding

    Page(s): 693 - 707

    In this paper, we first develop improved projective rectification-based view interpolation and extrapolation methods, and apply them to view synthesis prediction-based multiview video coding (MVC). A geometric model for these view synthesis methods is then developed. We also propose an improved model to study the rate-distortion (R-D) performances of various practical MVC schemes, including the current joint multiview video coding standard. Experimental results show that our schemes achieve superior view synthesis results, and can lead to better R-D performance in MVC. Simulation results with the theoretical models help explain the experimental results.

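The disparity-compensated blending at the heart of view interpolation for rectified views can be illustrated with a minimal 1-D sketch. This is not the paper's method; the function name, the nearest-pixel sampling, and the absence of hole filling are simplifying assumptions for illustration only.

```python
def interpolate_view(left, right, disparity, alpha=0.5):
    """Blend two rectified scanlines into an intermediate view.

    For rectified views, corresponding pixels differ only by a horizontal
    disparity; the intermediate view at fraction `alpha` samples the left
    view shifted by alpha*d and the right view shifted back by (1-alpha)*d
    (nearest-pixel sampling, no occlusion or hole handling).
    """
    n = len(left)
    out = []
    for x in range(n):
        d = disparity[x]
        xl = min(n - 1, max(0, round(x + alpha * d)))
        xr = min(n - 1, max(0, round(x - (1 - alpha) * d)))
        out.append((1 - alpha) * left[xl] + alpha * right[xr])
    return out
```

With zero disparity the result reduces to a plain average of the two views, which is a quick sanity check for the warping logic.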
  • Efficient Inter Mode Prediction Based on Model Selection and Rate Feedback for H.264/AVC

    Page(s): 708 - 716

    H.264/AVC is a standard developed for various low-complexity video applications and high-definition television. To improve coding performance, H.264/AVC may optionally adopt the rate-distortion optimization (RDO) method to find the best encoding mode among various inter and intra modes. However, the exhaustive RDO search among different modes increases the H.264/AVC encoder complexity and limits its application. In this paper, we propose an inter mode prediction algorithm for P slices based on spatial and temporal consistency analysis to reduce the complexity of the RDO computation. We apply a stochastic method to analyze the spatial consistency and use rate information for temporal consistency. The experimental results show a 0.03 dB peak signal-to-noise ratio loss, a 0.87% bit rate increase, and a 58.39% encoding time reduction on average.

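The RDO mode search that this paper accelerates can be sketched in a few lines: each candidate mode is scored with the Lagrangian cost J = D + λR and the cheapest mode wins. This is a generic illustration of RDO mode decision, not the paper's prediction algorithm; the function name and the dictionary interface are hypothetical.

```python
def rd_mode_decision(candidates, lam):
    """Pick the coding mode minimizing the Lagrangian cost J = D + lambda * R.

    `candidates` maps a mode name to a (distortion, rate) pair, e.g. the
    SSD after reconstruction and the bits needed to code the block in
    that mode. Exhaustively scoring every mode is what makes full RDO
    expensive, and what fast mode prediction tries to avoid.
    """
    best_mode, best_cost = None, float("inf")
    for mode, (dist, rate) in candidates.items():
        cost = dist + lam * rate
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```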
  • On Combining Fractional-Pixel Interpolation and Motion Estimation: A Cost-Effective Approach

    Page(s): 717 - 728

    The additional complexity introduced by fractional-pixel motion compensation arises from two aspects: fractional-pixel interpolation (FPI) and fractional-pixel motion estimation (FPME). Unlike current fast algorithms, we exploit the internal link between FPME and FPI and optimize them jointly rather than attempting to speed them up separately. In this paper, a refinement search order for FPME is proposed to satisfy the criterion of cost/performance efficiency. Several strategies, i.e., FPME skipping, early termination, and search pattern pruning, are then given to reduce the number of search positions with negligible coding loss. We also propose an FPI algorithm that avoids redundant interpolation and duplicate calculation. Experimental results show that our integrated algorithm significantly improves the overall speed of FPME and FPI: compared with FFPS+XFPI and CBFPS+XFPI, the proposed algorithm reduces computation time by 65% and 32%, respectively. Additionally, our FPI algorithm can be combined with any fast FPME algorithm to greatly reduce the computational time of FPI.

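For context, the half-pel part of the FPI step in H.264/AVC uses the standard 6-tap filter (1, -5, 20, 20, -5, 1)/32. The sketch below applies it to one sample row; it illustrates the interpolation cost the paper attacks, not the paper's reduction technique, and the edge-clamping policy is a simplification.

```python
def halfpel_interpolate(row):
    """Half-pel interpolation of a 1-D sample row using the H.264/AVC
    6-tap filter (1, -5, 20, 20, -5, 1)/32, with out-of-range sample
    positions clamped to the row edges."""
    taps = (1, -5, 20, 20, -5, 1)
    n = len(row)
    half = []
    for x in range(n - 1):
        # the taps span integer positions x-2 .. x+3 around the
        # half-pel site at x + 0.5
        acc = sum(t * row[min(n - 1, max(0, x - 2 + i))]
                  for i, t in enumerate(taps))
        half.append((acc + 16) >> 5)  # round and divide by 32
    return half
```

On a constant row the filter must reproduce the constant, since its taps sum to 32.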
  • Rate Distortion Data Hiding of Motion Vector Competition Information in Chroma and Luma Samples for Video Compression

    Page(s): 729 - 741

    New standardization activities have been recently launched by the JCT-VC experts group in order to challenge the current video compression standard H.264/AVC. Several improvements of this standard, previously integrated in the JM key technical area software, are already known and gathered in the high efficiency video coding test model. In particular, competition-based motion vector prediction has proved its efficiency. However, the targeted 50% bitrate saving for equivalent quality is not yet achieved. In this context, this paper proposes to reduce the signaling information resulting from this motion vector competition, by using data hiding techniques. As data hiding and video compression traditionally have contradictory goals, an advanced study of data hiding schemes is first performed. Then, an original way of using data hiding for video compression is proposed. The main idea of this paper is to hide the competition index into appropriately selected chroma and luma transform coefficients. To minimize the prediction errors, the transform coefficients modification is performed via a rate-distortion optimization. The proposed scheme is evaluated on several low and high resolution sequences. Objective improvements (up to 2.40% bitrate saving) and subjective assessment of the chroma loss are reported.

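The paper selects coefficients via rate-distortion optimization; a far simpler classic way to hide one bit of an index in a quantized coefficient is parity forcing, sketched below. This is a generic data-hiding illustration, not the paper's scheme, and the function names are hypothetical.

```python
def embed_bit(coeff, bit):
    """Embed one index bit in a quantized transform coefficient by
    forcing the coefficient's parity, moving it by at most 1.
    A real encoder would pick the +1/-1 adjustment (and the coefficient
    itself) by rate-distortion cost rather than this fixed rule."""
    if coeff % 2 == bit % 2:
        return coeff
    return coeff + 1 if coeff >= 0 else coeff - 1

def extract_bit(coeff):
    """Recover the hidden bit from the coefficient's parity."""
    return coeff % 2
```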
  • Layer Assignment Based on Depth Data Distribution for Multiview-Plus-Depth Scalable Video Coding

    Page(s): 742 - 754

    Three-dimensional (3-D) video is experiencing rapid growth in a number of areas, including 3-D cinema, 3-D TV, and mobile phones. Several problems must be addressed to display captured 3-D video at another location. One problem is how to represent the data. The multiview-plus-depth representation of a scene requires a lower bit rate than transmitting all views required by an application, and provides more information than a 2-D-plus-depth sequence. Another problem is how to handle transmission in a heterogeneous network. Scalable video coding enables adaptation of a 3-D video sequence to the conditions at the receiver. In this paper, we present a scheme that combines scalability based on the position in depth of the data and the distance to the center view. The general scheme preserves the center view data, whereas the data of the remaining views are extracted in enhancement layers depending on the distance to the viewer and to the center camera. The data are assigned to enhancement layers within a view based on the depth data distribution. Strategies for layer assignment between adjacent views are proposed. In general, each extracted enhancement layer increases the visual quality and peak signal-to-noise ratio compared to using center view data only. The bit rate per layer can be further decreased if depth data are distributed over the enhancement layers. The choice of strategy for assigning layers between adjacent views depends on whether the quality of the foremost objects in the scene or the quality of the views close to the center is more important.

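Depth-based layer assignment can be sketched as bucketing each pixel by its depth value against a list of thresholds. The paper derives the partition from the depth data distribution; the fixed-threshold version below, with its hypothetical function name and nearest-first convention, is only a minimal illustration.

```python
def assign_layers(depth_map, thresholds):
    """Assign each depth sample to an enhancement layer.

    Layer 0 holds the nearest data (smallest depth values); each
    ascending threshold closes one layer, and samples beyond the last
    threshold fall into the final layer.
    """
    layers = []
    for d in depth_map:
        layer = len(thresholds)          # default: farthest layer
        for i, t in enumerate(thresholds):
            if d <= t:
                layer = i
                break
        layers.append(layer)
    return layers
```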
  • Down-Sampling Based Video Coding Using Super-Resolution Technique

    Page(s): 755 - 765

    It has been reported that oversampling a still image before compression does not guarantee good image quality. Similarly, down-sampling before compression in low bit rate video coding may alleviate the blocking effect and improve the peak signal-to-noise ratio of the decoded frames. When the number of discrete cosine transform coefficients is reduced in such down-sampling based coding (DBC), the bit budget of each coefficient increases, thus reducing the quantization error. A DBC video coding scheme is proposed in this paper, in which a super-resolution technique is employed to restore the down-sampled frames to their original resolution. The performance improvement of the proposed DBC scheme is analyzed at low bit rates and verified by experiments.

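The DBC pipeline (down-sample, encode/decode, restore) can be sketched in 1-D. Plain linear interpolation stands in for the paper's super-resolution restoration step here; the function names and factor-of-2 sampling are assumptions for illustration.

```python
def downsample2(row):
    """Keep every second sample (stand-in for pre-encoding down-sampling)."""
    return row[::2]

def upsample2(row):
    """Restore full resolution by linear interpolation between samples.
    A real DBC decoder would use a super-resolution technique instead,
    which is precisely where the paper's gains come from."""
    out = []
    for i, v in enumerate(row):
        out.append(v)
        nxt = row[i + 1] if i + 1 < len(row) else v
        out.append((v + nxt) / 2)
    return out
```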
  • Advanced H.264/AVC-Based Perceptual Video Coding: Architecture, Tools, and Assessment

    Page(s): 766 - 782

    The characteristics of the human visual system may be further exploited in lossy video coding to improve the video compression efficiency beyond the state-of-the-art H.264/AVC standard. Although the literature is rich in solutions to model the human visual system characteristics, the performance and real benefits brought by these models have not been fully integrated and assessed yet. Moreover, the rate-distortion (RD) performance is usually measured by means of methodologies that do not account for the implicit variability of the observers when rating the video quality. In this context, the novelty brought by this paper is threefold. First, it proposes novel perceptual video coding tools, notably decoder-side just noticeable distortion (JND) model estimation to perceptually allocate the available rate with the finest level of granularity while avoiding the extra rate associated with coding the varying quantization steps. Second, it proposes an integrated, powerful H.264/AVC-based perceptual video coding architecture embedding a state-of-the-art JND model based on spatio-temporal human visual system masking mechanisms; this model is exploited both for the aforementioned rate allocation and to perceptually weight the distortion used in the motion estimation and RD optimization. Finally, it proposes a relative assessment methodology to measure the RD performance of a perceptual video codec (PVC) with respect to another codec taken as reference. The methodology considers the implicit observer variability when rating video quality, which leads to a nonlinear sensitivity of the objective metrics used for quality assessment. The RD performance, measured according to this methodology, shows an average bitrate reduction of up to 30% when the proposed PVC is compared with the H.264/AVC High profile at the same objective quality level. Moreover, the proposed perceptual codec outperforms an alternative perceptual codec recently published in the literature.

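As background on what a JND model computes, the classic luminance-adaptation component in the spirit of Chou and Li maps background luminance to a visibility threshold. This is one well-known component of such models, not necessarily the one used in this paper.

```python
def luminance_jnd(bg):
    """Just-noticeable-distortion threshold as a function of background
    luminance (0..255), after the classic luminance-adaptation model:
    high thresholds in dark regions, a minimum near mid-gray, and a
    slow linear rise toward bright regions."""
    if bg <= 127:
        return 17.0 * (1.0 - (bg / 127.0) ** 0.5) + 3.0
    return 3.0 / 128.0 * (bg - 127.0) + 3.0
```

A perceptual rate allocator can spend fewer bits wherever the local JND threshold is high, since distortion below the threshold is invisible.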
  • Adaptive Lagrange Multiplier Selection Using Classification-Maximization and Its Application to Chroma QP Offset Decision

    Page(s): 783 - 791

    In this paper, we propose bit allocation between luma samples and chroma samples using chroma quantization parameter (QP) offsets for Cb and Cr. To this end, we propose an efficient adaptive Lagrange multiplier selection method using classification-maximization, and then apply it to decide the chroma QP offsets for Cb and Cr. To our knowledge, this is the first proposal to adaptively decide chroma QP offsets. Because the default mapping function between a chroma QP and a luma QP in H.264 is unbalanced, especially at low QPs, the proposed chroma QP offset decision achieves an improvement of up to 0.8 dB in the experimental results.

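For reference, the fixed (non-adaptive) Lagrange multiplier that the H.264/AVC reference software derives from the quantization parameter is lambda = 0.85 * 2^((QP - 12) / 3). The paper's contribution is selecting the multiplier adaptively instead of via this fixed mapping.

```python
def h264_lambda(qp):
    """Mode-decision Lagrange multiplier of the H.264/AVC reference
    software: lambda = 0.85 * 2^((QP - 12) / 3). Each +3 in QP
    doubles the multiplier, mirroring the doubling of the
    quantization step size."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)
```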
  • Algorithm and Architecture Design of Image Inpainting Engine for Video Error Concealment Applications

    Page(s): 792 - 803

    Error concealment techniques can improve subjective video quality in a decoder when the video bitstream is corrupted during transmission. In this paper, to achieve perceptually pleasant results, an image inpainting technique, in which structure information is generated from edge information, is adopted as the spatial error concealment method. In addition, a modified boundary matching algorithm is proposed for temporal error concealment. To keep the hardware cost of the error concealment engine low, the processing iteration number of each macroblock is limited to four based on the proposed inpainting algorithm. Block-based pipeline scheduling is also proposed to reduce the number of processing cycles and the on-chip memory size. Moreover, a cache-based data reuse scheme is developed to reduce the processing cycles and external bandwidth, and the two concealment modes share the same computational core to reduce hardware costs. A prototype chip is implemented using the UMC 90 nm process. The total gate count is approximately 121 k at 200 MHz. The maximum processing capability supports 244.8 k macroblocks per second, or 1920 × 1080 4:2:0 30 Hz video. The core size is 1.30 × 1.30 mm2, and the average power dissipation is 131.4 mW at 200 MHz. Compared to other error concealment methods, the proposed design achieves better perceptual quality at an acceptable additional hardware cost.

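The boundary matching idea behind the temporal concealment mode can be sketched as follows: each candidate motion vector yields a recovered block, and the one whose outer edge best matches the correctly received neighboring pixels wins. This is generic boundary matching with SAD as the mismatch measure, not the paper's modified algorithm.

```python
def boundary_match(neighbor_edge, candidates):
    """Boundary matching for temporal concealment: pick the candidate
    recovery whose outer edge pixels best match the adjacent,
    correctly received block, using the sum of absolute differences
    (SAD) as the mismatch measure.

    `candidates` maps a candidate name (e.g. a motion vector label)
    to the edge pixels that recovery would produce.
    """
    def sad(edge):
        return sum(abs(a - b) for a, b in zip(neighbor_edge, edge))
    return min(candidates, key=lambda name: sad(candidates[name]))
```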
  • Hierarchical Method for Foreground Detection Using Codebook Model

    Page(s): 804 - 815

    This paper presents a hierarchical scheme with block-based and pixel-based codebooks for foreground detection. The codebook is mainly used to compress information to achieve highly efficient processing. In the block-based stage, 12 intensity values are employed to represent a block. The algorithm extends the concept of block truncation coding, and thus further improves processing efficiency thanks to its low complexity. Specifically, the block-based stage can remove most of the background without reducing the true positive rate, but it has low precision. To overcome this problem, the pixel-based stage is adopted to enhance the precision, which also reduces the false positive rate. Moreover, short-term information is employed to improve background updating in changing environments. As documented in the experimental results, the proposed algorithm provides superior performance compared to previous related approaches.

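The block truncation coding concept that the block-based stage extends can be shown in its basic one-bit form: threshold each pixel against the block mean and transmit a bitmap plus two reconstruction levels. The paper uses a 12-value block representation; the two-level version below is only the textbook baseline.

```python
def btc_encode(block):
    """One-bit block truncation coding: threshold at the block mean and
    keep two reconstruction levels (the means of the low and high groups)."""
    mean = sum(block) / len(block)
    bitmap = [1 if v > mean else 0 for v in block]
    lo = [v for v, bit in zip(block, bitmap) if bit == 0]
    hi = [v for v, bit in zip(block, bitmap) if bit == 1]
    a = sum(lo) / len(lo) if lo else mean
    b = sum(hi) / len(hi) if hi else mean
    return bitmap, a, b

def btc_decode(bitmap, a, b):
    """Rebuild the block from the bitmap and the two levels."""
    return [b if bit else a for bit in bitmap]
```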
  • Cooperative Wireless Broadcast for Scalable Video Coding

    Page(s): 816 - 824

    A special type of cooperative wireless transmission scheme is proposed for broadcasting scalable video sources. In the proposed system, a transmitter with multiple antennas sends out signals encoded with layered space-time codes. A receiver without cooperation can decode the base layer on its own to obtain basic video quality. For cooperating terminals, the enhancement layers can be retrieved with the help of relayed information, and the visual quality can then be refined. Simulation results with video bitstreams encoded by H.264/scalable video coding show that the proposed system can enhance the scalable functionality of the video.

  • An Efficient Pass-Parallel Architecture for Embedded Block Coder in JPEG 2000

    Page(s): 825 - 836

    Embedded block coding with optimized truncation (EBCOT) is a key algorithm in the JPEG 2000 image compression system. Various applications, such as medical imaging, satellite imagery, and digital cinema, require high-speed, high-performance EBCOT architectures. Although efficient EBCOT architectures have been proposed, the hardware requirements of these existing architectures are very high and their throughput is low. To address this problem, we investigated the rate of concurrent context generation and found that the rate of generating four or more context pairs in an image is about 68.9%. Therefore, a new technique, named compact context coding, is devised to encode all samples in a stripe-column concurrently. As a consequence, high throughput is attained and the hardware requirement is also cut down. The performance of the MQ coder is improved by operating the renormalization and byte-out stages concurrently. The entire EBCOT encoder design is tested on a field-programmable gate array platform. The implementation results show that the throughput of the proposed architecture is 163.59 MSamples/s, which is equivalent to encoding a 1920 × 1080 4:2:2 high-definition TV picture sequence at 39 f/s. Moreover, the bit plane coder (BPC) architecture alone operates at 315.06 MHz, making it 2.86 times faster than the fastest BPC design available so far. It is also capable of encoding digital cinema frames (2048 × 1080) at 42 f/s. Thus, it satisfies the requirements of applications such as cartography, medical imaging, and satellite imagery, which demand high-speed real-time image compression.

  • Low-Complexity Mode Decision for MVC

    Page(s): 837 - 843

    The finalized international standard for multiview video coding (MVC) is an extension of H.264. In the joint model of MVC, variable-size motion estimation (ME) and disparity estimation (DE) are introduced to achieve the highest coding efficiency, at the cost of very high computational complexity. A low-complexity mode decision algorithm is proposed to reduce the complexity of ME and DE. An experimental analysis is performed to study the inter-view correlation of coding information such as the prediction mode and rate-distortion (RD) cost. Based on this correlation, we propose four efficient mode decision techniques: early SKIP mode decision, adaptive early termination, fast mode size decision, and selective intra coding in inter frames. Experimental results show that the proposed algorithm can significantly reduce the computational complexity of MVC while maintaining almost the same RD performance.

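An early SKIP decision of the general kind listed above can be sketched as a threshold test: if SKIP's RD cost is already below a bound derived from neighboring blocks, the remaining ME/DE modes are never evaluated. The threshold rule and `factor` value are hypothetical placeholders, not the paper's criterion.

```python
def early_skip_decision(skip_cost, neighbor_costs, factor=1.1):
    """Early SKIP test: accept SKIP without evaluating the remaining
    ME/DE modes when its RD cost falls below a threshold derived from
    the RD costs of already-coded neighboring blocks. Falls back to
    the full mode search (returns False) when no neighbors exist."""
    if not neighbor_costs:
        return False
    threshold = factor * min(neighbor_costs)
    return skip_cost <= threshold
```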
  • Motion Adaptive Deinterlacing With Modular Neural Networks

    Page(s): 844 - 849

    In this letter, a motion adaptive deinterlacing algorithm based on modular neural networks is proposed. The proposed method uses different neural networks depending on the amount of motion: the networks are selected based on the differences between adjacent fields. Motion vectors are also used to select optimal input pixels from the adjacent fields, and motion estimation is used to find input blocks for the neural networks with minimum errors. Intra/inter-mode switching is employed to address inaccurate motion estimation.

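The motion-adaptive principle can be sketched for a single missing pixel: copy from the opposite-parity field when the area is static, and interpolate spatially when it moves. The letter selects among neural networks instead of these two simple interpolators; the threshold and names below are illustrative assumptions.

```python
def deinterlace_pixel(above, below, prev_field, motion, threshold=10):
    """Motion-adaptive deinterlacing of one missing pixel: temporal
    copy from the previous opposite-parity field in static areas
    (exact for still content), spatial line averaging in moving areas
    (avoids ghosting at the cost of vertical detail)."""
    if motion < threshold:
        return prev_field           # temporal interpolation
    return (above + below) / 2      # spatial interpolation
```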
  • Why we joined ... [advertisement]

    Page(s): 850
  • 2011 IEEE membership form

    Page(s): 851 - 852
  • IEEE Circuits and Systems Society Information

    Page(s): C3
  • IEEE Transactions on Circuits and Systems for Video Technology Information for authors

    Page(s): C4

Aims & Scope

The emphasis is on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High-Speed Real-Time Circuits
5. Multi-Processor Systems: Hardware and Software
6. VLSI Architecture and Implementation for Video Technology

 

Full Aims & Scope

Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it