
IEEE Transactions on Circuits and Systems for Video Technology

Issue 1 • Jan. 2008

  • Table of contents

    Page(s): C1
  • IEEE Transactions on Circuits and Systems for Video Technology publication information

    Page(s): C2
  • Message From the Editor-in-Chief

    Page(s): 1 - 2
  • Which Components are Important for Interactive Image Searching?

    Page(s): 3 - 11

    With many potential industrial applications, content-based image retrieval (CBIR) has recently gained more attention for image management and web searching. As an important tool to capture users' preferences and thus to improve the performance of CBIR systems, a variety of relevance feedback (RF) schemes have been developed in recent years. One key issue in RF is: which features (or feature dimensions) can benefit this human-computer iteration procedure? In this paper, we make theoretical and practical comparisons between principal and complement components of image features in CBIR RF. Most of the previous RF approaches treat the positive and negative feedbacks equivalently, although this assumption is not appropriate since the two groups of training feedbacks have very different properties. That is, all positive feedbacks share a homogeneous concept while negative feedbacks do not. We explore solutions to this important problem by proposing an orthogonal complement component analysis. Experimental results are reported on a real-world image collection to demonstrate that the proposed complement components method consistently outperforms the conventional principal components method in both linear and kernel spaces when users want to retrieve images with a homogeneous concept.
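
    As an illustration of the contrast the abstract draws, the following sketch (not the authors' exact algorithm; names and parameters are assumptions) computes a principal subspace from positive feedbacks via PCA and projects a candidate image onto both that subspace and its orthogonal complement.

        # Minimal sketch: principal vs. orthogonal complement components of
        # positive-feedback features. All names and parameters are illustrative.
        import numpy as np

        def principal_and_complement(pos_feedback, k):
            """pos_feedback: (n_samples, n_dims) features of positive feedbacks."""
            mean = pos_feedback.mean(axis=0)
            U, s, Vt = np.linalg.svd(pos_feedback - mean, full_matrices=False)
            P = Vt[:k].T                                  # top-k principal directions (d, k)

            def project_principal(y):                     # coordinates in the principal subspace
                return (y - mean) @ P

            def project_complement(y):                    # residual in the orthogonal complement
                yc = y - mean
                return yc - (yc @ P) @ P.T

            return project_principal, project_complement

    Ranking database images by their distance to the query in the complement space, rather than the principal space, is the kind of comparison the paper evaluates.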

  • Robust Dominant Motion Estimation Using MPEG Information in Sport Sequences

    Page(s): 12 - 22

    In this paper, we introduce a new method to estimate a parametric description of the dominant motion existing in a video sequence, a key task needed to face more complex video analysis problems. In order to do so, we use motion data provided by the MPEG streams. We propose a method based on imaginary straight line tracking to retrieve the projective transformations that describe the dominant motion of a sequence by estimating 2-D homographies. Our method takes advantage not only of the MPEG motion data, but also of its structure. In order to overcome the noise introduced by the MPEG compression scheme, we also employ several robust estimators to attain reliability. We demonstrate its performance by displaying several image mosaics derived from the motion in real time.
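
    The sketch below illustrates one way (not the authors' imaginary-line-tracking method) to fit a dominant-motion homography robustly to the block motion vectors already present in an MPEG stream; OpenCV's RANSAC homography estimator stands in for the paper's robust estimators.

        # Illustrative sketch: robust dominant-motion (homography) fit from MPEG
        # block motion vectors. Inputs and thresholds are assumptions.
        import numpy as np
        import cv2

        def dominant_homography(block_centers, motion_vectors, reproj_thresh=3.0):
            """block_centers, motion_vectors: (n, 2) arrays parsed from the stream."""
            src = np.asarray(block_centers, dtype=np.float32)
            dst = src + np.asarray(motion_vectors, dtype=np.float32)  # displaced positions
            # RANSAC discards blocks whose motion departs from the dominant (camera)
            # motion, e.g. moving players in a sport sequence.
            H, mask = cv2.findHomography(src, dst, cv2.RANSAC, reproj_thresh)
            return H, mask.ravel().astype(bool)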

  • BaTex3: Bit Allocation for Progressive Transmission of Textured 3-D Models

    Page(s): 23 - 35

    The efficient progressive transmission of a textured 3-D model, which has a texture image mapped on a geometric mesh, requires careful organization of the bit stream to quickly present high-resolution visualization on the user's screen. Bit rates for the mesh and the texture need to be properly balanced so that the decoded model has the best visual fidelity during transmission. This problem is addressed in this paper, and a bit allocation framework is proposed. In particular, the relative importance of the mesh geometry and the texture image on visual fidelity is estimated using a fast quality measure (FQM). Optimal bit distribution between the mesh and the texture is then computed under the criterion of maximizing the quality measured by FQM. Both the quality measure and the bit allocation algorithm are designed with low computational complexity. Empirical studies show that the bit allocation framework not only maximizes the received quality of the textured 3-D model but also does not require sending additional information to indicate bit boundaries between the mesh and the texture in the multiplexed bit stream, which makes the streaming application more robust to bit errors that may occur randomly during transmission.
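
    A minimal sketch of the allocation idea follows; the fast quality measure itself is not reproduced here, so mesh_quality and tex_quality are assumed stand-ins for FQM evaluated at a given bit budget.

        # Sketch: split a total budget between mesh and texture so a combined
        # quality estimate is maximized. The step size and the additive combination
        # are illustrative assumptions, not the paper's model.
        def allocate_bits(total_bits, mesh_quality, tex_quality, step=1024):
            best = None
            for mesh_bits in range(0, total_bits + 1, step):
                tex_bits = total_bits - mesh_bits
                q = mesh_quality(mesh_bits) + tex_quality(tex_bits)
                if best is None or q > best[0]:
                    best = (q, mesh_bits, tex_bits)
            return best  # (quality, mesh_bits, texture_bits)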

  • Reconstruction and Recognition of Tensor-Based Objects With Concurrent Subspaces Analysis

    Page(s): 36 - 47

    Principal components analysis (PCA) has traditionally been utilized with data expressed in the form of 1-D vectors, but there exists much data such as gray-level images, video sequences, Gabor-filtered images and so on, that are intrinsically in the form of second or higher order tensors. For representations of image objects in their intrinsic form and order rather than concatenating all the object data into a single vector, we propose in this paper a new optimal object reconstruction criterion with which the information of a high-dimensional tensor is represented as a much lower dimensional tensor computed from projections to multiple concurrent subspaces. In each of these subspaces, correlations with respect to one of the tensor dimensions are reduced, enabling better object reconstruction performance. Concurrent subspaces analysis (CSA) is presented to efficiently learn these subspaces in an iterative manner. In contrast to techniques such as PCA which vectorize tensor data, CSA's direct use of data in tensor form brings an enhanced ability to learn a representative subspace and an increased number of available projection directions. These properties enable CSA to outperform traditional algorithms in the common case of small sample sizes, where CSA can be effective even with only a single sample per class. Extensive experiments on images of faces and digital numbers encoded as second or third order tensors demonstrate that the proposed CSA outperforms PCA-based algorithms in object reconstruction and object recognition.
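
    The following sketch shows alternating per-mode subspace learning for a set of second-order tensors (matrices), in the spirit of CSA but not the authors' exact update rules; the dimensions k1, k2 and the iteration count are assumptions.

        # Sketch: learn row- and column-mode projection bases for a set of
        # (m, n) image matrices without vectorizing them.
        import numpy as np

        def learn_subspaces(samples, k1, k2, iters=10):
            m, n = samples[0].shape
            U = np.eye(m)[:, :k1]                     # initial row-mode basis
            V = np.eye(n)[:, :k2]                     # initial column-mode basis
            for _ in range(iters):
                # Fix V, update U from the column-projected scatter matrix.
                S = sum(X @ V @ V.T @ X.T for X in samples)
                U = np.linalg.eigh(S)[1][:, -k1:]
                # Fix U, update V symmetrically.
                S = sum(X.T @ U @ U.T @ X for X in samples)
                V = np.linalg.eigh(S)[1][:, -k2:]
            cores = [U.T @ X @ V for X in samples]    # (k1, k2) representations
            return U, V, cores                        # reconstruct as U @ core @ V.T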

  • Aliasing Reduction via Frequency Roll-Off for Scalable Image/Video Coding

    Page(s): 48 - 58

    The extracted low-resolution video from a motion-compensated 3-D subband/wavelet scalable video coder is unnecessarily sharp and sometimes contains significant aliasing, compared to that obtained by the MPEG-4 lowpass filter. In this paper, we propose a content-adaptive method for aliasing reduction in subband/wavelet scalable image and video coding. We try to make the low-resolution frame (LL subband) visually and energy-wise similar to that of the MPEG-4 decimation filter through frequency roll-off. Scaling of the subbands is introduced to make the variances of the subbands comparable in these two cases. Thanks to the embedded properties of the EZBC coder, we can achieve the needed scaling of energies in each subband by subbitplane shift in the extractor and value (coefficient) scaling in the decoder. Two methods are presented which offer substantial peak signal-to-noise ratio (PSNR) gain for lower spatial resolution, as well as substantial reduction in visible aliasing, with little or no reduction in full-resolution PSNR.
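
    A much simplified sketch of the roll-off idea is given below: the high-frequency subbands of the extracted low-resolution frame are attenuated so their energy matches that of a conventionally lowpass-decimated reference. The choice of wavelet, the variance-matching rule, and the attenuate-only clamp are assumptions for illustration, not the EZBC subbitplane-shift mechanism described in the paper.

        # Sketch: match subband variances of the extracted LL frame to a
        # lowpass-decimated reference by scaling its high-frequency subbands.
        import numpy as np
        import pywt

        def roll_off(ll_frame, reference, wavelet="db2"):
            cA, (cH, cV, cD) = pywt.dwt2(ll_frame, wavelet)
            _, (rH, rV, rD) = pywt.dwt2(reference, wavelet)
            scaled = []
            for sub, ref in ((cH, rH), (cV, rV), (cD, rD)):
                gain = np.sqrt((ref.var() + 1e-12) / (sub.var() + 1e-12))
                scaled.append(sub * min(gain, 1.0))   # only attenuate, never amplify
            return pywt.idwt2((cA, tuple(scaled)), wavelet)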

  • Redundant Slice Optimal Allocation for H.264 Multiple Description Coding

    Page(s): 59 - 70

    In this paper, a novel H.264 multiple description technique is proposed. The coding approach is based on the redundant slice representation option defined in the H.264 standard. In the presence of losses, the redundant representation can be used to replace missing portions of the compressed bit stream, thus yielding a certain degree of error resilience. This paper addresses the creation of two balanced descriptions based on the concept of redundant slices, while keeping full compatibility with the H.264 standard syntax and decoding behavior in the case of single description reception. When both descriptions are received, a standard H.264 decoder can still be used after simple preprocessing of the received compressed bit streams. An analytical setup is employed in order to optimally select the amount of redundancy to be inserted in each frame, taking into account both the transmission conditions and the error propagation at the video decoder. Experimental results demonstrate that the proposed technique compares favorably with other H.264 multiple description approaches.

  • Reduced-Reference Video Quality Assessment Using Discriminative Local Harmonic Strength With Motion Consideration

    Page(s): 71 - 83

    This paper presents a reduced-reference objective picture quality measurement tool for compressed video. We have used a discriminative analysis of harmonic strength computed from edge-detected pictures to create harmonic gain and loss information that can be associated with a picture. The harmonic gain/loss information is derived through harmonic analysis of the compressed and source pictures and incorporated in the reduced-reference video quality meter. This information corresponds to the two most prominent compression distortions, namely blockiness and blurriness. We have also studied the impact of motion in a video sequence on these compression distortions and the way they should be weighted and combined to give the best objective quality metric model. The model has been calibrated using several video sequences with dominant blockiness and blurriness. Validation of the model is performed by applying it to the 50 Hz video sequences of VQEG Test Phase-I. Our results show that the proposed model achieves good correlation with the subjective evaluations of the VQEG datasets, and its performance is comparable to that of the full-reference models in the literature.
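
    For illustration, the sketch below computes a simple harmonic-strength feature at multiples of the 8-pixel block frequency of an edge map; the paper's discriminative analysis, gain/loss separation, and motion weighting are not reproduced, and the sampling rule is an assumption.

        # Sketch: harmonic strength of an edge map at the block frequency.
        import numpy as np

        def harmonic_strength(edge_map, block_size=8):
            """edge_map: 2-D array of edge magnitudes (e.g. from a Sobel filter)."""
            spectrum = np.abs(np.fft.fft2(edge_map))
            h, w = edge_map.shape
            rows = [spectrum[k * h // block_size, 0] for k in range(1, block_size // 2)]
            cols = [spectrum[0, k * w // block_size] for k in range(1, block_size // 2)]
            return float(np.mean(rows + cols))

    Comparing this feature between compressed and source pictures yields the harmonic gain (associated with blockiness) and loss (associated with blurriness) that the abstract refers to.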

  • The Technique of Prescaled Integer Transform: Concept, Design and Applications

    Page(s): 84 - 97

    Integer cosine transform (ICT) is adopted by H.264/AVC for its bit-exact implementation and significant complexity reduction compared to the discrete cosine transform (DCT), with an impact on peak signal-to-noise ratio (PSNR) of less than 0.02 dB. In this paper, a new technique, named prescaled integer transform (PIT), is proposed. With PIT, all the merits of ICT are kept while the implementation complexity of the decoder is further reduced compared to the corresponding conventional ICT, which is especially important and beneficial for implementation on low-end processors. Since not all PIT kernels are good with respect to coding efficiency, design rules that lead to good PIT kernels are considered in this paper. Different types of PIT and their target applications are examined. Both fixed block-size transform and adaptive block-size transform (ABT) schemes of PIT are also studied. Experimental results show that no penalty in performance is observed with PIT when the kernels employed are derived from the design rules. Up to 0.2 dB of improvement in PSNR for all-intra-frame coding compared to H.264/AVC can be achieved, and the subjective quality is also slightly improved when the PIT scheme is carefully designed. Using the same concept, a variation of PIT, the post-scaled integer transform, can potentially be designed to simplify the encoder in some special applications. PIT has been adopted in the Audio Video coding Standard (AVS), the Chinese national coding standard.
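
    The sketch below conveys the prescaling concept only: in a conventional ICT pipeline the decoder applies a per-coefficient scaling matrix in addition to the integer kernel, whereas with PIT that scaling is folded into the encoder's quantization, leaving the decoder with just dequantization and the integer inverse transform. The matrices T, T_inv, and S are placeholders, not a specific standardized kernel.

        # Conceptual sketch of a prescaled integer transform path.
        import numpy as np

        def encode_pit(block, T, S, qstep):
            coeffs = T @ block @ T.T                 # integer forward transform
            return np.round(coeffs * S / qstep)      # scaling absorbed into quantization

        def decode_pit(levels, T_inv, qstep):
            coeffs = levels * qstep                  # dequantization only, no scaling matrix
            return T_inv @ coeffs @ T_inv.T          # plain integer inverse transform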

  • Efficient Architecture Design of Motion-Compensated Temporal Filtering/Motion Compensated Prediction Engine

    Page(s): 98 - 109

    Since motion-compensated temporal filtering (MCTF) has become an important temporal prediction scheme in video coding algorithms, this paper presents an efficient temporal prediction engine which is not only the first MCTF hardware design but also supports the traditional motion-compensated prediction (MCP) scheme to provide computational scalability. For the prediction stage of the MCTF and MCP schemes, a modified extended double-current-frames scheme is adopted to reduce the system memory bandwidth, and a frame-interleaved macroblock pipelining scheme is proposed to eliminate the induced data buffer overhead. In addition, the proposed update stage architecture with pipelined scheduling and motion-estimation-like motion compensation (MC) with the level C+ scheme can save about half of the external memory bandwidth and eliminate irregular memory access for MC. Moreover, 76.4% of the hardware area of the update stage is saved by reusing the hardware resources of the prediction stage. The chip can process CIF 30 fps video in real time with a search range of [-32, 32) for 5/3 MCTF with four decomposition levels, and it also supports 1/3 MCTF, hierarchical B-frames, and MCP coding schemes in JSVM and H.264/AVC. The gate count is 352 K gates with 16.8 KB of internal memory, and the maximum operating frequency is 60 MHz.

  • Simultaneous Geometric and Radiometric Adaptation to Dynamic Surfaces With a Mobile Projector-Camera System

    Page(s): 110 - 115

    Existing geometric and radiometric compensation methods for direct-projected augmented reality focus on static projection surfaces rather than dynamic surfaces (with varying geometry in time). We aim at providing an effective framework for projecting a sequence of augmented reality images onto dynamic surfaces without geometric and radiometric distortion. We present our design of a special pattern image for simultaneous geometric and radiometric compensation and evaluate two different techniques for embedding the pattern image into augmented reality images. The validity of the proposed method is examined through a variety of experiments with a mobile projector-camera system.

  • Rate Control of H.264/AVC Scalable Extension

    Page(s): 116 - 121

    This paper presents a rate control scheme for the H.264/AVC scalable extension. Based on our previous work on H.264/AVC rate control, a switched model is proposed to predict the mean absolute difference (MAD) of the residual texture from the available MAD information of the previous frame in the same layer and the same frame in its "base layer." Thus, abrupt MAD fluctuations can be predicted properly in the enhancement layer. Moreover, a bit allocation scheme is proposed for the hierarchical B-frames structure by taking into consideration the relative importance of each frame. With our algorithm, rate control for all coarse-grain-scalability, spatial, temporal, and combined enhancement layers can be realized, and the target bit rate for each layer can be achieved. Our method encodes the sequence only once, and the buffer is well controlled to prevent it from overflowing and underflowing.
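
    As a rough sketch of the switched prediction (the model form, coefficients, and switching rule are assumptions, not the paper's), the MAD of the current enhancement-layer frame can be predicted either temporally or from its base layer, whichever has recently been more reliable.

        # Sketch: switched MAD predictor for an enhancement-layer frame.
        def predict_mad(prev_mad_same_layer, base_layer_mad, history, a1=1.0, a2=0.0):
            """history: list of (temporal_error, inter_layer_error) from past frames."""
            temporal = a1 * prev_mad_same_layer + a2      # linear model, previous frame
            inter_layer = a1 * base_layer_mad + a2        # linear model, base layer
            recent = history[-3:]
            if recent and sum(t for t, _ in recent) > sum(i for _, i in recent):
                return inter_layer    # temporal predictor has been failing; switch
            return temporal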

  • Efficient Image Deblocking Based on Postfiltering in Shifted Windows

    Page(s): 122 - 126

    We propose a simple yet effective deblocking method for JPEG-compressed images through postfiltering in shifted windows (PSW) of image blocks. The MSE between the original image block and the image blocks in shifted windows is compared against a threshold to decide whether these shifted blocks are used in the smoothing procedure. Our research indicates that a strong correlation exists between the optimal mean squared error threshold and the image quality factor Q, which is selected at the encoding end and can be computed from the quantization table embedded in the JPEG file. We also use the standard deviation of each original block to adjust the threshold locally so as to avoid over-smoothing of image details. Under various image and bit-rate conditions, the processed image exhibits both clear visual improvement and significant peak signal-to-noise ratio gain at fairly low computational complexity. Extensive experiments and comparisons with other deblocking methods are conducted to justify the effectiveness of the proposed PSW method in both objective and subjective measures.
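
    A compact sketch of the PSW filtering of one block follows; the shift pattern, the Q-dependent threshold, and the activity adjustment are illustrative assumptions rather than the paper's calibrated values.

        # Sketch: average an 8x8 block with shifted blocks that are similar enough.
        import numpy as np

        def deblock_block(image, y, x, q, bs=8):
            block = image[y:y + bs, x:x + bs].astype(float)
            thresh = 200.0 * (100 - q) / 100.0            # looser threshold at low quality
            thresh /= 1.0 + block.std() / 32.0            # protect detailed blocks
            acc, count = block.copy(), 1
            for dy in (-4, 0, 4):
                for dx in (-4, 0, 4):
                    if dy == dx == 0:
                        continue
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + bs > image.shape[0] or xx + bs > image.shape[1]:
                        continue
                    shifted = image[yy:yy + bs, xx:xx + bs].astype(float)
                    if np.mean((shifted - block) ** 2) < thresh:
                        acc += shifted
                        count += 1
            return acc / count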

  • Fast Selective Intra-Mode Search Algorithm Based on Adaptive Thresholding Scheme for H.264/AVC Encoding

    Page(s): 127 - 133

    A fast selective intra-mode search algorithm based on the rate-distortion (RD) cost for an inter-frame is proposed for H.264/AVC video encoding. In addition to the inter-mode search procedure with variable block sizes, an intra-mode search causes a significant increase in the complexity and computational load for an inter-frame. To reduce the computational load of the intra-mode search in an inter-frame, the RD costs of the macroblocks (MBs) neighboring the current MB are used, and we propose an adaptive thresholding scheme for skipping the intra-mode search. Comparative analysis of experimental results against the JM reference software shows that the overall encoding time can be reduced by up to 40% for the IPPP sequence type and 42% for the IBBPBBP sequence type.
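
    The decision rule can be sketched as follows (the weighting factor is a placeholder, not the paper's adaptive threshold): skip the intra-mode search when the best inter-mode RD cost of the current MB is already well below the costs observed in neighboring MBs.

        # Sketch: adaptive-threshold skip of the intra-mode search for one MB.
        def skip_intra_search(inter_rd_cost, neighbor_rd_costs, alpha=0.9):
            if not neighbor_rd_costs:
                return False                               # no context yet; search fully
            threshold = alpha * sum(neighbor_rd_costs) / len(neighbor_rd_costs)
            return inter_rd_cost < threshold               # inter is good enough; skip intra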

  • Region-of-Interest Based Resource Allocation for Conversational Video Communication of H.264/AVC

    Page(s): 134 - 139

    Due to the complexity of H.264/AVC, it is very challenging to apply this standard to the design of a conversational video communication system. This problem is addressed in this paper by using region-of-interest (ROI) based bit allocation and computational power allocation schemes. In our system, the ROI is first detected using the direct frame difference and skin-tone information. Several coding parameters, including the quantization parameter, the candidates for mode decision, the number of reference frames, the accuracy of motion vectors, and the search range of motion estimation, are adaptively adjusted at the macroblock (MB) level according to the relative importance of each MB. The encoder can thus allocate more resources, such as bits and computational power, to the ROI, and the decoding complexity is also optimized at the encoder side by utilizing an ROI-based rate-distortion-complexity (R-D-C) cost function. As a result, the encoder is simplified and decoding-friendly, and the overall subjective visual quality can also be improved.
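
    A toy sketch of ROI-weighted parameter selection per macroblock is shown below; every number in it is a placeholder, not a value from the paper.

        # Sketch: cheaper coding parameters outside the ROI, richer ones inside.
        def mb_coding_params(roi_weight, base_qp):
            """roi_weight in [0, 1]: 1 for face/skin MBs, 0 for background."""
            return {
                "qp": base_qp - int(round(4 * roi_weight)),        # finer quantization in the ROI
                "search_range": 16 + int(round(16 * roi_weight)),  # wider motion search in the ROI
                "ref_frames": 1 + int(round(roi_weight)),          # extra reference frame in the ROI
                "subpel_me": roi_weight > 0.5,                     # sub-pel refinement only in the ROI
            }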

  • Rate-Distortion Optimization of Rate Control for H.264 With Adaptive Initial Quantization Parameter Determination

    Page(s): 140 - 144

    A rate-distortion (R-D) optimized rate control (RC) algorithm with adaptive initialization is presented for H.264. First, a linear distortion-quantization (D-Q) model is introduced, and a closed-form solution is developed to derive the optimal quantization parameter (Qp) for encoding each macroblock. We then determine the initial Qp efficiently and adaptively according to the content of the video sequence. Experimental results demonstrate that the proposed algorithm achieves better R-D performance than two other RC algorithms, including JVT-G012, the recommended RC scheme implemented in the H.264 reference software JM9.5.
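
    To make the closed-form idea concrete, the sketch below solves a Lagrangian allocation under simple stand-in models (distortion D_i = c_i * q_i and rate R_i = b_i / q_i, with q_i the quantization step); these model forms are assumptions for illustration and not necessarily the paper's.

        # Sketch: closed-form quantization steps minimizing total distortion for a
        # frame bit budget r_total under linear D-Q and inverse-proportional rate models.
        import math

        def optimal_qsteps(b, c, r_total):
            """b[i], c[i]: per-macroblock model parameters."""
            s = sum(math.sqrt(bi * ci) for bi, ci in zip(b, c))
            # q_i = sqrt(b_i / c_i) * s / r_total satisfies sum_i b_i / q_i = r_total.
            return [math.sqrt(bi / ci) * s / r_total for bi, ci in zip(b, c)]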

  • IEEE Circuits and Systems Society Information

    Page(s): C3
  • IEEE Transactions on Circuits and Systems for Video Technology Information for authors

    Page(s): C4

Aims & Scope

The emphasis is focused on, but not limited to:
1. Video A/D and D/A
2. Video Compression Techniques and Signal Processing
3. Multi-Dimensional Filters and Transforms
4. High Speed Real-Time Circuits
5. Multi-Processor Systems—Hardware and Software
6. VLSI Architecture and Implementation for Video Technology 

 


Meet Our Editors

Editor-in-Chief
Dan Schonfeld
Multimedia Communications Laboratory
ECE Dept. (M/C 154)
University of Illinois at Chicago (UIC)
Chicago, IL 60607-7053
tcsvt-eic@tcad.polito.it

Managing Editor
Jaqueline Zelkowitz
tcsvt@tcad.polito.it